U-SQL 入門⑧ ~ 行番号を振る ~

行に番号を振るには ROW_NUMBER ウィンドウ関数を使います。下記にいくつかの例を紹介します。

1. 普通に行番号を振る場合

@employees = 
    SELECT * FROM 
        ( VALUES
        ("Noah",   "Engineering", 100, 10000),
        ("Sophia", "Engineering", 100, 15000),
        ("Liam",   "Engineering", 100, 30000),
        ("Amy",    "Engineering", 100, 35000),
        ("Justin", "Engineering", 100, 15000),
        ("Emma",   "HR",          200, 8000),
        ("Jacob",  "HR",          200, 8000),
        ("Olivia", "HR",          200, 8000),
        ("Mason",  "Executive",   300, 50000),
        ("Ava",    "Marketing",   400, 15000),
        ("Ethan",  "Marketing",   400, 9000) 
        ) AS T(EmpName, DeptName, DeptID, Salary);

@result =
    SELECT ROW_NUMBER() OVER () AS RowNumber,
    EmpName, DeptName, Salary
    FROM @employees;

OUTPUT @result
TO "/Output/ReferenceGuide/Ranking/row_number/exampleA.csv"
USING Outputters.Csv(outputHeader: true);

出力結果

RowNumber EmpName DeptName Salary
1 Noah Engineering 10000
2 Sophia Engineering 15000
3 Liam Engineering 30000
4 Amy Engineering 35000
5 Justin Engineering 15000
6 Emma HR 8000
7 Jacob HR 8000
8 Olivia HR 8000
9 Mason Executive 50000
10 Ava Marketing 15000
11 Ethan Marketing 9000

2. Salary の高い順に振る場合

@employees = 
    SELECT * FROM 
        ( VALUES
        ("Noah",   "Engineering", 100, 10000),
        ("Sophia", "Engineering", 100, 15000),
        ("Liam",   "Engineering", 100, 30000),
        ("Amy",    "Engineering", 100, 35000),
        ("Justin", "Engineering", 100, 15000),
        ("Emma",   "HR",          200, 8000),
        ("Jacob",  "HR",          200, 8000),
        ("Olivia", "HR",          200, 8000),
        ("Mason",  "Executive",   300, 50000),
        ("Ava",    "Marketing",   400, 15000),
        ("Ethan",  "Marketing",   400, 9000) 
        ) AS T(EmpName, DeptName, DeptID, Salary);

// ORDER BY を使って Salary 列の降順で並べている
@result =
    SELECT ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNumber,
    EmpName, DeptName, Salary
    FROM @employees;

OUTPUT @result
TO "/Output/ReferenceGuide/Ranking/row_number/exampleB.csv"
USING Outputters.Csv(outputHeader: true);

出力結果

RowNumber EmpName DeptName Salary
1 Mason Executive 50000
2 Amy Engineering 35000
3 Liam Engineering 30000
4 Sophia Engineering 15000
5 Justin Engineering 15000
6 Ava Marketing 15000
7 Noah Engineering 10000
8 Ethan Marketing 9000
9 Emma HR 8000
10 Jacob HR 8000
11 Olivia HR 8000

3. 部署ごとに Salary の高い順に振る場合

@employees = 
    SELECT * FROM 
        ( VALUES
        ("Noah",   "Engineering", 100, 10000),
        ("Sophia", "Engineering", 100, 15000),
        ("Liam",   "Engineering", 100, 30000),
        ("Amy",    "Engineering", 100, 35000),
        ("Justin", "Engineering", 100, 15000),
        ("Emma",   "HR",          200, 8000),
        ("Jacob",  "HR",          200, 8000),
        ("Olivia", "HR",          200, 8000),
        ("Mason",  "Executive",   300, 50000),
        ("Ava",    "Marketing",   400, 15000),
        ("Ethan",  "Marketing",   400, 9000) 
        ) AS T(EmpName, DeptName, DeptID, Salary);

// PARTITION BY を使って DeptID ごとに分けている
@result =
    SELECT ROW_NUMBER() OVER(PARTITION BY DeptID ORDER BY Salary DESC) AS RowNumberByDept,
           EmpName, DeptName, Salary
    FROM @employees;

OUTPUT @result
TO "/Output/ReferenceGuide/Ranking/row_number/exampleC.csv"
USING Outputters.Csv(outputHeader: true);
RowNumberByDept EmpName DeptName Salary
1 Amy Engineering 35000
2 Liam Engineering 30000
3 Sophia Engineering 15000
4 Justin Engineering 15000
5 Noah Engineering 10000
1 Emma HR 8000
2 Jacob HR 8000
3 Olivia HR 8000
1 Mason Executive 50000
1 Ava Marketing 15000
2 Ethan Marketing 9000

この記事はこちらの公式ドキュメントを参考にしたものです。
https://msdn.microsoft.com/en-us/azure/data-lake-analytics/u-sql/row-number-u-sql