U-SQL 入門⑥ ~ 集約関数 ~
U-SQL にはビルトインの集約関数が提供されています。これらの集約関数は SELECT 文の中で使用できます。
集約関数の一覧
ANY_VALUE | グループの中から任意の 1 つの値を取得する |
ARRAY_AGG | グループごとに SQL.ARRAY 値を生成する |
AVG | 平均 |
COUNT | 個数のカウント |
MAP_AGG | Key, Value 値を取り SQL.MAP 値を生成する |
MAX | 最大値 |
MIN | 最小値 |
SUM | 合計 |
STDEV | 標本標準偏差 |
STDEVP | 母標準偏差 |
VAR | 標本分散 |
VARP | 母分散 |
集約関数の実行例
@rs = SELECT Region, COUNT() AS NumSessions, SUM(Duration) AS TotalDuration, AVG(Duration) AS AvgDuration, MAX(Duration) AS MaxDuration, MIN(Duration) AS MinDuration, STDEV(Duration) AS StdevDuration, STDEVP(Duration) AS StdevPDuration, VAR(Duration) AS VarianceDuration, VARP(Duration) AS VariancePDuration FROM @searchlog GROUP BY Region;
その他集約関数
ANY_VALUE
ANY_VALUE 関数はグループの中から任意の 1 つの値を取得します。
以下ドキュメントにあるサンプルクエリを実行した結果です。
https://msdn.microsoft.com/en-us/azure/data-lake-analytics/u-sql/any-value-u-sql
クエリ
@employees = SELECT * FROM ( VALUES (1, "Noah", "Engineering", 100, 10000), (2, "Sophia", "Engineering", 100, 20000), (3, "Liam", "Engineering", 100, 30000), (4, "Emma", "HR", 200, 10000), (5, "Jacob", "HR", 200, 10000), (6, "Olivia", "HR", 200, 10000), (7, "Mason", "Executive", 300, 50000), (8, "Ava", "Marketing", 400, 15000), (9, "Ethan", "Marketing", 400, 10000) ) AS T(EmpID, EmpName, DeptName, DeptID, Salary); @result = SELECT DeptName, ANY_VALUE(EmpName) AS ArbitraryEmployee FROM @employees GROUP BY DeptName; OUTPUT @result TO "/Output/ReferenceGuide/any_value/exampleB.csv" USING Outputters.Csv(outputHeader: true);
実行結果
DeptName | ArbitraryEmployee |
---|---|
Engineering | Noah |
Executive | Mason |
HR | Emma |
Marketing | Ava |
ARRAY_AGG
ARRAY_AGG 関数はグループごとに SQL.ARRAY 値を生成します。
以下ドキュメントにあるサンプルクエリを実行した結果です。
https://msdn.microsoft.com/en-us/azure/data-lake-analytics/u-sql/array-agg-u-sql
クエリ
@employees = SELECT * FROM ( VALUES ("Noah", "cell:030-0074321"), ("Noah", "office:030-0076545"), ("Sophia", "cell:(5) 555-4729"), ("Sophia", "office:(5) 555-3745"), ("Liam", "cell:(5) 555-3932"), ("Amy", "cell:(171) 555-7788"), ("Amy", "office:(171) 555-6750"), ("Amy", "home:(425) 555-6238"), ("Justin", "cell:0921-12 34 65"), ("Justin", "office:0921-12 34 67"), ("Emma", (string)null), ("Jacob", ""), ("Olivia", "cell:88.60.15.31"), ("Olivia", "office:88.60.15.32"), ("Mason", "cell:(91) 555 22 82"), ("Mason", "office:(91) 555 91 99"), ("Mason", "home:(425) 555-2819"), ("Ava", "cell:91.24.45.40"), ("Ava", "office:91.24.45.41"), ("Ethan", "cell:(604) 555-4729"), ("Ethan", "office:(604) 555-3745"), ("David", "cell:(171) 555-1212"), ("Andrew", "cell:(1) 135-5555"), ("Andrew", "office:(1) 135-4892"), ("Jennie", "cell:(5) 555-3392"), ("Jennie", "office:(5) 555-7293") ) AS T(EmpName, PhoneNumber); @result = SELECT EmpName, string.Join(", ", ARRAY_AGG(PhoneNumber)) AS PhoneNumbers FROM @employees WHERE !string.IsNullOrEmpty(PhoneNumber) GROUP BY EmpName; OUTPUT @result TO "/Output/ReferenceGuide/Aggregate/array_agg/exampleA.csv" USING Outputters.Csv(outputHeader: true);
実行結果
EmpName | PhoneNumbers |
---|---|
Amy | cell:(171) 555-7788, office:(171) 555-6750, home:(425) 555-6238 |
Andrew | office:(1) 135-4892, cell:(1) 135-5555 |
Ava | cell:91.24.45.40, office:91.24.45.41 |
David | cell:(171) 555-1212 |
Ethan | cell:(604) 555-4729, office:(604) 555-3745 |
Jennie | office:(5) 555-7293, cell:(5) 555-3392 |
Justin | cell:0921-12 34 65, office:0921-12 34 67 |
Liam | cell:(5) 555-3932 |
Mason | cell:(91) 555 22 82, office:(91) 555 91 99, home:(425) 555-2819 |
Noah | cell:030-0074321, office:030-0076545 |
Olivia | office:88.60.15.32, cell:88.60.15.31 |
Sophia | office:(5) 555-3745, cell:(5) 555-4729 |
MAP_AGG
MAP_AGG 関数は Key, Value 値を取り SQL.MAP 値を生成します。
以下ドキュメントにあるサンプルクエリを実行した結果です。
https://msdn.microsoft.com/en-us/azure/data-lake-analytics/u-sql/map-agg-u-sql
クエリ
@employees = SELECT * FROM ( VALUES ("Noah", "cell", "030-0074321"), ("Noah", "office", "030-0076545"), ("Sophia", "cell", "(5) 555-4729"), ("Sophia", "office", "(5) 555-3745"), ("Liam", "cell", "(5) 555-3932"), ("Amy", "cell", "(171) 555-7788"), ("Amy", "office", "(171) 555-6750"), ("Amy", "home", "(425) 555-6238"), ("Justin", "cell", "0921-12 34 65"), ("Justin", "office", "0921-12 34 67"), ("Emma", (string)null, (string)null), ("Jacob", "", ""), ("Olivia", "cell", "88.60.15.31"), ("Olivia", "office", "88.60.15.32"), ("Mason", "cell", "(91) 555 22 82"), ("Mason", "office", "(91) 555 91 99"), ("Mason", "home", "(425) 555-2819"), ("Ava", "cell", "91.24.45.40"), ("Ava", "office", "91.24.45.41"), ("Ethan", "cell", "(604) 555-4729"), ("Ethan", "office", "(604) 555-3745"), ("David", "cell", "(171) 555-1212"), ("Andrew", "cell", "(1) 135-5555"), ("Andrew", "office", "(1) 135-4892"), ("Jennie", "cell", "(5) 555-3392"), ("Jennie", "office", "(5) 555-7293") ) AS T(EmpName, PhoneType, PhoneNumber); @result = SELECT EmpName, String.Join(",", MAP_AGG(PhoneType, PhoneNumber).Select(p => String.Format("{0}:{1}", p.Key, p.Value))).Trim() AS PhoneNumbers FROM @employees WHERE !string.IsNullOrWhiteSpace(PhoneType) GROUP BY EmpName; OUTPUT @result TO "/Output/ReferenceGuide/Aggregate/map_agg/exampleA.csv" USING Outputters.Csv(outputHeader: true);
実行結果
EmpName | PhoneNumbers |
---|---|
Amy | cell:(171) 555-7788,home:(425) 555-6238,office:(171) 555-6750 |
Andrew | cell:(1) 135-5555,office:(1) 135-4892 |
Ava | cell:91.24.45.40,office:91.24.45.41 |
David | cell:(171) 555-1212 |
Ethan | cell:(604) 555-4729,office:(604) 555-3745 |
Jennie | cell:(5) 555-3392,office:(5) 555-7293 |
Justin | cell:0921-12 34 65,office:0921-12 34 67 |
Liam | cell:(5) 555-3932 |
Mason | cell:(91) 555 22 82,home:(425) 555-2819,office:(91) 555 91 99 |
Noah | cell:030-0074321,office:030-0076545 |
Olivia | cell:88.60.15.31,office:88.60.15.32 |
Sophia | cell:(5) 555-4729,office:(5) 555-3745 |
この記事はこちらの公式ドキュメントを参考にしたものです。
https://msdn.microsoft.com/en-us/azure/data-lake-analytics/u-sql/aggregate-functions-u-sql