Skip to content

summarize

rel.summarize([:a, :b, ...], x: :sum, y: :max, ...)
rel.summarize([:a, :b, ...], x: -> (t,m) { m ? (m + t[:x]) : t[:x] } )

Place tuples in groups and calculate aggregate values for each group.

Example: For every city where we have stores, how big is the total revenue per city?

The first argument specifies attributes to group by, the second argument specifies aggregation operations.

Like the group operation, summarize uses a number of grouping attributes to slice up the input relation. However, summarize does not create a nested relation. Instead, summarize extends the output relation with one or several attributes containing aggregated values computed over tuples that match the grouping tuples on all their attributes.

The aggregation is specified as a hash with attribute names as keys and aggregation operations as values.

NameSQL?Numerical?Description
avgNumerical average
collectCreates an array containing all values (including duplicates)
concatString concatenation
countNumber of values (including duplicates)
distinctLike collect, with duplicates removed
distinct_countLike count, but duplicates not counted
maxPicks out the largest value
minPicks out the smallest value
sumThe sum of all values

Consult the Overview page for the data model used in this example.

supplies.summarize([:sid], qty: :sum).to_a
=> [{:sid=>"S1", :qty=>1300},
{:sid=>"S2", :qty=>700},
{:sid=>"S3", :qty=>200},
{:sid=>"S4", :qty=>900}]
SELECT `t1`.`sid`, sum(`t1`.`qty`) AS 'qty'
FROM `supplies` AS 't1'
GROUP BY `t1`.`sid`