Aggregation DSL in Mongoid

AuthorMáximo Mussini
·2 min read

Mongoid provides support for MongoDB's Aggregation Framework, but writing raw queries can be confusing and is extremely verbose:

Product.collection.aggregate [
  { '$match' => { 'country' => 'US' } },
  { '$project' => { 'categories' => 1, 'price' => 1 } },
  { '$unwind' => '$category_ids' },
  {
    '$group' => {
      '_id' => '$category_ids',
      'avg_price' => { '$avg' => '$price' }
    }
  },
  { '$sort' => { 'avg_price' => -1 } },
  { '$limit' => 30 }
]

Turns out Origin—the gem that powers Mongoid's query DSL—already provides methods for each aggregation operation. The only problem is that it doesn't provide a way to execute the aggregation, and there's no documentation about it 😬

Every aggregation method call adds an operation to an internal pipeline, which we can manually retrieve by calling the pipeline method:

products = Product.where(country: :US).
  project(categories: 1, price: 1).
  unwind('$category_ids').
  group(_id: '$category_ids', :avg_price.avg => '$price').
  desc(:avg_price).
  limit(30)

products.pipeline
# {"$match"=>{"country"=>:US}}
# {"$project"=>{"categories"=>1, "price"=>1}}
# {"$unwind"=>"$category_ids"}
# {"$group"=>{"_id"=>"$category_ids", "avg_price"=>{"$avg"=>"$price"}}}
# {"$sort"=>{"avg_price"=>-1}}
# {"$limit"=>30}

Notice how using the DSL we obtain the same aggregation pipeline that we have in the first example, except it's way more concise, and we are able to use Symbol extensions like avg, max, or add_to_set, which add a great deal of expressiveness and make our queries more concise.

Now all we have to do is pass the pipeline to the aggregate method:

Product.collection.aggregate products.pipeline

We can add a little syntax sugar using refinements and make it even more convenient:

module AggregationRefinements
  refine Mongoid::Criteria do
    def aggregate
      collection.aggregate pipeline
    end
  end
end

using AggregationRefinements

Product.project(categories: 1, price: 1).
  unwind('$category_ids').
  group(_id: '$category_ids', :avg_price.avg => '$price').
  aggregate

Or if you are using query objects, it's as simple as adding a method to the objects where you need to perform aggregations:

def aggregate
  queryable.collection.aggregate(queryable.pipeline)
end

Now we can enjoy Mongoid's fluent DSL for aggregations 😃