27 - MongoDB#

Cursor Functions/Methods#

The result of a find() query is a cursor object. A cursor is a pointer to the result set of a query, and it is an iterable object (forward only).

Cursor functions apply functions to the result of a query (e.g. limit(), skip(), etc.).

db.collection.find().skip(10).limit(1)

There are several methods you can use on a cursor:

  • cursor.count()

db.collection.find().count()
  • cursor.pretty()

db.collection.find().pretty()
  • cursor.sort()

db.collection.find().sort({field:1})
  • cursor.toArray()

db.collection.find().toArray
  • cursor.hasNext(), cursor.next()

db.collection.find().hasNext()
db.collection.find().next()

If the cursor returned from a command such as db.collection.find() is not assigned to a variable using the var keyword, then the mongo shell automatically iterates the cursor up to 20 times. You must indicate if you want it to iterate 20 more times.

var myCursor = db.users.find({type:2})

To iterate 20 more times:

myCursor

Alternatively, you can use:

myCursor.next()

To print using mongo shell script in the command line:

  1. Set a variable equal to a cursor

var c = db.testData.find()
  1. Print with a while loop:

while (c.hasNext()) printjson(c.next());

You can use the toArray() function to iterate the cursor and return the documents in an array. This function loads all documents returned by the cusor into RAM. You can use an index with this function (e.g. array[3]).

Matching an Array Field#

When using the find() function to query an array field, the arguments listed yield the associated results:

  • {field:[val1, val2]}: the array field has exactly two vals in the order specified

  • {field:{$all:[val1, val2]}}: the array contains the two vals

  • {field:val}: the array contains the val

  • {field:{$elemMatch:{$gte:80,$lt:90}}}: the array contains at least one element satisfying the conditions

  • {field:{$size:2}}: the array has the given size

Other Functions#

Test if a document has a field:

fieldname:{$exists:true}

Test if a document has no value for a field:

fieldname:null

Test for an empty array:

fieldname:[]

Aggregation#

There are 3 ways to perform an aggregation:

  • Single purpose

  • Pipeline

  • MapReduce (deprecated in MongoDB 5.0)

Single Purpose Aggregation#

Single purpose aggregation aggregates documents into a single collection, but lacks the capability to pipeline the data. It includes the count() and distinct() operations.

db.collection.count({type:"MemberEvent"})
db.collection.distinct("type)

Pipeline Aggregation#

Pipeline aggregation is modeled after data processing pipelines. It includes basic operations, which apply filters like queries do; operations to sort documents; and grouping/aggregate operations. These operations include:

  • Stage operators: $project, $match, $limit, $group, and $sort

  • Boolean operators: $and, $or, and $not

  • Set operators: $setEquals, $setUnion, etc.

  • Comparison operators: $eq, $gt, $gte, $lt, and $lte

  • Arithmetic operators: $add, $mod, etc.

  • String operators: $concat, $substr, etc.

  • Text search: $meta

  • Date, variable, literal, and conditional operators

  • Accumulators: $sum, $max, etc.

You can use $match and $project in a similar manner to find(). The following produce the same output:

db.books.find({name:"Thomas"},{title:1, author:1})
db.books.aggregate([
    {$match:{name:"Thomas"}},
    {$project:{title:1,author:1}}
])

Similarly, the following function the same:

db.NYC.find({number_of_employees:{$lt:10}},{number_of_employees:1})
db.NYC.aggregate([
    {$match:{number_of_emplyees:{$lt:10}},{number_of_employees:1}}
])

Note that the following are different:

db.NYC.aggregate([
    {$project:{StartDate:1,_id:0}},
    {$match:{RequestID:82134}}
])
db.NYC.aggregate([
    {$match:{RequestID:82134}},
    {$project:{StartDate:1,_id:0}}
]);

If using cursor sort, you can apply sort after find():

find(...).sort({sortkey:1})

If using aggregation, you can use a collection sort:

db.collection.aggregate([{sort:{sort_key:1}}])

Array operators to use with projection:

Grouping query example: In the NYC database, list each AgencyName and the total sum of the ContractAmount for the agency:

This query has 2 stages: group and project.

db.NYC.aggregate{[
    {$group:{_id:"$AgencyName",total:{$sum:"$ContractAmount"}}}
]}

In the example above, you can see that you can rename a field in the output in the $project stage. This is done by listing newname:"$oldname". Renaming the _id field is difficult, since you must specify _id:0.

db.NYC.aggregate([
    {$group:{_id:"$AgencyName",total:{$sum:"$ContractAmount"}}},
    {$project:{Agency:"$_id",total:1,_id:0}}
]);

In order to sum the same field across all documents, you must specify _id:null:

db.NYC.aggregate([
    {$group:{_id:null,total:{$sum:"$ContractAmount"}}},
    {$project:{total:1,_id:0}}
])

Map-Reduce (Deprecated)#

We don’t need to know this since it’s deprecated.