27 - MongoDB#
Cursor Functions/Methods#
The result of a find() query is a cursor object. A cursor is a pointer to the result set of a query, and it is an iterable object (forward only).
Cursor functions apply functions to the result of a query (e.g. limit(), skip(), etc.).
db.collection.find().skip(10).limit(1)
There are several methods you can use on a cursor:
cursor.count()
db.collection.find().count()
cursor.pretty()
db.collection.find().pretty()
cursor.sort()
db.collection.find().sort({field:1})
cursor.toArray()
db.collection.find().toArray
cursor.hasNext(), cursor.next()
db.collection.find().hasNext()
db.collection.find().next()
If the cursor returned from a command such as db.collection.find()
is not assigned to a variable using the var keyword, then the mongo shell automatically iterates the cursor up to 20 times. You must indicate if you want it to iterate 20 more times.
var myCursor = db.users.find({type:2})
To iterate 20 more times:
myCursor
Alternatively, you can use:
myCursor.next()
To print using mongo shell script in the command line:
Set a variable equal to a cursor
var c = db.testData.find()
Print with a while loop:
while (c.hasNext()) printjson(c.next());
You can use the toArray()
function to iterate the cursor and return the documents in an array. This function loads all documents returned by the cusor into RAM. You can use an index with this function (e.g. array[3]).
Matching an Array Field#
When using the find() function to query an array field, the arguments listed yield the associated results:
{field:[val1, val2]}
: the array field has exactly two vals in the order specified{field:{$all:[val1, val2]}}
: the array contains the two vals{field:val}
: the array contains the val{field:{$elemMatch:{$gte:80,$lt:90}}}
: the array contains at least one element satisfying the conditions{field:{$size:2}}
: the array has the given size
Other Functions#
Test if a document has a field:
fieldname:{$exists:true}
Test if a document has no value for a field:
fieldname:null
Test for an empty array:
fieldname:[]
Aggregation#
There are 3 ways to perform an aggregation:
Single purpose
Pipeline
MapReduce (deprecated in MongoDB 5.0)
Single Purpose Aggregation#
Single purpose aggregation aggregates documents into a single collection, but lacks the capability to pipeline the data. It includes the count()
and distinct()
operations.
db.collection.count({type:"MemberEvent"})
db.collection.distinct("type)
Pipeline Aggregation#
Pipeline aggregation is modeled after data processing pipelines. It includes basic operations, which apply filters like queries do; operations to sort documents; and grouping/aggregate operations. These operations include:
Stage operators:
$project
,$match
,$limit
,$group
, and$sort
Boolean operators:
$and
,$or
, and$not
Set operators:
$setEquals
,$setUnion
, etc.Comparison operators:
$eq
,$gt
,$gte
,$lt
, and$lte
Arithmetic operators:
$add
,$mod
, etc.String operators:
$concat
,$substr
, etc.Text search:
$meta
Date, variable, literal, and conditional operators
Accumulators:
$sum
,$max
, etc.
You can use $match
and $project
in a similar manner to find()
. The following produce the same output:
db.books.find({name:"Thomas"},{title:1, author:1})
db.books.aggregate([
{$match:{name:"Thomas"}},
{$project:{title:1,author:1}}
])
Similarly, the following function the same:
db.NYC.find({number_of_employees:{$lt:10}},{number_of_employees:1})
db.NYC.aggregate([
{$match:{number_of_emplyees:{$lt:10}},{number_of_employees:1}}
])
Note that the following are different:
db.NYC.aggregate([
{$project:{StartDate:1,_id:0}},
{$match:{RequestID:82134}}
])
db.NYC.aggregate([
{$match:{RequestID:82134}},
{$project:{StartDate:1,_id:0}}
]);
If using cursor sort, you can apply sort after find()
:
find(...).sort({sortkey:1})
If using aggregation, you can use a collection sort:
db.collection.aggregate([{sort:{sort_key:1}}])
Array operators to use with projection:
$size
$arrayElemAt
$slice
https://www.mongodb.com/docs/v3.2/reference/operator/aggregation-array/
Grouping query example: In the NYC database, list each AgencyName and the total sum of the ContractAmount for the agency:
This query has 2 stages: group and project.
db.NYC.aggregate{[
{$group:{_id:"$AgencyName",total:{$sum:"$ContractAmount"}}}
]}
In the example above, you can see that you can rename a field in the output in the $project
stage. This is done by listing newname:"$oldname"
. Renaming the _id field is difficult, since you must specify _id:0
.
db.NYC.aggregate([
{$group:{_id:"$AgencyName",total:{$sum:"$ContractAmount"}}},
{$project:{Agency:"$_id",total:1,_id:0}}
]);
In order to sum the same field across all documents, you must specify _id:null
:
db.NYC.aggregate([
{$group:{_id:null,total:{$sum:"$ContractAmount"}}},
{$project:{total:1,_id:0}}
])
Map-Reduce (Deprecated)#
We don’t need to know this since it’s deprecated.