MongoDB change streams
Overview
A MongoDB change stream is a real-time stream of database changes that flows from a database to an application.
What is a Change Stream
A change stream is a real-time stream of database changes that flows from a database to an application. Change streams let your application react in real time to data changes in a single collection, a database, or a whole deployment. They are critical for applications that depend on notifications of changing data.
Availability
MongoDB change streams are available for replica sets and sharded clusters.
- Storage engine: The replica sets and sharded clusters must use the WiredTiger storage engine. Change streams can also be used with deployments that employ MongoDB's encryption-at-rest feature.
- Replica set protocol version: The replica sets and sharded clusters must use replica set protocol version 1 (pv1).
- Read concern "majority" enablement: In MongoDB 4.0 and earlier, change streams are available only if "majority" read concern support is enabled (the default). Starting in MongoDB 4.2, change streams are available whether "majority" read concern support is enabled (the default) or disabled.
Connect
Connections for a change stream are established either with a DNS seed list (the +srv connection option) or by listing the servers individually in the connection string. If the driver loses the connection to a change stream, or the connection goes down, it tries to re-establish the connection through another node in the cluster that has a matching read preference. If the driver cannot find a node with the correct read preference, it throws an exception. Read preference determines how read operations are routed to the members of a replica set.
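To make the two connection styles concrete, the following Python sketch only builds the connection strings. The host names, the replica set name `rs0`, and the helper names `seed_list_uri` and `server_list_uri` are hypothetical and chosen for illustration.

```python
from typing import Sequence
from urllib.parse import urlencode


def seed_list_uri(cluster_host: str, **options: str) -> str:
    """Build a DNS seed list (mongodb+srv) connection string."""
    query = f"/?{urlencode(options)}" if options else ""
    return f"mongodb+srv://{cluster_host}{query}"


def server_list_uri(hosts: Sequence[str], replica_set: str, **options: str) -> str:
    """Build a standard connection string that lists each server individually."""
    params = {"replicaSet": replica_set, **options}
    return f"mongodb://{','.join(hosts)}/?{urlencode(params)}"


# Hypothetical hosts, for illustration only.
print(seed_list_uri("cluster0.example.mongodb.net", readPreference="secondaryPreferred"))
print(server_list_uri(["db0.example.com:27017", "db1.example.com:27017"], "rs0"))
```

Either string can then be passed to a driver's client constructor. The `readPreference` option is the setting the driver uses when it looks for another node to reconnect to.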
Watch a Collection, a Database, and a Deployment
MongoDB change streams can be opened against a collection, a database, or a deployment.
- A collection: A change stream cursor can be opened for a single collection, except system collections and any collection in the admin, local, or config databases. The cursor is opened with db.collection.watch().
- A database: Starting in MongoDB 4.0, a change stream cursor can be opened for a single database (excluding the admin, local, and config databases) to watch for changes to all of its non-system collections. The cursor is opened with db.watch().
- A deployment: Starting in MongoDB 4.0, a change stream cursor can be opened for a deployment (a replica set or a sharded cluster) to watch for changes to all non-system collections across all databases except admin, local, and config. The cursor is opened with Mongo.watch().
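The three scopes map onto three `watch()` calls in most drivers. Below is a minimal sketch in Python, assuming `client` is a PyMongo `MongoClient` connected to a replica set or sharded cluster. The helper name `open_change_stream` is hypothetical.

```python
def open_change_stream(client, db_name=None, coll_name=None):
    """Open a change stream at deployment, database, or collection scope.

    `client` is assumed to be a pymongo.MongoClient; the scope is chosen
    by how many names the caller supplies.
    """
    if db_name is None:
        if coll_name is not None:
            raise ValueError("a collection name requires a database name")
        return client.watch()                  # whole deployment
    if coll_name is None:
        return client[db_name].watch()         # all non-system collections in one database
    return client[db_name][coll_name].watch()  # a single collection
```

For example, `open_change_stream(client, "school", "student")` would watch only the `student` collection of a hypothetical `school` database.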
MongoDB Change Streams Features
The following features give a better understanding of how MongoDB change streams work:
- Filterable: An application can apply filters so that it receives only the change notifications it needs.
- Resumable: Every change event carries a resume token, which lets the application resume the stream from where it left off.
- In order: Change notifications arrive in the same order in which the changes were applied to the database.
- Durable: Change streams include only majority-committed changes, so every change a listening application sees is durable in failure scenarios.
- Secure: Users can create a change stream on a collection only if they have read access to that collection.
- Easy to use: The change stream API uses the existing MongoDB drivers and query language.
Change Streams with MongoDB Atlas
If you want to experiment with MongoDB change streams but do not have a development environment for them, create a MongoDB Atlas account and select the free cluster option. After a few minutes you will have a cluster that supports change streams, and the free tier has no time limit.
How to Open a Change Stream
- For a replica set, the operation that opens the change stream can be issued from any data-bearing member.
- For a sharded cluster, the operation that opens the change stream must be issued from mongos.
Modifying Change Streams Output
Change stream output can be controlled by supplying an array of one or more pipeline stages when the change stream is configured. The following pipeline stages can appear in the array:
- $unset (available starting in MongoDB 4.2)
- $set (available starting in MongoDB 4.2)
- $redact
- $replaceWith (available starting in MongoDB 4.2)
- $replaceRoot
- $project
- $match
- $addFields
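As an illustration, the sketch below builds a two-stage pipeline with `$match` and `$project` and passes it to `watch()`. It assumes a PyMongo collection object. The field names `grade` and `name` and both helper names are hypothetical.

```python
def inserts_for_grade(grade):
    """Pipeline that keeps only insert events for one (hypothetical) grade."""
    return [
        # Keep insert events whose new document has the requested grade.
        {"$match": {"operationType": "insert", "fullDocument.grade": grade}},
        # Trim the event to a few fields; _id stays included by default.
        {"$project": {"operationType": 1,
                      "fullDocument.name": 1,
                      "fullDocument.grade": 1}},
    ]


def watch_inserts(collection, grade):
    """Open a filtered change stream on a PyMongo collection."""
    return collection.watch(pipeline=inserts_for_grade(grade))
```

The event's `_id` field holds the resume token, so a pipeline should leave `_id` untouched; an inclusion-style `$project` like the one above keeps it by default.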
Resume a Change Stream
A change stream can be resumed by passing a resume token to either resumeAfter or startAfter when the cursor is opened.
resumeAfter for Change Streams
To resume a change stream after a particular event, pass that event's resume token to resumeAfter when opening the cursor.
startAfter for Change Streams
To start a new change stream after a particular event, pass that event's resume token to startAfter when opening the cursor.
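Both options can be sketched in Python. PyMongo exposes them as the `resume_after` and `start_after` keyword arguments of `watch()`, and each change event's `_id` field is its resume token. The helper names and the `handle` and `save_token` callbacks are hypothetical. Where the token is stored (a file, another collection) is up to the application.

```python
def resume_after(collection, token):
    """Resume a stream with the first event that follows `token`."""
    return collection.watch(resume_after=token)


def start_after(collection, token):
    """Start a new stream with the first event that follows `token`."""
    return collection.watch(start_after=token)


def consume(stream, handle, save_token):
    """Process events and remember the latest resume token after each one."""
    for change in stream:
        handle(change)
        save_token(change["_id"])  # the event's _id is its resume token
```

After a restart, an application would load the last saved token and call `resume_after(collection, token)` to continue from the event that follows it.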
Access Control and Event Notifications
Access Control
- To open a change stream against a particular collection, the application must have privileges that grant the find and changeStream actions on that collection.
- To open a change stream on a single database, the application must have privileges that grant the find and changeStream actions on all non-system collections in that database.
- To open a change stream on a whole deployment, the application must have privileges that grant the find and changeStream actions on all non-system collections in all databases of the deployment.
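The three cases above differ only in the resource a privilege names. The Python sketch below builds the privilege document for each scope and wraps the `createRole` database command. The helper names and the role name are hypothetical, and `db` is assumed to be a PyMongo `Database` object.

```python
def change_stream_privilege(db_name="", coll_name=""):
    """Privilege document granting find and changeStream on one scope.

    An empty string acts as a wildcard: with both names empty, the
    privilege covers all non-system collections in all databases.
    """
    return {
        "resource": {"db": db_name, "collection": coll_name},
        "actions": ["find", "changeStream"],
    }


def create_watcher_role(db, role_name, privileges):
    """Create a role through the createRole command.

    Assumption: `db` is the admin database whenever the privileges
    reach beyond a single database.
    """
    return db.command("createRole", role_name, privileges=privileges, roles=[])


collection_scope = change_stream_privilege("school", "student")  # one collection
database_scope = change_stream_privilege("school")               # one database
deployment_scope = change_stream_privilege()                     # whole deployment
```

A role built this way can then be granted to the user that the application connects as.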
Event Notification
Change streams notify only for data changes that have persisted to a majority of the data-bearing members of the replica set. Notifications are therefore triggered only by majority-committed changes, which are durable in failure scenarios. For instance, consider a replica set of 3 members with a change stream cursor opened against the primary. If a client issues an insert operation, the change stream notifies the application only after the insert has persisted to a majority of the data-bearing members. If an operation is part of a transaction, the change event document also includes the lsid and the txnNumber.
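To show where the transaction fields appear, the sketch below summarizes a change event document in Python. The helper name `summarize_change` and the sample values are hypothetical. The field names `operationType`, `ns`, `documentKey`, `lsid`, and `txnNumber` are the ones found in change event documents.

```python
def summarize_change(change):
    """Reduce a change event document to the fields discussed above."""
    ns = change.get("ns", {})
    summary = {
        "operation": change["operationType"],
        "namespace": ".".join(p for p in (ns.get("db"), ns.get("coll")) if p),
        "key": change.get("documentKey"),
    }
    # lsid and txnNumber are present only for operations in a transaction.
    if "txnNumber" in change:
        summary["transaction"] = {
            "lsid": change.get("lsid"),
            "txnNumber": change["txnNumber"],
        }
    return summary


# Hypothetical event, shaped like an insert performed inside a transaction.
event = {
    "operationType": "insert",
    "ns": {"db": "school", "coll": "student"},
    "documentKey": {"_id": 1},
    "lsid": {"id": "session-1"},
    "txnNumber": 7,
}
print(summarize_change(event))
```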
MongoDB Change Streams Use Cases
Architectures with interdependent business systems benefit from change streams, which can inform downstream systems once data changes are durable. Change streams also save developers time when implementing notification services, collaboration features, cross-platform synchronization, Extract, Transform, and Load (ETL) pipelines, and more.
Examples
Change Stream with Node.js
In this example, a change stream is opened on a collection, and the change documents are retrieved by iterating over the cursor. It assumes that a connection to a MongoDB replica set is already established and that a database containing a student collection is accessible. Here, a stream is used to process all the change events in the student collection.
An iterator is another option for processing the change events: iterate over the change stream cursor directly.
Change Stream With Python
Opening a change stream in Python is straightforward and similar to doing so in Node.js: open a change stream on the collection and retrieve the change documents by iterating over the cursor. This example assumes that you are connected to a MongoDB replica set and can access a database containing a student collection.
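A minimal PyMongo sketch of these steps follows. The connection URI, the database name `school`, and the `max_events` parameter (added so the loop can stop) are assumptions made for illustration.

```python
def print_student_changes(collection, max_events=None):
    """Iterate the change stream cursor and print each change document."""
    seen = 0
    with collection.watch() as stream:  # the cursor is closed on exit
        for change in stream:
            print(change)
            seen += 1
            if max_events is not None and seen >= max_events:
                break
    return seen


def main():
    # Imported here so the helper above can be used without a live server.
    from pymongo import MongoClient

    # Assumed URI: a local replica set named rs0.
    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
    print_student_changes(client["school"]["student"])
```

Calling `main()` blocks and prints a change document each time the `student` collection is modified, for example by an insert issued from another session.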
FAQs
Q. What are MongoDB change streams?
A. A MongoDB change stream is a real-time stream of database changes that flows from a database to an application.
Q. Does MongoDB allow duplicates?
A. MongoDB allows duplicate values unless you create a unique index on the field or group of fields.
Q. Is MongoDB a real-time database?
A. With change streams, MongoDB can be used as a real-time database.
Conclusion
- A MongoDB change stream is a real-time stream of database changes that flows from a database to an application.
- Change streams are available for replica sets and sharded clusters.
- MongoDB change streams can be opened against a collection, a database, or a deployment.
- Change streams are filterable, resumable, ordered, durable, and secure.
- Change streams notify only for data changes that have persisted to a majority of the data-bearing members of the replica set.
- Architectures with interdependent business systems benefit from change streams, which inform downstream systems once data changes are durable.