How To Create a MongoDB Database: 6 Critical Aspects To Know

Depending on your software’s requirements, you might prioritize flexibility, scalability, performance, or speed, which is why developers and businesses often struggle to pick a database. If you need a database that provides high flexibility and scalability, plus data aggregation for customer analytics, MongoDB may be the right fit for you!

In this article, we’ll be discussing the structure of the MongoDB database and how to create, monitor, and manage your database! Let’s get started.

How Is a MongoDB Database Structured?

MongoDB is a schema-less NoSQL database. This means you don’t specify a structure for the tables/databases as you do for SQL databases.

For some workloads, NoSQL databases can be faster than relational databases, thanks to characteristics like indexing, sharding, and aggregation pipelines. MongoDB is also known for its speedy query execution, which is why it’s used by companies like Google, Toyota, and Forbes.

Below, we’ll explore some key characteristics of MongoDB.

Documents

MongoDB has a document data model that stores data as JSON documents. The documents map naturally to the objects in the application code, making it more straightforward for developers to use.

In a relational database table, you must add a column to add a new field. That’s not the case with fields in a JSON document. Fields in a JSON document can differ from document to document, so they won’t be added to every record in the database.

Documents can store structures like arrays, which can be nested to express hierarchical relationships. Additionally, MongoDB converts documents into a binary JSON (BSON) format. This ensures faster access and broader support for data types like string, integer, boolean, and many more!
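
To make the flexible-field idea concrete, here is a minimal sketch using plain JavaScript objects, which is how documents appear in the shell. The field names (skills, address, offices) are made up for illustration and aren’t part of any real schema.

```javascript
// Two documents that could live in the same collection.
// All field names here are illustrative assumptions.
const chris = {
  name: "Chris",
  department: "Sales"
};

const ava = {
  name: "Ava",
  department: "Public Relations",
  // These fields exist only on this document.
  skills: ["writing", "media relations"],
  // Nested document containing a nested array.
  address: { city: "Austin", offices: ["HQ", "Remote"] }
};

// A field present on one document is simply absent on the other.
console.log("skills" in chris); // false
console.log(ava.address.offices.length); // 2
```

Neither document needed a column to be added anywhere first; each one just carries the fields it needs.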

Replica Sets

When you create a new database in MongoDB Atlas, the system automatically keeps at least 2 more copies of your data. The group of servers holding these copies is known as a “replica set,” and its members continuously replicate data between them, ensuring improved availability of your data. They also offer protection against downtime during a system failure or planned maintenance. A standalone server you install yourself doesn’t replicate data unless you configure a replica set.

Collections

A collection is a group of documents associated with one database. They’re similar to tables in relational databases.

Collections, however, are much more flexible. For one, they don’t rely on a schema. Secondly, the documents needn’t share the same fields or data types!

To view a list of the collections that belong to a database, use the command listCollections.

Aggregation Pipelines

You can use this framework to combine several operators and expressions. It’s flexible because it allows you to process, transform, and analyze data of any structure.

Because of this, MongoDB allows fast data flows and features over 150 operators and expressions. It also has several stages, like the $unionWith stage, which flexibly puts together results from multiple collections.
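
As a rough sketch of what a pipeline looks like, the snippet below builds one as an ordered array of stage documents. The collection name (orders) and field names (status, customerId, amount) are hypothetical; the aggregate call only runs when the snippet is pasted into the MongoDB shell, where db is defined.

```javascript
// A pipeline is an ordered array of stages; each stage feeds the next.
// "orders", "status", "customerId" and "amount" are assumed names.
const pipeline = [
  // Keep only completed orders.
  { $match: { status: "completed" } },
  // Sum the order amounts per customer.
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  // Largest totals first.
  { $sort: { total: -1 } }
];

// Inside the MongoDB shell `db` exists and the pipeline is executed;
// anywhere else this block only builds the pipeline.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipeline).forEach(printjson);
}
```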

Indexes

You can index any field in a MongoDB document to increase its efficiency and improve query speed. Indexing saves time by scanning the index to limit the documents inspected. Isn’t this far better than reading every document in the collection?

You can use various indexing strategies, including compound indexes on multiple fields. For example, say you’ve got several documents containing the employee’s first and last names in separate fields. If you want the first and last name to be returned together, you can create a single index that includes both “Last name” and “First name”. This would be much better than having one index on “Last name” and another on “First name”.
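
A compound index like the one described above might be declared as follows. This is only a sketch: the collection name (employees) and the field names (lastName, firstName) are assumptions, and the shell calls run only where db is defined.

```javascript
// Compound index specification: key order matters, so lastName leads.
// "employees", "lastName" and "firstName" are assumed names.
const nameIndex = { lastName: 1, firstName: 1 };

// A query that filters on both indexed fields.
const query = { lastName: "Smith", firstName: "Ava" };

if (typeof db !== "undefined") {
  db.employees.createIndex(nameIndex);
  // explain() reports which plan (and index) the query planner selected.
  printjson(db.employees.find(query).explain("queryPlanner"));
}
```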

You can leverage tools like Performance Advisor to further understand which query could benefit from indexes.

Sharding

Sharding distributes a single dataset across multiple databases. That dataset can then be stored on multiple machines to increase the total storage capacity of a system. This is because it splits larger datasets into smaller chunks and stores them in various data nodes.

MongoDB shards data at the collection level, distributing documents in a collection across the shards in a cluster. This ensures scalability by allowing the architecture to handle the largest applications.

How To Create a MongoDB Database

You’ll first need to install the MongoDB package suitable for your OS. Go to the “Download MongoDB Community Server” page. From the available options, select the latest “version”, the zip file “package” format, and your OS as the “platform”, then click “Download” as depicted below:

This image depicts the available options- Version, Platform, and Package while downloading MongoDB Community Server. — MongoDB community server download process. (Image source: MongoDB Community Server)

The process is quite straightforward, so you’ll have MongoDB installed in your system in no time!

Once you’ve done the installation, open your command prompt and type in mongod --version to verify it. If you don’t get the following output and instead see a string of errors, you might have to reinstall it:

This is a code snippet to check the MongoDB version after installation. — Verifying MongoDB version. (Image source: configserverfirewall)

Using MongoDB Shell

Before we get started, make sure that:

Your client has Transport Layer Security and is on your IP allowlist.
You have a user account and password on the desired MongoDB cluster.
You’ve installed MongoDB on your device.

Step 1: Access the MongoDB Shell

Start the MongoDB server by following the instructions for your OS. On Windows, type the following command. For other OSs, refer to the MongoDB documentation.

net start MongoDB

This should give the following output:

This is a code snippet to initialize the MongoDB server — Running MongoDB server (Image source: c-sharpcorner)

The previous command started the MongoDB server. To connect to it, type mongo in the command prompt to open the MongoDB shell (in MongoDB 6.0 and later, the legacy mongo shell has been replaced by mongosh).

This is a code snippet to run the MongoDB server. — Running MongoDB shell. (Image source: bmc)

Here in the MongoDB shell, we can execute commands to create databases, insert data, edit data, issue administrative commands, and delete data.

Step 2: Create Your Database

Unlike common relational databases, MongoDB doesn’t have a database creation command. Instead, there is a keyword called use, which switches to a specified database. If the database doesn’t exist, MongoDB creates it once you first store data in it; otherwise, use connects you to the existing database.

For example, to initiate a database called “company”, type in:

use company

This is a code snippet to create a database in MongoDB. — Creating database in MongoDB.

You can type in db to confirm the database you just created in your system. If the new database you created pops up, you’ve successfully connected to it.

If you want to check the existing databases, type in show dbs and it will return all the databases in your system:

This is a code snippet to view the existing databases in the system. — Viewing databases in MongoDB.

By default, installing MongoDB creates the admin, config, and local databases.

Did you notice that the database we created isn’t displayed? This is because we haven’t saved values into the database yet! We will be discussing insertion under the database management section.

Using Atlas UI

You could also get started with MongoDB’s database service, Atlas. While you may need to pay to access some features of Atlas, most database functionalities are available with the free tier. The features of the free tier are more than enough to create a MongoDB database.

Before we get started, make sure that:

Your IP is on the allowlist.
You have a user account and password on the MongoDB cluster you want to use.

To create a MongoDB database with the Atlas UI, open a browser window and log in to https://cloud.mongodb.com. From your cluster page, click Browse Collections. If there are no databases in the cluster, you can create your database by clicking the Add My Own Data button.

The prompt will ask you to provide a database and collection name. Once you’ve named them, click Create, and you’re done! You can now enter new documents or connect to the database using drivers.

Managing Your MongoDB Database

In this section, we’ll go over a few nifty ways to manage your MongoDB database effectively. You can do this by either using the MongoDB Compass or through collections.

Using Collections

While relational databases possess well-defined tables with specified data types and columns, NoSQL has collections instead of tables. These collections don’t have any structure, and documents can vary — you can have different data types and fields without having to match another document’s format in the same collection.

To demonstrate, let’s create a collection called “Employee” and add a document to it:

db.Employee.insert(
  {
    "EmployeeName" : "Chris",
    "EmployeeDepartment" : "Sales"
  }
)

If the insertion is successful, it will return WriteResult({ "nInserted" : 1 }). (In the newer mongosh shell, insert() is deprecated in favor of insertOne() and insertMany(), which return an acknowledgment document instead.)

Successful insertion in MongoDB.

Here, “db” refers to the currently connected database. “Employee” is the newly created collection on the company database.

We haven’t set a primary key here because MongoDB automatically creates a primary key field called “_id” and assigns it a default value.

Run the below command to check out the collection in JSON format:

db.Employee.find().forEach(printjson)

Output:

{
  "_id" : ObjectId("63151427a4dd187757d135b8"),
  "EmployeeName" : "Chris",
  "EmployeeDepartment" : "Sales"
}

While the “_id” value is assigned automatically, you can set the primary key yourself. This time, we’ll insert another document into the “Employee” collection, with the “_id” value as “1”:

db.Employee.insert(
  {  
   	"_id" : 1,
   	"EmployeeName" : "Ava",
   	"EmployeeDepartment" : "Public Relations"
  }
)

On running the command db.Employee.find().forEach(printjson) we get the following output:

The output shows the documents in the Employee collection along with their primary key — Documents in the collection with their primary key.

In the above output, the “_id” value for “Ava” is set to “1” instead of being assigned a value automatically.

Now that we’ve successfully added values into the database, we can check if it shows up under the existing databases in our system using the following command:

show dbs

The output shows the Employee collection in the existing databases in our system. — Displaying the list of databases.

And voila! You have successfully created a database in your system!

Using the MongoDB Compass

Although we can work with MongoDB servers from the Mongo shell, it can sometimes be tedious. You might experience this in a production environment.

However, MongoDB provides a GUI tool (appropriately named Compass) that can make it easier. It offers a graphical interface and added functionalities like data visualization, performance profiling, and CRUD (create, read, update, delete) access to data, databases, and collections.

You can download Compass for your OS and follow its straightforward installation process.

Next, open the application and create a connection with the server by pasting the connection string. If you can’t find it, you can click Fill in connection fields individually. If you didn’t change the port number while installing MongoDB, just click the connect button, and you’re in! Else, just enter the values you set and click Connect.

New Connection window in MongoDB, where you can paste the connection URL. (Image source: mongodb)

Next, provide the Hostname, Port, and Authentication in the New Connection window.

In MongoDB Compass, you can create a database and add its first collection simultaneously. Here’s how you do it:

Click Create Database to open the prompt.
Enter the name of the database and its first collection.
Click Create Database.

You can insert more documents into your database by clicking on your database’s name, and then clicking on the collection’s name to see the Documents tab. You can then click the Add Data button to insert one or more documents into your collection.

While adding your documents, you may enter them one at a time or as multiple documents in an array. If you’re adding multiple documents, ensure these comma-separated documents are enclosed in square brackets. For example:

[
  { _id: 1, item: { name: "apple", code: "123" }, qty: 15, tags: [ "A", "B", "C" ] },
  { _id: 2, item: { name: "banana", code: "123" }, qty: 20, tags: [ "B" ] },
  { _id: 3, item: { name: "spinach", code: "456" }, qty: 25, tags: [ "A", "B" ] },
  { _id: 4, item: { name: "lentils", code: "456" }, qty: 30, tags: [ "B", "A" ] },
  { _id: 5, item: { name: "pears", code: "000" }, qty: 20, tags: [ [ "A", "B" ], "C" ] },
  { _id: 6, item: { name: "strawberry", code: "123" }, tags: [ "B" ] }
]

Finally, click Insert to add the documents to your collection. This is what a document’s body would look like:

{
  "StudentID" : 1,
  "StudentName" : "JohnDoe"
}

Here, the field names are “StudentID” and “StudentName”. The field values are “1” and “JohnDoe” respectively.

Useful Commands

You can manage these collections through role management and user management commands.

User Management Commands

MongoDB user management commands let you create, update, and delete users.

dropUser

This command removes a single user from the specified database. Below is the syntax:

db.dropUser(username, writeConcern)

Here, username is a required field that specifies the name of the user to remove from the database. The optional field writeConcern sets the level of write concern for the removal operation.

Before dropping a user who has the userAdminAnyDatabase role, make sure that there is at least one other user with user administration privileges.

In this example, we’ll drop the user “user26” in the test database:

use test
db.dropUser("user26", {w: "majority", wtimeout: 4000})

Output:

> db.dropUser("user26", {w: "majority", wtimeout: 4000});
true

createUser

This command creates a new user for the specified database as follows:

db.createUser(user, writeConcern)

Here, user is a required field that contains the document with authentication and access information about the user to create. The optional field writeConcern sets the level of write concern for the creation operation.

createUser will return a duplicate user error if the user already exists on the database.

You can create a new user in the test database as follows:

use test
db.createUser(
  {
    user: "user26",
    pwd: "myuser123",
    roles: [ "readWrite" ]  
  }
);

The output is as follows:

Successfully added user: { "user" : "user26", "roles" : [ "readWrite" ] }
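
The example above places the password directly in the command and leaves out the optional write concern. As a hedged variation, the sketch below uses the shell’s passwordPrompt() helper to ask for the password interactively and passes a write concern as the second argument. The buildUserDoc helper and the user name user27 are hypothetical, introduced here only for illustration.

```javascript
// Hypothetical helper: assembles the user document passed to createUser.
function buildUserDoc(name, password, roles) {
  return { user: name, pwd: password, roles: roles };
}

// Optional write concern, given as the second argument to createUser.
const writeConcern = { w: "majority", wtimeout: 4000 };

// Runs only inside the MongoDB shell, where `db` and `passwordPrompt` exist.
if (typeof db !== "undefined" && typeof passwordPrompt === "function") {
  const testDb = db.getSiblingDB("test");
  // passwordPrompt() reads the password from the terminal instead of
  // embedding it in the command text.
  testDb.createUser(
    buildUserDoc("user27", passwordPrompt(), ["readWrite"]),
    writeConcern
  );
}
```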

grantRolesToUser

You can leverage this command to grant additional roles to a user. To use it, you need to keep the following syntax in mind:

db.runCommand(
  {
    grantRolesToUser: "<user>",
    roles: [ <roles> ],
    writeConcern: { <write concern> },
    comment: <any> 
  }
)

You can specify both user-defined and built-in roles in the roles field. If you want to specify a role that exists in the same database where grantRolesToUser runs, you can either specify the role with a document, as shown below:

{ role: "<role>", db: "<database>" }

Or, you can simply specify the role with the role’s name. For instance:

"readWrite"

If you want to specify a role that’s present in a different database, you must specify it with a document that names that database.

To grant a role on a database, you need the grantRole action on the specified database.

Here’s an example to give you a clear picture. Take, for instance, a user productUser00 in the products database with the following roles:

"roles" : [
  {
    "role" : "assetsWriter",
    "db" : "assets"
  }
]

The grantRolesToUser operation provides “productUser00” the readWrite role on the stock database and the read role on the products database:

use products
db.runCommand({
  grantRolesToUser: "productUser00",
  roles: [
    { role: "readWrite", db: "stock"},
    "read"
  ],
  writeConcern: { w: "majority" , wtimeout: 2000 }
})

The user productUser00 in the products database now possesses the following roles:

"roles" : [
  {
    "role" : "assetsWriter",
    "db" : "assets"
  },
  {
    "role" : "readWrite",
    "db" : "stock"
  },
  {
    "role" : "read",
    "db" : "products"
  }
]
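
The way role shorthand resolves in this example can be mimicked with a small helper. This is an illustrative sketch only: normalizeRoles is a made-up function, not part of MongoDB, and it simply applies the rule from the text that a bare role name refers to the database where the command runs.

```javascript
// Illustrative helper (not a MongoDB API): expands role shorthand.
// A string names a role in the current database; a document carries
// its own database name.
function normalizeRoles(roles, currentDb) {
  return roles.map((r) =>
    typeof r === "string" ? { role: r, db: currentDb } : { role: r.role, db: r.db }
  );
}

// The roles array from the grantRolesToUser example, run on "products".
const granted = normalizeRoles(
  [{ role: "readWrite", db: "stock" }, "read"],
  "products"
);

console.log(JSON.stringify(granted));
// [{"role":"readWrite","db":"stock"},{"role":"read","db":"products"}]
```

This matches the resulting roles list above, where the bare "read" ends up scoped to the products database.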

usersInfo

You can use the usersInfo command to return information about one or more users. Here’s the syntax:

db.runCommand(
  {
    usersInfo: <various>,
    showCredentials: <Boolean>,
    showCustomData: <Boolean>,
    showPrivileges: <Boolean>,
    showAuthenticationRestrictions: <Boolean>,
    filter: <document>,
    comment: <any> 
  }
)

In terms of access, users can always look at their own information. To look at another user’s information, the user running the command must have privileges that include the viewUser action on the other user’s database.

On running the usersInfo command, you can obtain the following information depending on the specified options:

{
  "users" : [
    {
      "_id" : "<db>.<username>",
      "userId" : <UUID>, // Starting in MongoDB 4.0.9
      "user" : "<username>",
      "db" : "<db>",
      "mechanisms" : [ ... ],  // Starting in MongoDB 4.0
      "customData" : <document>,
      "roles" : [ ... ],
      "credentials": { ... }, // only if showCredentials: true
      "inheritedRoles" : [ ... ],  // only if showPrivileges: true or showAuthenticationRestrictions: true
      "inheritedPrivileges" : [ ... ], // only if showPrivileges: true or showAuthenticationRestrictions: true
      "inheritedAuthenticationRestrictions" : [ ], // only if showPrivileges: true or showAuthenticationRestrictions: true
      "authenticationRestrictions" : [ ... ] // only if showAuthenticationRestrictions: true
    }
  ],
  "ok" : 1
}

Now that you have the general idea of what you can accomplish with the usersInfo command, the obvious next question that might pop up is, what commands would come in handy to look at specific users and multiple users?

Here are a few handy examples:

To look at the privileges and information, but not the credentials, for the user “Anthony” defined in the “office” database, execute the following command:

db.runCommand(
  {
    usersInfo:  { user: "Anthony", db: "office" },
    showPrivileges: true
  }
)

If you want to look at a user in the current database, you can mention the user by name only. For instance, if you are in the home database and a user named “Timothy” exists in the home database, you can run the following command:

db.getSiblingDB("home").runCommand(
  {
    usersInfo:  "Timothy",
    showPrivileges: true
  }
)

Next, you can use an array if you wish to look at the information for various users. You can either include the optional fields showCredentials and showPrivileges, or you can choose to leave them out. This is what the command would look like:

db.runCommand({
  usersInfo: [ { user: "Anthony", db: "office" }, { user: "Timothy", db: "home" } ],
  showPrivileges: true
})

revokeRolesFromUser

You can leverage the revokeRolesFromUser command to remove one or more roles from a user on the database where the roles are present. The revokeRolesFromUser command has the following syntax:

db.runCommand(
  {
    revokeRolesFromUser: "<user>",
    roles: [
      { role: "<role>", db: "<database>" } | "<role>",
    ],
    writeConcern: { <write concern> },
    comment: <any> 
  }
)

In the syntax mentioned above, you can specify both user-defined and built-in roles in the roles field. Similar to the grantRolesToUser command, you can specify the role you want to revoke in a document or use its name.

To successfully execute the revokeRolesFromUser command, you need to have the revokeRole action on the specified database.

Here’s an example to drive the point home. The user productUser00 in the products database has the following roles:

"roles" : [
  {
    "role" : "assetsWriter",
    "db" : "assets"
  },
  {
    "role" : "readWrite",
    "db" : "stock"
  },
  {
    "role" : "read",
    "db" : "products"
  }
]

The following revokeRolesFromUser command will remove two of the user’s roles: the “read” role from products and the assetsWriter role from the “assets” database:

use products
db.runCommand( { revokeRolesFromUser: "productUser00",
  roles: [
    { role: "assetsWriter", db: "assets" },
    "read"
  ],
  writeConcern: { w: "majority" }
} )

The user “productUser00” in the products database now only has one remaining role:

"roles" : [
  {
    "role" : "readWrite",
    "db" : "stock"
  }
]

Role Management Commands

Roles grant users access to resources. Several built-in roles can be used by administrators to control access to a MongoDB system. If the roles don’t cover the desired privileges, you can even go further to create new roles in a particular database.

dropRole

With the dropRole command, you can delete a user-defined role from the database on which you run the command. To execute this command, use the following syntax:

db.runCommand(
  {
    dropRole: "<role>",
    writeConcern: { <write concern> },
    comment: <any> 
  }
)

For successful execution, you must have the dropRole action on the specified database. The following operations would remove the writeTags role from the “products” database:

use products
db.runCommand(
  {
    dropRole: "writeTags",
    writeConcern: { w: "majority" }
  }
)

createRole

You can leverage the createRole command to create a role and specify its privileges. The role will apply to the database on which you choose to run the command. The createRole command would return a duplicate role error if the role already exists in the database.

To execute this command, follow the given syntax:

db.adminCommand(
  {
    createRole: "<new role>",
    privileges: [
      { resource: { <resource> }, actions: [ "<action>", ... ] },
    ],
    roles: [
      { role: "<role>", db: "<database>" } | "<role>",
    ],
    authenticationRestrictions: [
      {
        clientSource: ["<IP>" | "<CIDR range>", ...],
        serverAddress: ["<IP>" | "<CIDR range>", ...]
      },
    ],
    writeConcern: <write concern document>,
    comment: <any> 
  }
)

A role’s privileges apply to the database where the role was created. The role can inherit privileges from other roles in its database. A role created on the “admin” database, however, can include privileges that apply to either a cluster or all databases. It can also inherit privileges from roles present in other databases.

To create a role in a database, you need to have two things:

The grantRole action on that database to specify privileges for the new role as well as the roles to inherit from.
The createRole action on that database resource.

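The following createRole command will create a myClusterwideAdmin role on the admin database, since db.adminCommand always runs against admin (a custom name is used here because clusterAdmin is already the name of a built-in role):

db.adminCommand({ createRole: "myClusterwideAdmin",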
  privileges: [
    { resource: { cluster: true }, actions: [ "addShard" ] },
    { resource: { db: "config", collection: "" }, actions: [ "find", "remove" ] },
    { resource: { db: "users", collection: "usersCollection" }, actions: [ "update", "insert" ] },
    { resource: { db: "", collection: "" }, actions: [ "find" ] }
  ],
  roles: [
    { role: "read", db: "user" }
  ],
  writeConcern: { w: "majority" , wtimeout: 5000 }
})

grantRolesToRole

With the grantRolesToRole command, you can grant roles to a user-defined role. The grantRolesToRole command would affect roles on the database where the command is executed.

This grantRolesToRole command has the following syntax:

db.runCommand(
  {
    grantRolesToRole: "<role>",
    roles: [
     { role: "<role>", db: "<database>" },
    ],
    writeConcern: { <write concern> },
    comment: <any> 
  }
)

The access privileges are similar to the grantRolesToUser command — you need a grantRole action on a database for the proper execution of the command.

In the following example, you can use the grantRolesToRole command to update the productsReader role in the “products” database to inherit the privileges of the productsWriter role:

use products
db.runCommand(
  { 
    grantRolesToRole: "productsReader",
    roles: [
      "productsWriter"
    ],
    writeConcern: { w: "majority" , wtimeout: 5000 }
  }
)

revokePrivilegesFromRole

You can use revokePrivilegesFromRole to remove the specified privileges from the user-defined role on the database where the command is executed. For proper execution, you need to keep the following syntax in mind:

db.runCommand(
  {
    revokePrivilegesFromRole: "<role>",
    privileges: [
      { resource: { <resource> }, actions: [ "<action>", ... ] },
    ],
    writeConcern: <write concern document>,
    comment: <any> 
  }
)

To revoke a privilege, the “resource document” pattern must match that privilege’s “resource” field. The “actions” field can either be an exact match or a subset.

For example, consider the role manageRole in the managers database with the following privilege, which specifies the “managers” database as the resource:

{
  "resource" : {
    "db" : "managers",
    "collection" : ""
  },
  "actions" : [
    "insert",
    "remove"
  ]
}

You cannot revoke the “insert” or “remove” actions from just one collection in the managers database. The following operations cause no change in the role:

use managers
db.runCommand(
  {
    revokePrivilegesFromRole: "manageRole",
    privileges: [
      {
        resource : {
          db : "managers",
          collection : "kiosks"
        },
        actions : [
          "insert",
          "remove"
        ]
      }
    ]
  }
)

db.runCommand(
  {
    revokePrivilegesFromRole: "manageRole",
    privileges: [
      {
        resource : {
          db : "managers",
          collection : "kiosks"
        },
        actions : [
          "insert"
        ]
      }
    ]
  }
)

To revoke the “insert” and/or the “remove” actions from the role manageRole, you need to match the resource document exactly. For instance, the following operation revokes just the “remove” action from the existing privilege:

use managers
db.runCommand(
  {
    revokePrivilegesFromRole: "manageRole",
    privileges: [
      {
        resource : {
          db : "managers",
          collection : ""
        },
        actions : [ "remove" ]
      }
    ]
  }
)

The following operation will remove multiple privileges from the “executive” role in the managers database:

use managers
db.runCommand(
  {
    revokePrivilegesFromRole: "executive",
    privileges: [
      {
        resource: { db: "managers", collection: "" },
        actions: [ "insert", "remove", "find" ]
      },
      {
        resource: { db: "managers", collection: "partners" },
        actions: [ "update" ]
      }
    ],
    writeConcern: { w: "majority" }
    }
)
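
The matching rule described earlier in this section (the resource must match exactly, while the actions may be a subset) can be illustrated with a toy model in plain JavaScript. This is a sketch of the rule as the text states it, not MongoDB’s implementation; sameResource and revokePrivileges are hypothetical helpers.

```javascript
// Toy model only: a privilege is { resource: { db, collection }, actions: [...] }.
function sameResource(a, b) {
  return a.db === b.db && a.collection === b.collection;
}

// Returns a new privilege list with the requested actions removed.
// A revoke entry affects a privilege only when its resource matches exactly.
function revokePrivileges(current, toRevoke) {
  return current.map((priv) => {
    const hits = toRevoke.filter((r) => sameResource(r.resource, priv.resource));
    const removed = new Set(hits.flatMap((r) => r.actions));
    return {
      resource: priv.resource,
      actions: priv.actions.filter((a) => !removed.has(a))
    };
  });
}

const manageRole = [
  { resource: { db: "managers", collection: "" }, actions: ["insert", "remove"] }
];

// Different resource ("kiosks"): the role is left unchanged.
const unchanged = revokePrivileges(manageRole, [
  { resource: { db: "managers", collection: "kiosks" }, actions: ["insert"] }
]);
console.log(unchanged[0].actions); // [ 'insert', 'remove' ]

// Exact resource match: only "remove" is revoked.
const narrowed = revokePrivileges(manageRole, [
  { resource: { db: "managers", collection: "" }, actions: ["remove"] }
]);
console.log(narrowed[0].actions); // [ 'insert' ]
```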

rolesInfo

The rolesInfo command will return privilege and inheritance information for specified roles, including both built-in and user-defined roles. You can also leverage the rolesInfo command to retrieve all roles scoped to a database.

For proper execution, follow this syntax:

db.runCommand(
  {
    rolesInfo: { role: <name>, db: <db> },
    showPrivileges: <Boolean>,
    showBuiltinRoles: <Boolean>,
    comment: <any> 
  }
)

To return information for a role from the current database, you can specify its name as follows:

{ rolesInfo: "<rolename>" }

To return information for a role from another database, you can mention the role with a document that mentions the role and the database:

{ rolesInfo: { role: "<rolename>", db: "<database>" } }

For example, the following command returns the role inheritance information for the role executive defined in the managers database:

db.runCommand(
   {
      rolesInfo: { role: "executive", db: "managers" }
   }
)

This next command will return the role inheritance information for the role accountManager on the database on which the command is executed:

db.runCommand(
   {
      rolesInfo: "accountManager"
   }
)

The following command will return both the privileges and role inheritance for the role “executive” as defined on the managers database:

db.runCommand(
   {
     rolesInfo: { role: "executive", db: "managers" },
     showPrivileges: true
   }
)

To specify multiple roles, you can use an array. You can specify each role in the array as a string or a document.

You should use a string only if the role exists on the database on which the command is executed:

{
  rolesInfo: [
    "<rolename>",
    { role: "<rolename>", db: "<database>" },
  ]
}

For example, the following command will return information for three roles on three different databases:

db.runCommand(
   {
    rolesInfo: [
      { role: "executive", db: "managers" },
      { role: "accounts", db: "departments" },
      { role: "administrator", db: "products" }
    ]
  }
)

You can get both the privileges and the role inheritance as follows:

db.runCommand(
  {
    rolesInfo: [
      { role: "executive", db: "managers" },
      { role: "accounts", db: "departments" },
      { role: "administrator", db: "products" }
    ],
    showPrivileges: true
  }
)

Embedding MongoDB Documents for Better Performance

Document databases like MongoDB let you define your schema according to your needs. To create optimal schemas in MongoDB, you can nest the documents. So, instead of matching your application to a data model, you can build a data model that matches your use case.

Embedded documents let you store related data that you access together. While designing schemas for MongoDB, it’s recommended you embed documents by default. Use database-side or application-side joins and references only when they’re worthwhile.

Make sure that the workload can retrieve a document as often as required. At the same time, the document should also have all the data it needs. This is pivotal for your application’s exceptional performance.

Below, you’ll find a few different patterns to embed documents:

Embedded Document Pattern

You can use this to embed even complicated sub-structures in the documents they’re used with. Embedding connected data in a single document can decrease the number of read operations needed to get data. Generally, you should structure your schema so that your application receives all of its required information in a single read operation. Hence, the rule to keep in mind here is what’s used together should be stored together.

Embedded Subset Pattern

The embedded subset pattern is a hybrid case. You’d use it when a long list of related items lives in a separate collection, but you want to keep a few of those items at hand for display.

Here’s an example that lists movie reviews:

> db.movie.findOne()
{   
  _id: 321475,   
  title: "The Dark Knight"
}  
> db.review.find({movie_id: 321475})
{
  _id: 264579,
  movie_id: 321475,
  stars: 4,
  text: "Amazing"
}
{
  _id: 375684,
  movie_id: 321475,
  stars: 5,
  text: "Mindblowing"
}

Now, picture a thousand similar reviews, but you only plan to display the most recent two when you show a movie. In this scenario, it makes sense to store that subset as a list within the movie document:

> db.movie.findOne({_id: 321475})   
{   
  _id: 321475,   
  title: "The Dark Knight",   
  recent_reviews: [   
    {_id: 264579, stars: 4, text: "Amazing"},   
    {_id: 375684, stars: 5, text: "Mindblowing"}   
  ]   
}

Simply put, if you routinely access a subset of related items, make sure you embed it.
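
One possible way to keep such a subset current is to push each new review onto the embedded array and trim it in the same update. The sketch below builds that update document using the $each, $sort, and $slice modifiers of $push. The recentReviewUpdate helper and the sample review are hypothetical, and sorting on _id assumes that larger _id values are newer.

```javascript
// Hypothetical helper: builds an update that appends a review and keeps
// only the `keep` newest entries in recent_reviews.
// Assumption: a larger _id means a newer review.
function recentReviewUpdate(review, keep) {
  return {
    $push: {
      recent_reviews: { $each: [review], $sort: { _id: 1 }, $slice: -keep }
    }
  };
}

const newReview = { _id: 401122, stars: 3, text: "Good" }; // made-up review
const update = recentReviewUpdate(newReview, 2);

// Runs only inside the MongoDB shell, where `db` is defined.
if (typeof db !== "undefined") {
  // The full list of reviews still lives in its own collection.
  db.review.insertOne({ ...newReview, movie_id: 321475 });
  // The movie document keeps just the most recent subset.
  db.movie.updateOne({ _id: 321475 }, update);
}
```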

Independent Access

You might want to store sub-documents in their own collection to separate them from their parent collection.

For example, take a company’s product line. If the company sells a small set of products, you might want to store them within the company document. But if you want to reuse them across companies or access them directly by their stock keeping unit (SKU), you’d also want to store them in their own collection.

If you manipulate or access an entity independently, it’s best practice to make a collection to store it separately.

Unbounded Lists

Storing short lists of related information in the parent document has a drawback. If your list continues to grow unchecked, you shouldn’t be putting it in a single document. This is because you wouldn’t be able to support it for very long.

There are two reasons for this. First, MongoDB has a limit on the size of a single document (16 MB). Second, if you access a very large document too frequently, you’ll see negative results from uncontrolled memory usage.

To put it simply, if a list starts growing unboundedly, make a collection to store it separately.
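The idea can be sketched in plain JavaScript: take a document whose embedded list keeps growing and turn each list item into a standalone document that points back to its parent. The helper name splitOutList and the reviews field are assumptions made for illustration; in practice you would insert the resulting documents into their own collection.

```javascript
// Hypothetical helper: turns an embedded list into standalone documents
// that reference their parent through `refField`.
function splitOutList(parentDoc, listField, refField) {
  const { [listField]: items = [], ...rest } = parentDoc;
  const children = items.map((item) => ({ ...item, [refField]: parentDoc._id }));
  return { parent: rest, children };
}

// Made-up input: a movie that embeds all of its reviews.
const movieWithReviews = {
  _id: 321475,
  title: "The Dark Knight",
  reviews: [
    { _id: 264579, stars: 4, text: "Amazing" },
    { _id: 375684, stars: 5, text: "Mindblowing" },
  ],
};

const { parent: movieDoc, children: reviewDocs } = splitOutList(
  movieWithReviews,
  "reviews",
  "movie_id"
);

console.log(movieDoc);   // the movie without the embedded list
console.log(reviewDocs); // review documents, each carrying movie_id
```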

Extended Reference Pattern

The extended reference pattern is like the subset pattern. It also keeps information that you regularly access on the document itself.

Here, instead of a list, it’s leveraged when a document refers to another document (usually one stored in a different collection). At the same time, it also stores some fields from that other document for ready access.

For instance:

> db.movie.findOne({_id: 245434})
{   
  _id: 245434,   
  title: "Mission Impossible 4 - Ghost Protocol",   
  studio_id: 924935,   
  studio_name: "Paramount Pictures"   
}

As you can see, the “studio_id” is stored so that you can look up more information on the studio that created the film. But the studio’s name is also copied to this document for simplicity.

If the copied information changes in the source document, remember to update every document that holds a copy of it. In other words, if you routinely access some fields from a referenced document, embed them.
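Keeping the copied fields in sync could look like the sketch below. It builds the filter and update documents you might pass to db.movie.updateMany() after a studio is renamed, and models the effect on an in-memory array. The helper names, the second movie, and the new studio name are hypothetical.

```javascript
// Hypothetical helpers; field names follow the movie example above.

// Filter and update documents for db.movie.updateMany(filter, update).
function buildStudioRenameUpdate(studioId, newName) {
  return {
    filter: { studio_id: studioId },
    update: { $set: { studio_name: newName } },
  };
}

// Plain-JavaScript model of the same update applied to an array of movies.
function renameStudioInMovies(movies, studioId, newName) {
  return movies.map((m) =>
    m.studio_id === studioId ? { ...m, studio_name: newName } : m
  );
}

// Made-up data: the second movie belongs to a different studio.
const movies = [
  { _id: 245434, title: "Mission Impossible 4 - Ghost Protocol", studio_id: 924935, studio_name: "Paramount Pictures" },
  { _id: 245435, title: "Some Other Film", studio_id: 111111, studio_name: "Another Studio" },
];

const renamed = renameStudioInMovies(movies, 924935, "Paramount");
console.log(renamed.map((m) => m.studio_name)); // [ 'Paramount', 'Another Studio' ]
```

The trade-off is that every copy needs this extra write, so the pattern works best for fields that rarely change.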

How To Monitor MongoDB

You can use monitoring tools like Kinsta APM to debug long API calls, slow database queries, and long external URL requests, to name a few. You can also leverage commands to improve database performance and to inspect the health of your database instances.

Why Should You Monitor MongoDB Databases?

A key aspect of database administration planning is monitoring your cluster’s performance and health. MongoDB Atlas handles the majority of administration efforts through its fault-tolerance/scaling abilities.

Despite that, users need to know how to track clusters. They should also know how to scale or tweak whatever they need before hitting a crisis.

By monitoring MongoDB databases, you can:

Observe the utilization of resources.
Understand the current capacity of your database.
Detect and react to issues in real time to enhance your application stack.
Observe the presence of performance issues and abnormal behavior.
Align with your governance/data protection and service-level agreement (SLA) requirements.

Key Metrics To Monitor

While monitoring MongoDB, there are four key aspects you need to keep in mind:

MongoDB Hardware Metrics

Here are the primary metrics for monitoring hardware:

Normalized Process CPU

It’s defined as the percentage of time the CPU spends servicing the MongoDB process.

You can scale this to a range of 0-100% by dividing it by the number of CPU cores. It covers CPU time spent in modes such as user and kernel.

High kernel CPU might indicate that operating system operations are exhausting the CPU. High user CPU, which is tied to MongoDB operations, points to the database workload itself as the root cause of CPU exhaustion.

Normalized System CPU

It’s the percentage of CPU time used by all processes on the node, not only the MongoDB process. You can scale it to a range of 0-100% by dividing it by the number of CPU cores. It also covers the CPU time spent in modes such as iowait, user, kernel, steal, etc.

High user or kernel CPU might indicate CPU exhaustion caused by MongoDB operations (software). High iowait might indicate that storage exhaustion is causing the CPU exhaustion.
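Both normalized CPU metrics rely on the same scaling step, which is simple enough to sketch. The helper name normalizeCpu and the sample numbers are made up for illustration.

```javascript
// Hypothetical helper: scales a raw CPU percentage (which can exceed 100%
// on multi-core hosts) to a 0-100% range by dividing by the core count.
function normalizeCpu(rawPercent, cores) {
  if (!Number.isInteger(cores) || cores < 1) {
    throw new RangeError("cores must be a positive integer");
  }
  return rawPercent / cores;
}

// Made-up sample: 260% raw CPU on a 4-core host.
const normalized = normalizeCpu(260, 4);
console.log(`${normalized}%`); // 65%
```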

Disk IOPS

Disk IOPS is the average consumed IO operations per second on MongoDB’s disk partition.

Disk Latency

This is the read and write latency, in milliseconds, of the disk partition used by MongoDB. High values (>500ms) show that the storage layer might affect MongoDB’s performance.

System Memory

The system memory metric describes the bytes of physical memory used versus the bytes still available.

The available metric approximates the number of bytes of system memory that new applications could use without swapping.

Disk Space Free

This is defined as the total bytes of free disk space on MongoDB’s disk partition. MongoDB Atlas provides auto-scaling capabilities based on this metric.

Swap Usage

You can leverage a swap usage graph to describe how much memory is being placed on the swap device. A high used value in this graph means that swap is being utilized, which indicates that memory is under-provisioned for the current workload.

MongoDB Cluster’s Connection and Operation Metrics

Here are the main operation and connection metrics:

Operation Execution Times

This is the average time taken by operations (both reads and writes) over the selected sample period.

Opcounters

It is the average rate of operations executed per second over the selected sample period. The opcounters graph/metric shows the breakdown by operation type and the operation velocity for the instance.

Connections

This metric refers to the number of open connections to the instance. High spikes or numbers might point to a suboptimal connection strategy, either on the client side or from an unresponsive server.

Query Targeting and Query Executors

Query executors shows the average rate per second, over the selected sample period, of documents scanned during queries and query-plan evaluation. Query targeting shows the ratio between the number of documents scanned and the number of documents returned.

A high ratio points to suboptimal operations. These operations scan a lot of documents to return only a small part of them.
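For a single query, you can approximate this ratio yourself from the executionStats section that explain("executionStats") returns, which reports totalDocsExamined and nReturned. The sketch below assumes you’ve already extracted that section; the helper name and the sample numbers are made up.

```javascript
// Hypothetical helper: ratio of documents scanned to documents returned,
// computed from an executionStats-like object.
function queryTargetingRatio(executionStats) {
  const { totalDocsExamined, nReturned } = executionStats;
  if (nReturned === 0) {
    // Nothing returned: a ratio is only meaningful if nothing was scanned.
    return totalDocsExamined === 0 ? 0 : Infinity;
  }
  return totalDocsExamined / nReturned;
}

// Made-up samples: a collection scan versus an index-backed query.
const unindexedStats = { totalDocsExamined: 10000, nReturned: 10 };
const indexedStats = { totalDocsExamined: 10, nReturned: 10 };

console.log(queryTargetingRatio(unindexedStats)); // 1000
console.log(queryTargetingRatio(indexedStats));   // 1
```

A ratio close to 1 means the query reads little more than what it returns; a ratio in the hundreds or thousands usually suggests a missing or unsuitable index.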

Scan and Order

It describes the average rate per second, over the chosen sample period, of queries that return sorted results but cannot perform the sort operation using an index.

Queues

Queues describe the number of operations waiting for a lock, either write or read. High queues might indicate a less than optimal schema design. They could also indicate conflicting write paths, which cause heavy contention for database resources.

MongoDB Replication Metrics

Here are the primary metrics for replication monitoring:

Replication Oplog Window

This metric lists the approximate number of hours available in the primary’s replication oplog. If a secondary lags more than this amount, it can’t keep up and will need a full resync.

Replication Lag

Replication lag is defined as the approximate number of seconds a secondary node is behind the primary in write operations. High replication lag points to a secondary that has difficulty replicating. It might impact your operations’ latency, depending on the read/write concern of the connections.

Replication Headroom

This metric refers to the difference between the primary’s replication oplog window and the secondary’s replication lag. If this value goes to zero, it could cause a secondary to go into RECOVERING mode.
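The relationship between lag, oplog window, and headroom can be sketched with a little arithmetic. The snippet below assumes a status document shaped like the output of rs.status(), where each member has a name, a stateStr, and an optimeDate; the host names, timestamps, and the 24-hour oplog window are made-up sample values.

```javascript
// Hypothetical helper: seconds each secondary is behind the primary,
// based on the members' optimeDate fields.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary found in status document");
  }
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Headroom = oplog window minus lag, both expressed in seconds.
function replicationHeadroomSeconds(oplogWindowHours, lagSeconds) {
  return oplogWindowHours * 3600 - lagSeconds;
}

// Made-up sample resembling rs.status() output.
const sampleStatus = {
  members: [
    { name: "host1:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:02:00Z") },
    { name: "host2:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:00Z") },
  ],
};

const lags = replicationLagSeconds(sampleStatus);
console.log(lags); // host2 is 120 seconds behind
console.log(replicationHeadroomSeconds(24, lags[0].lagSeconds)); // 86280
```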

Opcounters – repl

Opcounters – repl is defined as the average rate of replication operations executed per second for the chosen sample period. With the opcounters – repl graph/metric, you can take a look at the operations velocity and the breakdown of operation types for the specified instance.

Oplog GB/Hour

This is defined as the average rate of gigabytes of oplog the primary generates per hour. Unexpectedly high volumes of oplog might point to a highly inefficient write workload or a schema design issue.

MongoDB Performance Monitoring Tools

MongoDB has built-in user interface tools in Cloud Manager, Atlas, and Ops Manager for performance tracking. It also provides some independent commands and tools to look at raw data. Below are some tools you can run from a host that has access to your environment and the appropriate roles:

mongotop

You can leverage this command to track the amount of time a MongoDB instance spends writing and reading data per collection. Use the following syntax:

mongotop <options> <connection-string> <polling-interval in seconds>

rs.status()

This command returns the replica set status. It’s executed from the point of view of the member where the method is executed.

mongostat

You can use the mongostat command to get a quick overview of the status of your MongoDB server instance. For optimal output, you can use it to watch a single instance for a specific event as it offers a real-time view.

Leverage this command to monitor basic server statistics such as lock queues, operation breakdown, MongoDB memory statistics, and connections/network:

mongostat <options> <connection-string> <polling interval in seconds>

dbStats

This command returns storage statistics for a specific database, such as the number of indexes and their size, total collection data versus storage size, and collection-related statistics (number of collections and documents).

db.serverStatus()

You can leverage the db.serverStatus() command to have an overview of the database’s state. It gives you a document representing the current instance metric counters. Execute this command at regular intervals to collate statistics about the instance.
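Because these counters are cumulative since the server started, two samples taken at a known interval can be turned into per-second rates. The sketch below assumes you’ve captured the opcounters sub-document of db.serverStatus() twice; the helper name and the sample numbers are made up.

```javascript
// Hypothetical helper: per-second operation rates between two
// opcounters samples taken `intervalSeconds` apart.
function opRates(before, after, intervalSeconds) {
  if (intervalSeconds <= 0) {
    throw new RangeError("intervalSeconds must be positive");
  }
  const rates = {};
  for (const op of Object.keys(after)) {
    rates[op] = (after[op] - (before[op] ?? 0)) / intervalSeconds;
  }
  return rates;
}

// Made-up samples shaped like db.serverStatus().opcounters, 60 seconds apart.
const sampleA = { insert: 1000, query: 5000, update: 200, delete: 10, getmore: 0, command: 8000 };
const sampleB = { insert: 1600, query: 8000, update: 260, delete: 10, getmore: 0, command: 9200 };

console.log(opRates(sampleA, sampleB, 60));
// { insert: 10, query: 50, update: 1, delete: 0, getmore: 0, command: 20 }
```

Note that the counters start over when the server restarts, so a negative rate would suggest that the two samples straddle a restart.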

collStats

The collStats command collects statistics similar to those offered by dbStats, but at the collection level. Its output consists of a count of objects in the collection, the amount of disk space consumed by the collection, the collection’s size, and information concerning its indexes.

You can use all these commands for real-time reporting and monitoring of the database server. They let you track database performance and errors, and they support informed decision-making when you refine a database.

How To Delete a MongoDB Database

To drop a database you created in MongoDB, you need to connect to it through the use keyword.

Say you created a database named “Engineers”. To connect to the database, you’ll use the following command:

use Engineers

Next, type db.dropDatabase() to get rid of this database. After execution, this is the result you can expect:

{ "dropped"  :  "Engineers", "ok" : 1 }

You can run the show dbs command to verify if the database still exists.

Summary

To squeeze every last drop of value from MongoDB, you must have a strong understanding of the fundamentals. Hence, it’s pivotal to know MongoDB databases like the back of your hand. This requires familiarizing yourself with the methods to create a database first.

In this article, we shed light on the different methods you can use to create a database in MongoDB, followed by a detailed description of some nifty MongoDB commands to keep you on top of your databases. Finally, we rounded off the discussion by discussing how you can leverage embedded documents and performance monitoring tools in MongoDB to ensure your workflow functions at peak efficiency.

What’s your take on these MongoDB commands? Did we miss out on an aspect or method you’d have liked to see here? Let us know in the comments!

Salman Ravoof

Salman Ravoof is a self-taught web developer, writer, creator, and a huge admirer of Free and Open Source Software (FOSS). Besides tech, he's excited by science, philosophy, photography, arts, cats, and food. Learn more about him on his website, and connect with Salman on Twitter.