When you work on a single machine everything is easy. Unfortunately when you have to scale and be fault tolerant you must relay on multiple hosts and manage a structure usually called “cluster“.

MongoDB enable you to create a replica set to be fault tolerant and use sharding to scale horizontally. Sharding is not transparent to DBAs. You have to choose a shard-key and adding and removing capacity when the system needs.

Structure

In a MongoDB cluster you have 3 fundamental “pieces”:

  • Servers: usually called mongod
  • Routers: usually called mongos.
  • Config servers

Servers are the place where you actually store your data, when you start the mongod command on your machine you are running a server. In a cluster usually you have multiple shard distributed over multiple servers.
Every shard is usually a replica set (2 or more servers) so if one of the servers goes down your cluster remains up and running.

Routers are the interaction point between users and the cluster. If you want to interact with your cluster you have to do throughout a mongos. The process route your request to the correct shard and gives back you the answer.

Config servers hold all the information about cluster configuration. They are very sensitive nodes and they are the real point of failure of the system.

mongodb_cluster

Shard Key

Choosing the shard key is the most important part when you create a cluster. There are a few important rule to follow learned after several errors:

  1. Don’t choose a shard key with a low cardinality. If one of this possibile values grow too much you can’t split it anymore.
  2. Don’t use an ascending shard key. The only shard who grows is the last one and distribute load on the other server always require a lot of traffic.
  3. Don’t use a random shard key. If quite efficient but you have to add an index to use it

A good choice is to use a coarsely ascending key combined to a search key (something you commonly query). This choice won’t work well for everything but it’s a good way to start thinking about.

N.B. All the informations and the image of the cluster strucure comes from the book below. I read it last week and I find it really interesting 🙂

Scaling MongoDB
by Kristina Chodorow

scaling_mongodb