Sharding — Choosing the Right q Value posted Tuesday, September 22, 2020 by The Neighbourhoodie CouchDB Team
One of CouchDB’s core features is scalability. There are two axis of scalability in CouchDB:
- Scaling the amount of data stored
- Scaling the number of requests handled
One mechanism is responsible for achieving this: sharding. Sharding means that what looks like a single database to the CouchDB API is in reality multiple parts. Those parts are all independent from each other and can live on one or more nodes of a CouchDB cluster.
This allows you to store more data in a single database than fits onto a single CouchDB node. It also allows you to handle more requests to that database than a single node can handle.
In addition, since CouchDB 3.0.0, you can now increase the number of shards of a database, while the cluster is fully operational.
The number of shards is identified by the value q
. In CouchDB 2 q
defaulted to 8, in anticipation of storing a lot of data in CouchDB. In CouchDB 3, q
got reduced to 2, since now you can dynamically increase the number as your data grows.
This leaves one more point to cover: what is the right value of q
for you?
As usual, everything depends on your exact usage of CouchDB, document size and structure, request patterns etc, but in general, our advice is a minimum of 2, and increasing in powers of 2, a q
for every 10GB of data, or 1M documents, whichever comes first.
So a database with 100GB of data and q=8
should start considering going to q=16
.