
Use Type in Doc _id

Posted 01/09/2020 by The Neighbourhoodie CouchDB Team

This post is part of our CouchDB Tips Series that we publish every week. The team at Neighbourhoodie works on and with CouchDB every day and we are happy to pass on all the tips and tricks we learn along the way. If you like what you see, check out our Professional Services for CouchDB, including production support and training. If you want to continuously ensure that your CouchDB is running optimally, sign up for Opservatory, our 24/7 CouchDB analysis and diagnostics tool.

When deciding which data goes into which CouchDB documents, it is often helpful to keep track of each document's type. For example, you could have documents for users and documents for articles.

The most common way to store the type is by adding a type field to the document:

{
  "_id": "123456",
  "type": "user",
  …
}

{
  "_id": "abcdef",
  "type": "article",
  …
}

There is an alternative that has advantages in certain situations. You can store the type inside the document _id:

{
  "_id": "user:123456",
  …
}

{
  "_id": "article:abcdef",
  …
}
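A minimal sketch of helpers for building and taking apart such _ids, in Python. The function names make_id and split_id are hypothetical and not part of CouchDB; the only convention assumed is the colon separator used in the examples above.

```python
def make_id(doc_type, key):
    """Build an _id of the form "<type>:<key>" (hypothetical helper)."""
    # The type itself must not contain the separator, otherwise
    # split_id() below could not recover it.
    if ":" in doc_type:
        raise ValueError("doc_type must not contain ':'")
    return f"{doc_type}:{key}"


def split_id(doc_id):
    """Split an _id at the first colon into (type, rest)."""
    doc_type, _, rest = doc_id.partition(":")
    return doc_type, rest


print(make_id("user", "123456"))
print(split_id("article:abcdef"))
```

Splitting at the first colon only means the rest of the _id is free to contain further colons.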

This approach pays off when your application often needs to query CouchDB for documents of a certain type, say, to list all articles. In the first case, you'd have to use a Mango query or a JavaScript view to get the documents you want. With the second approach, you can use the _all_docs endpoint with the startkey and endkey parameters: _all_docs is sorted by _id, so all documents that share a type prefix sit next to each other.
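To make the two requests concrete, here is a rough sketch in Python that only builds them and does not send anything. The server URL, the database name "blog", and both function names are assumptions made for illustration. The end of the range uses the high Unicode character \ufff0, a common convention for prefix queries in CouchDB and PouchDB.

```python
import json
from urllib.parse import urlencode


def all_docs_url_for_type(base_url, db, doc_type, include_docs=True):
    """Build an _all_docs URL covering every _id that starts with "<type>:"."""
    prefix = f"{doc_type}:"
    params = {
        # startkey and endkey must be JSON-encoded strings.
        "startkey": json.dumps(prefix),
        # "\ufff0" sorts after the characters typically used in _ids,
        # so the range ends just past the last "<type>:..." document.
        "endkey": json.dumps(prefix + "\ufff0"),
    }
    if include_docs:
        params["include_docs"] = "true"
    return f"{base_url}/{db}/_all_docs?{urlencode(params)}"


def mango_body_for_type(doc_type):
    """Request body for POST /{db}/_find when the type lives in a field."""
    return {"selector": {"type": doc_type}}


# Assumed local server and database name, for illustration only.
print(all_docs_url_for_type("http://localhost:5984", "blog", "article"))
print(json.dumps(mango_body_for_type("article")))
```

The first request is a plain GET against the primary index; the second is the equivalent lookup for documents that carry a type field.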

Using _all_docs this way lets you skip a secondary index for your most common type of query. Secondary indexes are perfectly fine, but each one has to be built, kept up to date, and stored, so not needing one saves you some computing resources.

In addition, secondary indexes in PouchDB up to version 7 come with a performance penalty that is more severe than that of secondary indexes in CouchDB (version 8 is going to remove it, at least for find()). So it is advantageous to reuse the _all_docs index as much as possible.

You can also use more segments. Say you need all articles by time; your _ids could look like this:

article:2020-05-25:abcdef
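A small in-memory sketch of why this works: ISO 8601 dates sort chronologically as plain strings, so a prefix such as article:2020-05 selects one month. The sample _ids, the helper names, and the scan() function are made up for illustration; scan() only stands in for an _all_docs request with startkey and endkey and is not a CouchDB API.

```python
from datetime import date

HIGH = "\ufff0"  # conventional "sorts last" character for prefix ranges


def article_id(published, key):
    """Build "article:<YYYY-MM-DD>:<key>" (hypothetical helper)."""
    return f"article:{published.isoformat()}:{key}"


def key_range(prefix):
    """Return (startkey, endkey) covering every _id with this prefix."""
    return prefix, prefix + HIGH


def scan(sorted_ids, startkey, endkey):
    """Stand-in for an _all_docs range read over already sorted _ids."""
    return [i for i in sorted_ids if startkey <= i <= endkey]


# Made-up sample data, sorted the way an _id index would be.
ids = sorted([
    article_id(date(2020, 4, 30), "aaa111"),
    article_id(date(2020, 5, 3), "bbb222"),
    article_id(date(2020, 5, 25), "abcdef"),
    article_id(date(2020, 6, 1), "ccc333"),
    "user:123456",
])

start, end = key_range("article:2020-05")
may_articles = scan(ids, start, end)
print(may_articles)
```

Shortening the prefix to article:2020 would select the whole year, and article: every article, all from the same index.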

If getting documents by type is not your primary way of accessing CouchDB, this tip is less useful. If it is, it can give you a good performance boost.