
Data Format Compatibility

Posted Friday, August 2, 2024 by The Neighbourhoodie Team

When storing data in CouchDB, you send JSON in one form or another. When you retrieve data from CouchDB it is also in the JSON format.
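
As a minimal sketch of that round trip, the Python below builds a `PUT` request carrying a JSON document and reads a document back with a `GET`. The helper names, the base URL `http://127.0.0.1:5984`, and the database and document names are placeholders chosen for illustration. Authentication is left out for brevity.

```python
import json
import urllib.parse
import urllib.request


def build_put_request(base_url: str, db: str, doc_id: str, doc: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request that stores `doc` as JSON."""
    body = json.dumps(doc).encode("utf-8")
    # Quote the document ID so characters such as "/" do not break the URL path.
    url = f"{base_url}/{db}/{urllib.parse.quote(doc_id, safe='')}"
    return urllib.request.Request(
        url=url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def fetch_doc(base_url: str, db: str, doc_id: str) -> dict:
    """Read a document back; the response body is JSON again."""
    url = f"{base_url}/{db}/{urllib.parse.quote(doc_id, safe='')}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen`, assuming a CouchDB instance is reachable at the placeholder address and the database already exists.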

In order to do its job of storing and retrieving data, supporting secondary indexes with Mango and Views, as well as Replication and the Changes Feed, CouchDB takes the JSON apart and stores it in an internal format that supports all the aforementioned features.

The data you send to CouchDB for safekeeping is ultimately stored in files on a filesystem. So whatever the internal format looks like, it ends up as bytes in a file.

Every once in a while, a new CouchDB version comes out with a new feature that requires a change to the internal data format, and thus to the bytes in your database files.

This raises the question of how CouchDB handles the different data formats, especially when a new version of CouchDB is installed and has to open a database written in the previous data format.

CouchDB handles this by versioning the database format. At the time of writing, the version is 8. In the nearly 17-year history of CouchDB, the data format has been updated only seven times, so this is a rare occurrence.

In addition to versioning the format, CouchDB also includes code to read not only the current data format, but also one or more previous versions of the format. It doesn’t support reading them all, because making all that work is rather complex, but you are guaranteed to at least be able to read from the immediately previous data format version.

The oldest disk version still supported is 4, which was current in CouchDB versions 0.10 to 0.11 in the late 2000s.
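
The policy described so far can be illustrated with a small decision function. This is only a sketch in Python, not CouchDB's actual (Erlang) implementation. It takes 8 as the current version and 4 as the lower bound of supported versions, as stated in the text. The function name and the returned strings are made up for illustration.

```python
CURRENT_DISK_VERSION = 8      # current format version, per the text
OLDEST_READABLE_VERSION = 4   # lower bound of supported versions, per the text


def plan_open(disk_version: int) -> str:
    """Decide what to do with a database file of the given format version."""
    if disk_version > CURRENT_DISK_VERSION:
        # The file was written by a newer release than the one running.
        return "refuse: format is newer than this release"
    if disk_version < OLDEST_READABLE_VERSION:
        # No reader code is kept for formats this old.
        return "refuse: format is too old to read"
    if disk_version < CURRENT_DISK_VERSION:
        # Readable, but in an older format.
        return "open and upgrade"
    return "open as-is"
```

For example, `plan_open(7)` yields `"open and upgrade"`, while `plan_open(3)` is refused.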

When a new data format version is available but a database is still in the previous format, CouchDB transparently converts the database to the new format. Older CouchDB versions cannot read the newer format, so if you need to be able to open the database with an older version of CouchDB, you have to make a physical backup copy before upgrading.
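
One way to take such a physical copy is sketched below in Python. It assumes that database files carry the `.couch` extension and live somewhere under a single data directory. The function name and both directory arguments are hypothetical, so adjust them to your installation. It also assumes you run it before starting the new CouchDB version, ideally while the old one is stopped.

```python
import shutil
from pathlib import Path


def backup_database_files(data_dir: str, backup_dir: str) -> list[Path]:
    """Copy every *.couch file under data_dir into backup_dir, keeping the layout."""
    src_root = Path(data_dir)
    dst_root = Path(backup_dir)
    copied = []
    # sorted() collects the file list up front, before any copy is made.
    for src in sorted(src_root.rglob("*.couch")):
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves file timestamps
        copied.append(dst)
    return copied
```

Keeping the relative directory layout means the backup can be copied back into a data directory of the older CouchDB version without renaming anything.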

All in all, CouchDB’s data format is very stable and designed to be relied upon for a long time.