Avoid Concurrent Writes to Document Properties

Posted Friday, September 6, 2024 by The Neighbourhoodie Team

Designing your data for use with CouchDB is usually pretty straightforward. Your application will make use of certain objects or data records like persons, events, tasks, etc. When thinking about storing these in CouchDB, you usually store an individual instance of each of those objects or data records in an individual CouchDB document:

{
  "_id": "person:12345",
  "name": "Harriet Tubman"
}

{
  "_id": "person:67890",
  "name": "George Floyd"
}

Determining where information about a person begins and where it ends is easy enough to do, but sometimes you want to store information that is a little more complicated.

Say you have a bike rental service at your workplace, so folks can run quick errands during their day. All they need to do is check out the bike and log the trip when they return it, so the maintenance folks know how much usage the bike got.

A bike is a self-contained thing, so you might model this in a single document in CouchDB:

{
  "_id": "bike:123",
  "vendor": "GT",
  "trips": [
    {
      "start": "2020-06-07 15:22",
      "end": "2020-06-07 15:46",
      "distance": 3.2
    }
  ]
}

Every time a new trip is recorded, you append a new entry in the trips array and all is well.

Or is it? There are two concerns with this approach. Let’s explore them one after the other.

  1. If the bike rental service is very popular, the list of trips might grow quite large. With each trip added, every operation on this document gets slower, because a larger document takes longer to process. After a few years, you might even run into practical document size limits. Now imagine something that happens far more often than bike trips, like a sensor in an IoT device taking measurements many times per hour. You're better off storing each measurement (or trip) in a separate document.
  2. What if a person gets back, is immediately distracted by work, and logs their trip a little while later? Their update might coincide with someone else returning the bike, and the two writes collide when both try to append to the trips array. In CouchDB, this causes a conflict. Conflicts are a normal occurrence in CouchDB that you need to understand and handle, but when designing your data model, you want to limit the opportunities for conflicts as much as possible. Again, storing trips in separate documents helps with this.
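
The second concern can be illustrated with a small simulation. This is only a sketch: `ToyDb` is a hypothetical in-memory stand-in that imitates CouchDB's rule that an update must carry the document's current `_rev`. It is not CouchDB's API, it uses plain integers where CouchDB uses revision strings, and the second trip and the trip IDs `a` and `b` are made up for illustration.

```javascript
// Toy in-memory stand-in for a CouchDB database (hypothetical, for illustration).
// It imitates only the revision check on writes, using integer revisions.
class ToyDb {
  constructor() {
    this.docs = new Map();
  }

  // Return a deep copy so callers cannot mutate stored state directly.
  get(id) {
    const doc = this.docs.get(id);
    return doc ? JSON.parse(JSON.stringify(doc)) : undefined;
  }

  // Accept a write only if the caller's _rev matches the stored revision.
  put(doc) {
    const current = this.docs.get(doc._id);
    const currentRev = current ? current._rev : undefined;
    if (doc._rev !== currentRev) {
      return { ok: false, error: "conflict" };
    }
    const nextRev = (current ? current._rev : 0) + 1;
    this.docs.set(doc._id, { ...JSON.parse(JSON.stringify(doc)), _rev: nextRev });
    return { ok: true, rev: nextRev };
  }
}

const db = new ToyDb();
db.put({ _id: "bike:123", vendor: "GT", trips: [] });

// Two riders read the same revision of the bike document...
const riderA = db.get("bike:123");
const riderB = db.get("bike:123");

// ...and each appends a trip to their own copy (second trip is made-up data).
riderA.trips.push({ start: "2020-06-07 15:22", end: "2020-06-07 15:46", distance: 3.2 });
riderB.trips.push({ start: "2020-06-07 16:05", end: "2020-06-07 16:30", distance: 4.1 });

const embeddedA = db.put(riderA); // accepted, the revision moves on
const embeddedB = db.put(riderB); // rejected, riderB holds a stale _rev

// With one document per trip, the two writes never touch the same document.
const separateA = db.put({
  _id: "bike:123:trip:a", start: "2020-06-07 15:22", end: "2020-06-07 15:46", distance: 3.2,
});
const separateB = db.put({
  _id: "bike:123:trip:b", start: "2020-06-07 16:05", end: "2020-06-07 16:30", distance: 4.1,
});

console.log(embeddedA.ok, embeddedB.error, separateA.ok, separateB.ok);
```

In the embedded model the second rider has to fetch the document again and retry; with separate trip documents there is nothing to retry, because each write creates a new document.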

What if you need all trips for a bike? A view will get you the result. Each trip is stored as its own document, like this:

{
  "_id": "bike:123:trip:xyz",
  "start": "2020-06-07 15:22",
  "end": "2020-06-07 15:46",
  "distance": 3.2
}
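
One way such a view might look is sketched below. It assumes trip IDs always follow the `bike:<bikeId>:trip:<tripId>` pattern shown above; the name `tripsByBike` and the choice of key and value are illustrative, not taken from the original post. Inside CouchDB the map function would be stored as an anonymous function in a design document, and CouchDB supplies `emit()`; the local `emit` and `rows` here are stand-ins so the logic can be tried outside the database.

```javascript
// Local stand-in for the emit() that CouchDB provides to map functions.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Hypothetical map function: index every trip document under its bike.
function tripsByBike(doc) {
  // Match IDs of the form "bike:<bikeId>:trip:<tripId>" (an assumption).
  var parts = doc._id.split(":");
  if (parts.length === 4 && parts[0] === "bike" && parts[2] === "trip") {
    // Key by [bike, start] so one bike's trips sort by start time.
    emit(["bike:" + parts[1], doc.start], doc.distance);
  }
}

// Try it on a bike document and a trip document.
var docs = [
  { _id: "bike:123", vendor: "GT" },
  { _id: "bike:123:trip:xyz", start: "2020-06-07 15:22", end: "2020-06-07 15:46", distance: 3.2 },
];
docs.forEach(tripsByBike);

console.log(JSON.stringify(rows));
```

Querying a view like this with `startkey=["bike:123"]` and `endkey=["bike:123", {}]` should return only the trips for that bike, since array keys are compared element by element and an object sorts after any string.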