neighbourhoodie-nnh-logo

Structured Query Server

SQL Queries for Apache CouchDB

Product Description

Structured Query Server (SQS) is an SQL query engine for CouchDB, letting you use complex SQL SELECT statements to extract information from a CouchDB database with extreme flexibility.

Use Cases

Developers can build more complex software on top of CouchDB faster without sacrificing performance or CouchDB’s unique replication feature.

SQS seamlessly complements CouchDB’s built-in querying mechanisms and allows for complex queries that have not been possible with your CouchDB data before, including finding documents by multiple criteria, sorting by another value, and complex JOINS.

Data analysts can now query your CouchDB with the tools they are used to without any additional infrastructure to maintain.

Instead of having to learn CouchDB’s idiosyncrasies compared to SQL databases, SQS gives your data analysts full access to your CouchDB data.

Just to add, loving it so far, its amazing how much capability this will bring to [our product] once its fully integrated!

Our current method of working around CouchDB Query limitations are a significant performance bottleneck.

This will replace large swaths of our custom code.

Features

HTTP interface that returns JSON: query with anything, from anywhere, no SQL client necessary.

SQS runs in parallel with CouchDB and keeps itself up to date.

Works out of the box with most database designs and node configurations.

Handles millions of documents per database.
Fast indexing (on average: on par with CouchDB JS views)

Structured Query Server is designed to feel as easy to use as the CouchDB views you already know:

  • Update a design doc to change your schema and SQS automatically picks it up.
  • SQS indexes are stored as hashes of the schema content, so you can “warm up” indexes in production before using them.
  • Supports _purge for strict data removal, for example to ensure GDPR compliance.
  • Includes _stats that show useful runtime statistics.
  • Can be configured with different log levels for everything from request logging to debugging and logs to stdout, a file or syslog.
  • Correctly handles single-node and clustered uses of CouchDB, including changes-feed rewinds.
  • Index compaction is handled automatically and can be switched to manual if needed.
  • Crash-proof design, just like CouchDB.

Installation and Usage

Structured Query Server is delivered as a binary package and is simple to set up and use.

Installation

  1. Start up the Structured Query Server and pass in the URL of your CouchDB as an argument.
  2. In each CouchDB database you wish to query, add a special design document that defines your schema and how your CouchDB documents should map to tables and rows.
  3. There is no step three.

Usage

Send SQL SELECT queries to the Structured Query Server via simple HTTP requests:

curl http://user:pass@couchdb.examople.com:4984/db/_design/sqs -X POST \
-d 'SELECT one_field FROM table WHERE other_field = 'value' AND third_field = 42 ORDER BY yet_another_field DESC LIMIT 5'

SQS is currently available for Linux x86_64, Mac x86_64, and Windows x86_64, with support for more architectures on the way.

In addition, we offer a support package for SQS, should you require assistance with the setup or production use of Structured Query Server. Should you have any questions, do not hesitate to get in touch.

Technical Details

Structured Query Server is built in Node.js and uses SQLite for persistence and querying. Node.js’s event based runtime is perfectly suited to implement SQS, which loads data from CouchDB over the network and stores the results in on-disk indexes, while also allowing high-concurrency and fast HTTP requests for SQL queries.

SQS is a HTTP server that automatically inherits your CouchDB security setup. No additional access control management is required.

Supports all SQLite features:

  • Subqueries, including correlated subqueries.
  • Up to 64-way JOINs.
  • LEFT, RIGHT, FULL INNER, OUTER and CROSS JOINs.
  • DISTINCT, ORDER BY, GROUP BY, HAVING, LIMIT, and OFFSET.
  • UNION, UNION ALL, INTERSECT, and EXCEPT.

Pricing Information

SQS is available as a one-time purchase or in 12 or 24 month instalments:

*VAT is added if applicable

Please consult our Terms of Service for more information.

To make sure you always get the help you need while using SQS we are offering product support separately:

SQS Support for only 44 EUR* per month/per user

  • Get in touch with our CouchDB experts whenever you need
  • Get help with the installation of SQS
  • Get help with all questions related to the use of SQS
  • Get information on new features

More information on the Structured Query Server Support page.

*VAT is added if applicable

FAQ

Is the use of SQS for my CouchDB limited?

CouchDB is very good at offering simple and medium-complexity queries with speed and scalability. In fact, CouchDB makes a conscious trade-off limiting the complexity of queries in favour of extreme scalability: you can run the same queries on a single node or a 100-node cluster. However, most CouchDB users use between 1–6 nodes and this allows Structured Query Server to make a different trade-off: allow very complex queries, including SQL JOINS, at the expense of extreme scalability. Structured Query Server is still plenty fast and can support a lot of data, but won’t keep up with a 100-node CouchDB cluster. Consult our performance guide below to see whether SQS is a viable option for your scenario.

Is SQS compatible with all other SQL software?

No, SQS is not a fully-fledged SQL interface to CouchDB that is compatible with all other SQL software, as it only supports SELECT statements.

Is SQS only available for SQLite?

Due to SQLite’s technical nature, per-database indexing is single-threaded and querying is serialised through a mutex to ensure correct results. While this leaves some performance opportunities on the table, we plan to address this with future products: Structured Query Server for MySQL, PostgreSQL, MSSQL & Oracle, if we get enough interest in them. Sign up here to register interest and receive news when those ship.

Tell me more about the performance of SQS

We’ve run extensive tests to ensure the best possible performance for Structured Query Server. These act as guidelines and order-of-magnitude comparisons for your own installations. This will let you decide whether Structured Query Server will work with your CouchDB setup. There are two main areas of interest when it comes to SQS performance: indexing and querying.

Indexing Performance

Here, we make two comparisons:

  1. How much slower is SQS at creating an index by reading the CouchDB _changes feed than just letting curl dump the _changes feed into /dev/null. Of course the latter measure is not practically useful, but it illustrates a theoretical optimum. SQS indexing will be slower, but it does full SQL indexing and persistence, at not-that-large a slowdown.
  2. How does SQS compare to CouchDB JavaScript views, the most complex default querying option? Here SQS does a lot more work at comparable speeds.

SQS vs _changes to /dev/null (theoretical optimum)

Number of docsDoc size in bytesDifference
1,000 — 10,000,0001003.5x (slower)
1,000 — 10,000,00010001.87x (slower)
1,000 — 10,000,00010,0001.98x (slower)

Our conclusion is that at a ~2x slowdown for typical databases versus streaming _changes into /dev/null, you get a full SQL query engine. Not bad!

SQS vs JS Views

q=1 database

Number of docsDoc size in bytesDifference
1,000 — 10,000,0001000.76x (faster)
1,000 — 10,000,00010000.84x (faster)
1,000 — 10,000,00010,0001.18x (slower)

q=2 database

Number of docsDoc size in bytesDifference
1,000 — 10,000,0001000.86x (faster)
1,000 — 10,000,00010000.99x (about the same)
1,000 — 10,000,00010,0001.56x (slower)

q=4 database

Number of docsDoc size in bytesDifference
1,000 — 10,000,0001000.95x (about the same)
1,000 — 10,000,00010001.07x (about the same)
1,000 — 10,000,00010,0001.81x (slower)

Our conclusion is that with increased sharding, CouchDB can make better use of multi-core systems, but even for typical large databases, SQS can hold its own. For small databases, SQS even beats JS views. Given that SQS allows for very complex queries, the observed numbers are very favourable.

Querying Performance

Here we measure the number of requests per second against SQS. Each run is 1,000,000 random queries against a fully built index. We ran this with concurrency levels of 1–1,000, the results are all in the same ballpark independent of the concurrency level.

Number of docsTime per request (ms)Requests per second
1,000 — 100,0002.73381.1
500,000 — 750,0003.01321.2
750,000 — 1,000,00012.266.65

Our conclusion is that up to 750,000 docs in the index, SQS is plenty fast for most workloads and after that the limits of the underlying SQLite library are starting to show, but the performance is still not bad. If you need more concurrent read performance at larger doc counts, sign up for our mailing list to get notified when we can reach higher limits.

If you need to reach higher limits for concurrent queries, you can run multiple instances of SQS on multi-core machines with spare CPU capacity or on dedicated machines.

Interested in working with us?

View the full list of our services on our homepage.