
Integration into Couchbase Lite

snej edited this page Oct 2, 2014 · 4 revisions

ForestDB is being integrated into a future release of the Couchbase Lite mobile database, replacing SQLite as the underlying data store. This is still early in development, but it's already working reliably and we are seeing performance improvements of 2x to 5x on common operations like populating, indexing and querying medium-sized databases.

ForestDB also eliminates the contention between readers and writers that has caused occasional temporary lock-ups in the current release. These lock-ups don't cause crashes or data loss, but they are still a problem because a mobile app's UI must remain responsive to user input. ForestDB guarantees that read operations performed by UI code will not be blocked by writes occurring on the replicator thread.

Schema

A single Couchbase Lite database is stored in multiple ForestDB databases, to provide independent keyspaces. One is the main document store; one is for "local" documents (which have a simpler structure and are used mostly for storing replication state); and there is another ForestDB database for each map/reduce view's index.

The document store

The document store uses UTF-8 document IDs as keys. The value (body) is in a custom binary format containing a revision tree, along with the JSON body of the current revision (and of any conflicting revisions).

The document metadata contains only the current revision ID and some flags indicating whether the document is in conflict and whether the current revision is a deletion. This allows the database to be iterated without reading the document bodies, since many operations (like sending a list of changes to the server) only need to know the document IDs and revision IDs.
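As a rough illustration of why compact metadata helps, the sketch below packs a revision ID and two flags into a few bytes and builds a changes list from metadata alone. The byte layout, the flag values, and the names `DocMeta` and `changes` are assumptions made for this example; they are not the actual format used by Couchbase Lite or CBForest.

```python
from dataclasses import dataclass

# Hypothetical flag bits; the real on-disk values are not given in this page.
FLAG_DELETED = 0x01
FLAG_CONFLICTED = 0x02


@dataclass(frozen=True)
class DocMeta:
    """Per-document metadata: current revision ID plus two status flags."""
    rev_id: str
    deleted: bool = False
    conflicted: bool = False

    def pack(self) -> bytes:
        # Assumed layout: one flag byte followed by the UTF-8 revision ID.
        flags = (FLAG_DELETED if self.deleted else 0) | \
                (FLAG_CONFLICTED if self.conflicted else 0)
        return bytes([flags]) + self.rev_id.encode("utf-8")

    @classmethod
    def unpack(cls, data: bytes) -> "DocMeta":
        flags = data[0]
        return cls(
            rev_id=data[1:].decode("utf-8"),
            deleted=bool(flags & FLAG_DELETED),
            conflicted=bool(flags & FLAG_CONFLICTED),
        )


def changes(meta_by_doc_id):
    """Yield (doc_id, rev_id, deleted) from metadata only.

    `meta_by_doc_id` stands in for a metadata-only iteration of the
    document store; document bodies are never loaded here.
    """
    for doc_id in sorted(meta_by_doc_id):
        meta = DocMeta.unpack(meta_by_doc_id[doc_id])
        yield doc_id, meta.rev_id, meta.deleted
```

With a mapping such as `{"doc1": DocMeta("2-abc").pack()}`, `list(changes(...))` produces the ID/revision pairs a replicator would need, without decoding any revision tree or JSON body.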

View indexes

The databases used for map/reduce view indexes have a complex key structure. The keys emitted by the map functions can be arbitrary JSON values, and there is a strict definition of how these values are collated. This is much higher-level than a simple lexical comparison: for instance, numbers are compared in numeric order, and arrays are compared element-by-element. To avoid the overhead of using a custom comparison function, these keys are transformed into a representation that is semantically equivalent but has the property that it can be compared lexicographically while preserving the defined ordering.

(More precisely, each key in the database is encoded from a two-element JSON array: the first element is the emitted key and the second is the ID of the document being indexed. This ensures that keys do not collide when multiple source documents emit the same key, and that such keys are traversed in a well-defined order.)
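The idea of an order-preserving key encoding can be sketched in a few lines. The encoding below is a minimal illustration, not the actual CBForest format: the tag values and the names `encode_collatable` and `index_key` are invented for this example, strings are compared by their UTF-8 bytes (a real implementation may apply Unicode-aware collation), and JSON objects are left out.

```python
import struct

# Type tags chosen so that plain byte comparison yields the order
# null < false < true < numbers < strings < arrays.
# 0x00 is reserved as a terminator, so a shorter string or array
# sorts before a longer one that starts with the same content.
END, NULL, FALSE, TRUE, NUMBER, STRING, ARRAY = range(7)


def encode_number(value):
    """Encode a number as 8 bytes whose byte order matches numeric order."""
    # Adding 0.0 normalizes -0.0 to 0.0 so both encode identically.
    bits = struct.unpack(">Q", struct.pack(">d", float(value) + 0.0))[0]
    if bits & (1 << 63):
        bits ^= 0xFFFFFFFFFFFFFFFF   # negative: invert all bits
    else:
        bits |= 1 << 63              # non-negative: set the sign bit
    return struct.pack(">Q", bits)


def encode_collatable(value):
    """Encode a JSON-like value so that bytes compare in collation order."""
    if value is None:
        return bytes([NULL])
    if isinstance(value, bool):      # must be checked before int
        return bytes([TRUE if value else FALSE])
    if isinstance(value, (int, float)):
        return bytes([NUMBER]) + encode_number(value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if b"\x00" in raw:
            raise ValueError("NUL characters are not handled in this sketch")
        return bytes([STRING]) + raw + bytes([END])
    if isinstance(value, list):
        body = b"".join(encode_collatable(item) for item in value)
        return bytes([ARRAY]) + body + bytes([END])
    raise TypeError("unsupported type in this sketch: %r" % type(value))


def index_key(emitted_key, doc_id):
    """Build the stored key from the pair [emitted key, document ID]."""
    return encode_collatable([emitted_key, doc_id])
```

Sorting a mixed list with `sorted(values, key=encode_collatable)` then places numbers in numeric order and compares arrays element by element, using nothing but byte comparison. Because the document ID is the second array element, two documents that emit the same key get distinct stored keys that sort next to each other.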

The body of an index document consists simply of the value emitted by the map function. For simplicity it's encoded the same way as the key, but it could just as well be stored as plain JSON.

In addition to these emitted key/value pairs, the index also contains a set of entries that map from a document ID (in the main database) to the list of sequence numbers corresponding to the values currently emitted by that document. These entries are used to remove obsolete key/value entries when a document is re-indexed after it's changed.
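The bookkeeping described above might look roughly like the following sketch. It uses a small in-memory stand-in for a store that assigns a sequence number to every write and can delete an entry by that number. `TinyStore`, `ViewIndex`, and their methods are hypothetical names for illustration and do not mirror the ForestDB or CBForest API.

```python
import itertools


class TinyStore:
    """In-memory stand-in for a key/value store with per-write sequence numbers."""

    def __init__(self):
        self._next_seq = itertools.count(1)
        self.by_key = {}   # key -> (sequence, value)
        self.by_seq = {}   # sequence -> key

    def put(self, key, value):
        old = self.by_key.get(key)
        if old is not None:
            del self.by_seq[old[0]]      # the overwritten entry's sequence is gone
        seq = next(self._next_seq)
        self.by_key[key] = (seq, value)
        self.by_seq[seq] = key
        return seq

    def delete_by_seq(self, seq):
        key = self.by_seq.pop(seq, None)
        if key is not None:
            del self.by_key[key]


class ViewIndex:
    """Map index that remembers which entries each source document produced."""

    def __init__(self, map_fn):
        self.map_fn = map_fn             # doc -> iterable of (key, value) pairs
        self.entries = TinyStore()       # (emitted key, doc ID) -> emitted value
        self.doc_seqs = {}               # doc ID -> sequences of its current entries

    def update(self, doc_id, doc):
        """Re-index one document; pass doc=None when it has been deleted."""
        # Remove every entry the previous revision of this document emitted.
        for seq in self.doc_seqs.pop(doc_id, []):
            self.entries.delete_by_seq(seq)
        if doc is None:
            return
        seqs = [self.entries.put((key, doc_id), value)
                for key, value in self.map_fn(doc)]
        if seqs:
            self.doc_seqs[doc_id] = seqs
```

Keeping the sequence list per document means a re-index never has to scan the whole index to find stale rows: it deletes exactly the entries recorded for that document and then writes the new ones.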

Implementation

We chose to write a C++ library, CBForest, as a glue layer between ForestDB and the Couchbase Lite implementation. This layer provides functionality that every implementation needs, such as revision-tree storage and map/reduce indexing, so it can be shared between the different implementations (Objective-C, Java, C#) of Couchbase Lite. It can also be used on its own by those who want a higher-level ForestDB API that is more idiomatic C++.

Try it out!

Instructions for building a ForestDB-powered version of Couchbase Lite for iOS/Mac are available on the Couchbase Lite wiki. (The Java and C# versions will be integrating ForestDB later.)
