Skip to content

An embeddable, persistent key-value store optimised for sequential reads.

License

Notifications You must be signed in to change notification settings

dborchard/stormdb

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

StormDB

An embeddable, persistent key-value store optimised for sequential reads.

Quality Gate Status Coverage Security Rating

Current Status: Used in production by CleverTap.

Goals

  • High speed sequential read
  • High speed random writes
  • Affinity to cluster recently updated keys for higher random IO on such keys
  • Ability to recover from file corruption

Benchmarks

We ran a small benchmark for 100 million 4 byte keys with 28 byte values, and the results are shown below:

Iteration

Random

Note: We've only compared this with RocksDB since in all our previous benchmarks, we've observed that RocksDB was the fastest.

More details about this benchmark may be found here.

Motivation

The primary motivation behind StormDB was to achieve fast sequential scans across all keys in the database. When we benchmarked sequential reads of popular key value stores, we realised:

  1. None of them are as fast as simply reading the file from an SSD
    • We used time cat 4GB_file >> /dev/null to understand the raw throughput possible
    • None of the key value stores out there delivered anything close to this
  2. Most are based on LSM, which requires all data to be sorted
    • This inherently doesn't provide an optimised technique to cluster recently updated keys together
  3. Although a few keys are frequently updated and queried, LSM based stores would require all their data to be cached by the operating system file cache
    • On the contrary, StormDB clusters recently updated keys at the head of its data file
    • Therefore, the operating system can cache just the head of the data file, and provide high throughput random reads without caching the entire data file

Getting Started

  1. Add the CleverTap Maven Bintray repository
<repositories>
    <repository>
        <id>bintray-clevertap-Maven</id>
        <name>bintray</name>
        <url>https://dl.bintray.com/clevertap/Maven</url>
    </repository>
</repositories>
  1. Add the Maven dependency
<!-- Heads up! Use the latest version from https://github.com/CleverTap/stormdb/releases/latest -->
<dependency>
    <groupId>com.clevertap</groupId>
    <artifactId>stormdb</artifactId>
    <version>1.0.0</version>
</dependency>
  1. Create and use a StormDB instance
// Create.
final StormDB db = new StormDBBuilder()
        .withDbDir("path/to/db")
        .withValueSize(28)
        .build();

// Put.
db.put(1, new byte[28]);

// Get.
final byte[] value = db.randomGet(1);

// Iterate over all keys.
db.iterate((key, data, dataOffset) -> {
    // Read 28 bytes starting at dataOffset in data[]. 
});

// Close.
db.close();

Limitations

  • Key sizes are restricted to 4 bytes (they're stored as integers internally)
  • A database must be opened indicating the value size
    • As a result, once created, all values are always of the specified size
    • This is fixed, as StormDB is an offset based database

FAQs

When should I consider using StormDB?

  • Your keys are limited to 4 bytes (a maximum of ~4.2 billion keys per database)
  • The size of your values are fixed
  • You need to iterate over all keys very quickly (and not sorted in any order, other than recently updated keys)

If the answers to all the points above are a hard "yes", then you'd benefit from StormDB.

Contributing

We welcome anybody and everybody to contribute to the code, design, or report bugs. If there's a feature which you'd like to see implemented, open an issue about it. If you'd like to implement the feature yourself, please discuss the design before beginning its implementation.

A few pointers

Maintainers

StormDB is maintained by the CleverTap Labs team, with contributions from the entire engineering team.

License

StormDB is licensed under the MIT License. Please see LICENSE under the root directory.

About

An embeddable, persistent key-value store optimised for sequential reads.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 100.0%