Skip to content

Prefix Filter: Practically and Theoretically Better Than Bloom.

License

Notifications You must be signed in to change notification settings

TomerEven/Prefix-Filter

Repository files navigation

The Prefix Filter

Implementation of Prefix-Filter, which is an incremental filter (approximate set-membership queries). If you plan on using the Prefix-Filter, please cite our paper: Prefix Filter: Practically and Theoretically Better Than Bloom. Tomer Even, Guy Even, Adam Morrison. To appear in PVLDB, 15(7).

Short talk abouth the Prefix Filter: https://www.youtube.com/watch?v=KMVtvACSGo0

Prerequisites

  • Compiler: A C++17 compiler such as GNU G++ or LLVM Clang++.
  • CMake (Version 3.10 or higher).
  • System: Linux.
  • Hardware: Intel. Support of AVX512 is a must.
Python Packages To Produce The Graphs
  • matplotlib
  • brokenaxes (From here).
  • pandas

There are three main targets for three different benchmarks.

  1. measure_perf for benchmarking performance of insertions and lookups under various loads.
  2. measure_build for build-time.
  3. measure_fpp for evaluating the false positive probability, and the "effective space consumption".

There is also an example of how to use the Prefix-Filter in the file example.cpp:

#include "Tests/wrappers.hpp"

int main(){
    using spare = TC_shortcut; // Any incremental filter can replace TC_shortcut.
    using prefixFilter = Prefix_Filter<spare>; 

    size_t filter_max_capacity = 1'000'000; // Choose any size.
    prefixFilter example_filter = FilterAPI<prefixFilter>::ConstructFromAddCount(filter_max_capacity);

    uint64_t x1 = 0x0123'4567'89ab'cdef;
    FilterAPI<prefixFilter>::Add(x1, &example_filter); // Insertions of an item x1. Insertion can be performed only one step at a time.

    bool res = FilterAPI<prefixFilter>::Contain(x1, &example_filter); // Lookup of x1.
    assert(res); //No false negative.
    
    uint64_t y1 = ~0x0123'4567'89ab'cdef;
    bool res2 = FilterAPI<prefixFilter>::Contain(y1, &example_filter); // Lookup of y1.
    std::cout << res2 << std::endl; // Possible false positive. (Although with one item in the filter, this is highly unlikely.)
    return 0;
}

To build

git clone -b master https://github.com/TheHolyJoker/Prefix-Filter.git
cd Prefix-Filter
mkdir build
cd build
cmake .. -DCMAKE_C_COMPILER=gcc-10 -DCMAKE_CXX_COMPILER=g++-10
t="measure_perf measure_built measure_fpp"; time cmake --build ./ --target $t -j 20

On GCC release: Older releases like 9.3 will work. We recommend using newer releases, which seems to perform better.

To Run

To run all benchmarks (from Prefix-Filter/build directory):

cp ../RunAll.sh RunAll.sh 
./RunAll.sh
Specific Target

Any specific target (for example measure_perf) can be built and executed with as follows:

t="measure_perf"; time cmake --build ./ --target $t -j 20 && time taskset -c 2 ./$t 

Credits