Add the support of SCALING for bloom filter #1721
```diff
@@ -56,12 +56,36 @@ rocksdb::Status BloomChain::createBloomChain(const Slice &ns_key, double error_r
   block_split_bloom_filter.Init(metadata->bloom_bytes);

   auto batch = storage_->GetWriteBatchBase();
-  WriteBatchLogData log_data(kRedisBloomFilter, {"createSBChain"});
+  WriteBatchLogData log_data(kRedisBloomFilter, {"createBloomChain"});
   batch->PutLogData(log_data.Encode());

-  std::string sb_chain_meta_bytes;
-  metadata->Encode(&sb_chain_meta_bytes);
-  batch->Put(metadata_cf_handle_, ns_key, sb_chain_meta_bytes);
+  std::string bloom_chain_meta_bytes;
+  metadata->Encode(&bloom_chain_meta_bytes);
+  batch->Put(metadata_cf_handle_, ns_key, bloom_chain_meta_bytes);

   std::string bf_key = getBFKey(ns_key, *metadata, metadata->n_filters - 1);
   batch->Put(bf_key, block_split_bloom_filter.GetData());

   return storage_->Write(storage_->DefaultWriteOptions(), batch->GetWriteBatch());
 }

+rocksdb::Status BloomChain::createBloomFilter(const Slice &ns_key, BloomChainMetadata *metadata) {
+  uint32_t bloom_filter_bytes = BlockSplitBloomFilter::OptimalNumOfBytes(
+      static_cast<uint32_t>(metadata->base_capacity * pow(metadata->expansion, metadata->n_filters)),
+      metadata->error_rate);
+  metadata->n_filters += 1;
+  metadata->bloom_bytes += bloom_filter_bytes;
+
+  BlockSplitBloomFilter block_split_bloom_filter;
+  block_split_bloom_filter.Init(bloom_filter_bytes);
+
+  auto batch = storage_->GetWriteBatchBase();
+  WriteBatchLogData log_data(kRedisBloomFilter, {"createBloomFilter"});
+  batch->PutLogData(log_data.Encode());
+
+  std::string bloom_chain_meta_bytes;
+  metadata->Encode(&bloom_chain_meta_bytes);
+  batch->Put(metadata_cf_handle_, ns_key, bloom_chain_meta_bytes);
+
+  std::string bf_key = getBFKey(ns_key, *metadata, metadata->n_filters - 1);
+  batch->Put(bf_key, block_split_bloom_filter.GetData());
```

Review thread on the `OptimalNumOfBytes` sizing call:

- "Might not be strongly related to this patch, but should we enforce a hard limit to avoid a too-large BloomFilter?"
- "Does Redis have some limits here?"
- "RedisBloom has a maximum […] Cuckoo Filter […] I think our bitmap has some limits. And […]"
- "128 MiB is also too large for one key; maybe we can improve this by separating it into multiple subkeys in the future."
- "Previously I talked with @zncleon. Splitting into subkeys doesn't work like it does for bitmap, since a bloom filter tends to access bits at random across its whole space. Maybe a maximum size would help?"
@@ -147,18 +171,22 @@ rocksdb::Status BloomChain::Add(const Slice &user_key, const Slice &item, int *r | |
|
||
// insert | ||
if (!exist) { | ||
if (metadata.size + 1 > metadata.GetCapacity()) { // TODO: scaling would be supported later | ||
return rocksdb::Status::Aborted("filter is full"); | ||
if (metadata.size + 1 > metadata.GetCapacity()) { | ||
if (metadata.IsScaling()) { | ||
s = createBloomFilter(ns_key, &metadata); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the status seems not checked There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure it can be done like this. Now we have holding a "lock" here. And I propose a |
||
} else { | ||
return rocksdb::Status::Aborted("filter is full and is nonscaling"); | ||
} | ||
} | ||
s = bloomAdd(bf_key_list.back(), item_string); | ||
zncleon marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if (!s.ok()) return s; | ||
*ret = 1; | ||
metadata.size += 1; | ||
} | ||
|
||
std::string sb_chain_metadata_bytes; | ||
metadata.Encode(&sb_chain_metadata_bytes); | ||
batch->Put(metadata_cf_handle_, ns_key, sb_chain_metadata_bytes); | ||
std::string bloom_chain_metadata_bytes; | ||
metadata.Encode(&bloom_chain_metadata_bytes); | ||
batch->Put(metadata_cf_handle_, ns_key, bloom_chain_metadata_bytes); | ||
|
||
return storage_->Write(storage_->DefaultWriteOptions(), batch->GetWriteBatch()); | ||
} | ||
|
Review comment:

- "I think it's needless to comment on simple methods such as `IsScaling` or `GetCapacity`; they have self-explanatory names. BTW, the implementation of `IsScaling` is so short and simple that you could write it in the header file."