Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the migrate iterator for iterating key values #1989

Closed
Tracked by #1223
git-hulk opened this issue Jan 5, 2024 · 2 comments · Fixed by #2004
Closed
Tracked by #1223

Implement the migrate iterator for iterating key values #1989

git-hulk opened this issue Jan 5, 2024 · 2 comments · Fixed by #2004
Assignees

Comments

@git-hulk
Copy link
Member

git-hulk commented Jan 5, 2024

Currently, we need to iterate all keys in the database in different places like the cluster migration and kvrocks2redis, but don't have an iterator for this purpose. It's very error-prone to implement this in different places since Kvrocks may add a new column family in the future, and we must be careful to iterate all keys in all column families. This would be a burden for maintenance, So I think we could add an iterator to iterate all keys in all column families.

Proposal

For the API, we could generally comply with the rocksdb's iterator API, but we allow to add some extra functions if needed.

class Iterator {
public:
    Iterator(Storage *storage, const rocksdb::ReadOptions &options, const int slot = -1);
    ~KeyIterator();

    bool Valid() const;
    void Next();

    rocksdb::WriteBatch *Batch() const;
    Slice Key() const;
    Slice Value() const;
    RedisType Type() const;
};

And when implementing this iterator, it will iterate the metadata column family first and check its type, if it's not a string, then it will iterate the corresponding column family to get subkeys. That said, if we have a key foo with type hash, then the iterator will iterate foo and foo:field1, foo:field2, and so on.

This solution can bring those benefits:

  • The codes look more intutive
  • Can reuse this iterator if we want to iterate keys only
@git-hulk
Copy link
Member Author

@caipengbo Would you mind taking a look while you're free.

@caipengbo
Copy link
Contributor

Would you mind taking a look while you're free.

Good job, LGTM!

git-hulk added a commit that referenced this issue Jan 12, 2024
Currently, we need to iterate all keys in the database in different places like the cluster migration and kvrocks2redis, but don't have an iterator for this purpose. It's very error-prone to implement this in different places since Kvrocks may add a new column family in the future, and we must be careful to iterate all keys in all column families. This would be a burden for maintenance, So we want to implement an iterator for iterating keys.

```C++

DBIter iter(storage, read_option);
for (iter.Seek(); iter.Valid(); iter.Next()) {
    if (iter.Type() == kRedisString || iter.Type() == kRedisJSON) {
        // the string/json type didn't have subkeys
        continue;
    }

    auto subkey_iter = iter.GetSubKeyIterator();
    for (subkey_iter.Seek(); subkey_iter.Valid(); subkey_iter.Next()) {
        // handle its subkey and value here
    }
}

```

When using this iterator, it will iterate the metadata column family first and check its type, if it's not a string or JSON, then it will iterate the corresponding column family to get subkeys. That said, if we have a key foo with type hash, then the iterator will iterate foo and foo:field1, foo:field2, and so on.

This solution can bring those benefits:

- The codes look more intuitive
- Can reuse this iterator if we want to iterate keys only

This closes #1989
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants