Cursor.get returns key and value concatenated into key variable #25

Open
DrissiReda opened this issue Mar 20, 2019 · 10 comments

@DrissiReda

I get weird output from cursor.get: the key variable ends up containing both the key and the value, while the value variable contains only the value:

#include <cstdio>
#include <cstdlib>
#include <lmdb++.h>
using namespace lmdb;
int getsize(const lmdb::env& e){
    auto t = lmdb::txn::begin(e.handle(), nullptr, MDB_RDONLY);
    auto d = lmdb::dbi::open(t, nullptr);
    int r=d.size(t);
    t.abort();
    return r;
}


int main() {
  auto env = lmdb::env::create();
  env.set_mapsize(1UL * 1024UL * 1024UL * 1024UL); /* 1 GiB */
  env.open("./example.mdb", 0, 0664);
  {
    auto wtxn = lmdb::txn::begin(env);
    auto dbi = lmdb::dbi::open(wtxn, nullptr);
    char a[6] = "hello";
    dbi.put(wtxn, "email", "hello");
    dbi.put(wtxn, "key", "value");
    dbi.put(wtxn, "user", "johndoe");
    wtxn.commit();
  }
  {
      auto rtxn = lmdb::txn::begin(env);
      auto dbi = lmdb::dbi::open(rtxn, nullptr);
      auto cursor = lmdb::cursor::open(rtxn, dbi);
      lmdb::val k, v;
      while(cursor.get(k, v, MDB_NEXT)){
        printf("We got '%s'\nValue '%s'\n", k.data(), v.data());
      }
  }
  {
    std::printf("size is %d\n", getsize(env));
  }
return EXIT_SUCCESS;
}

Expected Output:

We got 'email'
Value 'hello'
We got 'key'
Value 'value'
We got 'user'
Value 'johndoe'
size is 3

Actual output:

We got 'emailhello'
Value 'hello'
We got 'keyvalue'
Value 'value'
We got 'userjohndoe'
Value 'johndoe'
size is 3

@hoytech

hoytech commented Mar 20, 2019

It looks like you are assuming your values will be NUL-terminated:

printf("We got '%s'\nValue '%s'\n", k.data(), v.data());

But your puts above aren't writing the NUL byte.

Don't use the C string routines since you won't be able to store NUL bytes in your keys/values. Instead do something like:

std::string myKey(k.data(), k.size());
std::string myValue(v.data(), v.size());
std::cout << "We got " << myKey << " and " << myValue << std::endl;

(untested)

Or, better yet, upgrade to C++17 and use my fork and get the string_view hotness :)
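
For readers following along, here is a minimal sketch of why the key appeared to contain the value too, together with two length-aware alternatives that work in C++11. It does not use LMDB at all: Slice is a hypothetical stand-in for lmdb::val, and the adjacent "emailhello" buffer is only an assumed illustration of a key sitting directly before its value in memory, not a statement about LMDB's exact page layout.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for lmdb::val: a pointer plus an explicit length.
// The pointed-to bytes are NOT guaranteed to be NUL-terminated.
struct Slice {
    const char* data;
    std::size_t size;
};

// Copy exactly `size` bytes. This never scans for a terminating NUL,
// so it also works for keys or values that contain NUL bytes.
std::string to_string(const Slice& s) {
    return std::string(s.data, s.size);
}

// Print at most `size` bytes with printf by passing an explicit precision.
// Caveat: %.*s still stops early at an embedded NUL byte.
void print_slice(const char* label, const Slice& s) {
    std::printf("%s '%.*s'\n", label, static_cast<int>(s.size), s.data);
}
```

With page = "emailhello", a key slice of {page, 5} and a value slice of {page + 5, 5}, to_string yields "email" and "hello", whereas treating the key pointer as a C string keeps reading until the next NUL byte and produces "emailhello". For binary data, std::string (or fwrite) is the safer choice because %.*s stops at an embedded NUL.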

@DrissiReda
Author

DrissiReda commented Mar 20, 2019

I know about string_view and about your fork, but I have constraints that prevent me from going above C++11. Since you're here, could you tell me how I can force my database to be sorted by insertion order (instead of lexicographically)? Also, on first creation of the database, it always crashes with "bus_error", but if I open the environment, close it, then open it again and run my code, it works.

EDIT: How can I make sure my puts add the NUL byte, in order to avoid this problem?

@hoytech

hoytech commented Mar 20, 2019

  1. Make it so your keys increase every time you insert, or use a secondary index

  2. I don't know, I'd need code to reproduce it. Make sure you are creating a big enough MAPSIZE, and the same mapsize each time. Make sure you commit the transaction that creates the tables.

  3. I suggest not storing the NUL byte. It's a waste of space since LMDB tracks the size anyway. Furthermore, relying on it means you can't store NUL bytes as part of your keys or values (as I described above).
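
One way to read point 1 ("make it so your keys increase every time you insert") is to use a counter as the key. The sketch below is an assumption about how that could be done, not code from this thread: it encodes the counter as 8 big-endian bytes so that byte-wise lexicographic ordering matches numeric order. The names encode_be64 and decode_be64 are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encode a counter as 8 big-endian bytes so that byte-wise (lexicographic)
// comparison of the encoded keys matches the numeric order of the counters.
std::string encode_be64(std::uint64_t n) {
    std::string out(8, '\0');
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<char>(n & 0xFF);  // lowest byte goes last
        n >>= 8;
    }
    return out;
}

// Inverse of encode_be64. Expects exactly 8 bytes of input.
std::uint64_t decode_be64(const std::string& s) {
    std::uint64_t n = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        n = (n << 8) | static_cast<unsigned char>(s[i]);
    }
    return n;
}
```

Because the encoded key contains NUL bytes for small counters, it has to be stored and read back with its explicit 8-byte length, which is the same reason the C string routines are discouraged above.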

@DrissiReda
Author

How can I use a secondary index?

@hoytech

hoytech commented Mar 20, 2019

Every time you insert into your main table, you also insert into another table. In this secondary table, the key is an increasing integer, and the value is the key into your main table.

Then when you want to iterate over your items in insertion order, iterate through the secondary index. For each value, use it as a key to look up the item from the main table.
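
The two-table scheme described above can be sketched without LMDB. In this illustration, two std::map objects stand in for the main table and the secondary table (std::map keeps its keys sorted, which is the assumed similarity to LMDB here); InsertionOrderedStore and its members are hypothetical names. In a real LMDB setup both writes would presumably go into the same write transaction so the tables cannot drift apart.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct InsertionOrderedStore {
    std::map<std::string, std::string> main_table;     // key -> value, sorted by key
    std::map<std::uint64_t, std::string> order_index;  // sequence number -> main key
    std::uint64_t next_seq = 0;

    // Insert into the main table and, for a new key, also record it in the
    // secondary table under the next increasing sequence number.
    void put(const std::string& key, const std::string& value) {
        if (main_table.find(key) == main_table.end()) {
            order_index[next_seq++] = key;
        }
        main_table[key] = value;
    }

    // Iterate the secondary table in sequence order and use each stored
    // value as a key to look up the item in the main table.
    std::vector<std::pair<std::string, std::string>> in_insertion_order() const {
        std::vector<std::pair<std::string, std::string>> out;
        for (const auto& entry : order_index) {
            auto it = main_table.find(entry.second);
            if (it != main_table.end()) {
                out.emplace_back(it->first, it->second);
            }
        }
        return out;
    }
};
```

Inserting "user", "email", "key" in that order yields the same order from in_insertion_order(), while iterating main_table directly would start at "email".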

@DrissiReda
Author

I'm sorry for all these questions, but I can't find anywhere else to ask them. Is it possible to get the position of an entry without iterating through my database with a cursor and incrementing a counter?

@hoytech

hoytech commented Mar 22, 2019

It's OK. What do you mean by "get the position"? You can position the cursor directly at a known key with cursor ops like MDB_SET, or go to somewhere nearby with MDB_SET_RANGE.

If you mean get an item by its position index (i.e., get the 10th element in the DB) then afaik this is not possible. If you're always appending to the DB (and never removing or inserting in the middle) then you could maintain the position index in a secondary index.
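
As a rough analogy for the two cursor ops mentioned above, the sketch below uses std::map in place of an LMDB table: find behaves like an exact-match seek (MDB_SET), and lower_bound like a seek to the first key greater than or equal to the requested one (MDB_SET_RANGE). The helper names seek_exact and seek_range are hypothetical, and this only illustrates the semantics, not the lmdb++ calls themselves.

```cpp
#include <map>
#include <string>

using Table = std::map<std::string, std::string>;

// Analogue of MDB_SET: succeeds only if `key` is present exactly.
bool seek_exact(const Table& table, const std::string& key, std::string& found_key) {
    auto it = table.find(key);
    if (it == table.end()) return false;
    found_key = it->first;
    return true;
}

// Analogue of MDB_SET_RANGE: lands on the first key >= `key`, if any.
bool seek_range(const Table& table, const std::string& key, std::string& found_key) {
    auto it = table.lower_bound(key);
    if (it == table.end()) return false;
    found_key = it->first;
    return true;
}
```

With the keys "email", "key", and "user", seeking "k" exactly fails, while a range seek for "k" lands on "key".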

@DrissiReda
Author

I meant doing a dbi.get() to look up the position index of a certain key (e.g., inputting "user_email" and receiving 10). Is this possible without keeping duplicate databases with secondary indexes?

@hoytech

hoytech commented Mar 22, 2019

AFAIK, no.

@DrissiReda
Author

I'll see which one would be more cost-effective: iterating through the database and counting, or keeping a duplicate of each concerned database with indices.
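
The two options being weighed can be compared in a small sketch. Again std::map stands in for an LMDB table, and position_by_scan and PositionIndex are hypothetical names. The scan costs time proportional to the number of entries on every lookup; the position table costs extra storage and one extra write per insert, and it only matches the scan result under the append-only assumption stated earlier in the thread (keys always added at the end, never removed or inserted in the middle).

```cpp
#include <map>
#include <string>

using Table = std::map<std::string, std::string>;

// Option 1: walk the table from the start and count. Returns -1 if absent.
long position_by_scan(const Table& table, const std::string& key) {
    long pos = 0;
    for (const auto& entry : table) {
        if (entry.first == key) return pos;
        ++pos;
    }
    return -1;
}

// Option 2: a secondary table mapping key -> position, filled at insert time.
struct PositionIndex {
    std::map<std::string, long> position_of;
    long next_position = 0;

    // Record a newly appended key; re-inserting a known key changes nothing.
    void record_append(const std::string& key) {
        if (position_of.find(key) == position_of.end()) {
            position_of[key] = next_position++;
        }
    }

    // Single lookup instead of a scan. Returns -1 if the key is unknown.
    long lookup(const std::string& key) const {
        auto it = position_of.find(key);
        return it == position_of.end() ? -1 : it->second;
    }
};
```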
