[WIP] Badger datastore #4007
repo/fsrepo/datastores.go
p = filepath.Join(r.path, p)
}

os.MkdirAll(p, 0755) //TODO: find better way
I would check errors here.
This was unpushed for some reason.
Please rebase.
Some initial numbers, adding all of ipfs dists (~2GB): badgerds: 19.88 seconds, 9% CPU. So badger is significantly faster than flatfs, and comparable to flatfs-nosync, while still actually syncing data to disk. This is really great stuff :)
downside is that querying the datastore ( |
Also, for context: sha256 hashing all the files from the above experiment took 5.43 seconds.
It seems to me like badger isn't syncing.
Can you try running the sync test and doing manual |
License: MIT Signed-off-by: Łukasz Magiera <[email protected]>
License: MIT Signed-off-by: Jeromy <[email protected]>
Some more benchmark numbers, adding ipfs dists again:
This looks like it's ready to go. It probably just wants some code review, cc @magik6k @Stebalien @kevina @Kubuxu. Also, thank you to @manishrjain and the badger crew for implementing the error handling and pushing more improvements to this. It's looking all-around better than our current filesystem-based blockstore.
Glad that it's working for you guys! I looked at the Badger code usage, you are not setting Also, can you ensure that you set GOMAXPROCS=128 -- a value large enough so that if you're running this on SSD, you'll be able to see the max IOPS allowed by the SSD. This is useful for key-value iteration and random Get throughput. Also, Badger writes work best if you do batch + goroutines. You can also now do We're also working on mmap of value log, which would significantly improve the random Get latency.
@manishrjain ah, no. We aren't setting Our calling code probably isn't taking advantage of the parallelism of the batch put as much as we should. I can look at tweaking that some and see how it affects the performance.
repo/config/profile.go
"path": "badgerds",
}
return nil
},
}
I may be missing something, but won't this config bypass the measure datastore? If so will this create a problem?
IIRC there is a global measure in fsrepo.
It's got to do with the fact that disk reads block OS threads, and how many threads can be scheduling disk reads at the same time. Full discussion here: https://groups.google.com/forum/#!topic/golang-nuts/jPb_h3TvlKE We set Dgraph by default to 128 threads. It doesn't add that much overhead, so it's a safe change to do. Consider either using For values below 100 bytes, we have seen 400K key-value writes per second on the cheapest i3 instance (with local SSD). P.S. Sent a PR to go-ipfs-badger. Small changes to how Badger is accessed.
We might want to put this in the experimental-features doc with some info on how to use this.
We can also increase GOMAXPROCS from within go ( |
I think this is good to go now. Thoughts? @magik6k @Stebalien @kevina ?
LGTM, though badger-ds could be updated to 0.2.1 (see https://github.com/ipfs/go-ds-badger/commits/master)
This LGTM but I have not been following the Badger datastore discussion so I am not really qualified to review this.
repo/fsrepo/datastores.go
badgerds "gx/ipfs/QmNWbaGdPCA3anCcvh4jm3VAahAbmmAsU58sp8Ti4KTJkL/go-ds-badger"
levelds "gx/ipfs/QmPdvXuXWAR6gtxxqZw42RtSADMwz4ijVmYHGS542b6cMz/go-ds-leveldb"
badger "gx/ipfs/QmQL7yJ4iWQdeAH9WvgJ4XYHS6m5DqL853Cck5SaUb8MAw/badger"
This isn't declared in package.json. However, if you update to go-ds-badger 0.2.1, you can use the re-exported badgerds.DefaultOptions and badgerds.Options and avoid this import altogether.
Is there going to be a guide to show how to use this?
@Stebalien Could you handle updating the badger-ds dependency? @hoffmabc Yes, we will add docs around this. But the simple case of making a new ipfs node that uses this is just
(and use the re-exported options instead of importing badger directly) License: MIT Signed-off-by: Steven Allen <[email protected]>
@whyrusleeping done.
I'm not sure @whyrusleeping to be honest. That seems pretty straightforward for using this.
Do any tests need to be written for this out of curiosity?
We have tests on the datastore in isolation, but you're right. It would be good to have some level of integration testing.
Also, we initialize ipfs using a config file where leveldb is the only option. Will this be extended to support this datastore? EDIT: I see the prerequisite now above.
For tests: we can/should try running the whole sharness suite over this datastore. To keep it from taking forever on travis/circle, it could be done as a jenkins pipeline stage which would pass additional profiles to ipfs/iptb init in sharness.
Datastore section in |
Is this getting merged today?
@hoffmabc Yeap! :)
Depends on:
TODO: