increase blockstore 'writecache' functionality #2850

Closed
whyrusleeping opened this issue Jun 13, 2016 · 5 comments
Labels
exp/expert Having worked on the specific codebase is important kind/enhancement A net-new feature or improvement to an existing feature status/in-progress In progress topic/perf Performance


@whyrusleeping
Member

We currently have a cache in blocks/blockstore/writecache.go that remembers blocks we have already written, so that later checks can avoid duplicate writes.

By far the most common datastore call is Has, and it causes a lot of disk contention. You can view the number of calls per method in the expvars that ipfs exports (localhost:5001/debug/vars).

We should be updating the cache on all Get, Has and PutMany calls as well. We should also investigate both increasing the cache size and what using a different data structure (such as a bloom filter) might look like.
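A minimal sketch of what updating the cache on Get, Has and PutMany could look like. The `Blockstore` interface, `cachedStore`, and `memStore` below are hypothetical, simplified stand-ins (string keys, an unbounded map) and not the actual types in writecache.go.

```go
package main

import (
	"errors"
	"sync"
)

// Blockstore is a hypothetical, simplified interface for illustration only.
type Blockstore interface {
	Has(key string) (bool, error)
	Get(key string) ([]byte, error)
	PutMany(blocks map[string][]byte) error
}

// cachedStore remembers keys known to be present, so repeated Has calls
// and duplicate writes can skip the backing store.
type cachedStore struct {
	mu      sync.Mutex
	present map[string]struct{} // unbounded for brevity; a real cache would be bounded
	inner   Blockstore
}

func newCachedStore(inner Blockstore) *cachedStore {
	return &cachedStore{present: make(map[string]struct{}), inner: inner}
}

func (c *cachedStore) known(key string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.present[key]
	return ok
}

func (c *cachedStore) remember(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.present[key] = struct{}{}
}

// Has answers from the cache when possible and records positive results.
func (c *cachedStore) Has(key string) (bool, error) {
	if c.known(key) {
		return true, nil
	}
	ok, err := c.inner.Has(key)
	if err == nil && ok {
		c.remember(key)
	}
	return ok, err
}

// Get always reads from the backing store, but a successful read proves presence.
func (c *cachedStore) Get(key string) ([]byte, error) {
	data, err := c.inner.Get(key)
	if err == nil {
		c.remember(key)
	}
	return data, err
}

// PutMany writes only the blocks not already known to be present.
func (c *cachedStore) PutMany(blocks map[string][]byte) error {
	missing := make(map[string][]byte)
	for k, v := range blocks {
		if !c.known(k) {
			missing[k] = v
		}
	}
	if len(missing) == 0 {
		return nil
	}
	if err := c.inner.PutMany(missing); err != nil {
		return err
	}
	for k := range missing {
		c.remember(k)
	}
	return nil
}

// memStore is an in-memory backing store that counts Has calls, for demonstration.
type memStore struct {
	data     map[string][]byte
	hasCalls int
}

func (m *memStore) Has(key string) (bool, error) {
	m.hasCalls++
	_, ok := m.data[key]
	return ok, nil
}

func (m *memStore) Get(key string) ([]byte, error) {
	d, ok := m.data[key]
	if !ok {
		return nil, errors.New("block not found")
	}
	return d, nil
}

func (m *memStore) PutMany(blocks map[string][]byte) error {
	for k, v := range blocks {
		m.data[k] = v
	}
	return nil
}
```

Note that this sketch only caches positive results, so it would need to be invalidated whenever blocks are deleted (for example during GC).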

@whyrusleeping whyrusleeping added help wanted Seeking public contribution on this issue exp/expert Having worked on the specific codebase is important labels Jun 13, 2016
@kevina
Contributor

kevina commented Jun 20, 2016

@whyrusleeping if you are going to use a bloom filter I would think you would want to implement that on the 'flatfs' datastore so that it can be complete and saved to disk.

If the flatfs datastore has a bloom filter, then I imagine the write cache can probably be eliminated.

@Kubuxu
Member

Kubuxu commented Jun 20, 2016

We can use a bloom filter for smart cache management, but not for caching Has itself, unless we run through the storage during startup and build a bloom filter of all available blocks.


Scrap That
A bloom filter will only work well if most calls to, for example, Has return false; otherwise you have to check the real storage either way.

For Get and caching we would have to create a weighted bloom filter, something I would be quite interested in doing.

(Note to self, to read in the future: weighted bloom filter paper)
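To illustrate the point above, here is a minimal sketch of a bloom filter placed in front of Has. The `bloom` type, `filteredHas`, and the choice of FNV hashing with double hashing are assumptions made for this example, not the filter used in ipfs. A negative answer is only definitive if every stored block has been added to the filter.

```go
package main

import "hash/fnv"

// bloom is a small illustrative bloom filter (hypothetical, not the ipfs one).
type bloom struct {
	bits []uint64
	m    uint64 // number of bits, must be > 0
	k    uint64 // number of probes per key, must be > 0
}

func newBloom(mBits, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (mBits+63)/64), m: mBits, k: k}
}

// hashes derives two base hashes; probe i uses h1 + i*h2 (double hashing).
func hashes(key string) (uint64, uint64) {
	a := fnv.New64a()
	a.Write([]byte(key))
	b := fnv.New64()
	b.Write([]byte(key))
	return a.Sum64(), b.Sum64() | 1 // force h2 odd so it is never zero
}

// Add sets the k bits selected for key.
func (b *bloom) Add(key string) {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if key was definitely never added.
func (b *bloom) MayContain(key string) bool {
	h1, h2 := hashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

// filteredHas consults the filter first. A negative answer skips the disk;
// a positive answer may be a false positive, so the real store is still checked.
func filteredHas(f *bloom, diskHas func(string) bool, key string) bool {
	if !f.MayContain(key) {
		return false
	}
	return diskHas(key)
}
```

When most Has calls are for blocks that exist, nearly every call falls through to `diskHas` and the filter saves little.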

@kevina
Contributor

kevina commented Jun 20, 2016

@whyrusleeping @Kubuxu do we have any stats on the number of Has() requests on the gateway nodes that return false? With bitswap, negative Has() requests would seem like a common thing.

What I am proposing that might help is to cache Has() itself at the datastore level with a bloom filter that can be kept complete by maintaining a copy on disk. I actually know very little about bloom filters, so I have no idea if this is practical; consider this a question. Is this something worth considering? I also have no idea if the cost of the bloom filter will outweigh the cost of the negative stat system calls used to check if a file exists in the flatfs datastore. Thoughts?

@Kubuxu
Member

Kubuxu commented Jun 20, 2016

Yes, we plan on making a bloom filter with an LRU cache behind it. The bloom filter would have to be rebuilt on GC, but that isn't a problem, as the implementation of the bloom filter I am working on now requires <100ns per insertion and <80ns per check.

The bloom filter itself should be quite small.
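As a rough feel for "quite small", the textbook sizing estimate for a bloom filter can be sketched as below. The function name `bloomSize` and the example numbers are assumptions chosen for illustration; they are not figures from this thread or from the implementation mentioned above.

```go
package main

import "math"

// bloomSize returns the number of bits and probes suggested by the standard
// estimate m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2, for n items and a
// target false-positive rate p. It returns (0, 0) for invalid input.
func bloomSize(n uint64, p float64) (mBits uint64, k uint64) {
	if n == 0 || p <= 0 || p >= 1 {
		return 0, 0
	}
	m := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	probes := math.Round(m / float64(n) * math.Ln2)
	if probes < 1 {
		probes = 1
	}
	return uint64(m), uint64(probes)
}
```

Under these assumptions, one million blocks at a 1% false-positive rate come to about 9.6 million bits (roughly 1.2 MB) with 7 probes per key.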

@Kubuxu Kubuxu added kind/enhancement A net-new feature or improvement to an existing feature topic/perf Performance and removed help wanted Seeking public contribution on this issue labels Jun 20, 2016
@whyrusleeping
Member Author

We can check the total number of Has requests, but we don't know whether they return true or false.
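One way to get that split would be to count hits and misses separately and publish them through Go's standard `expvar` package. This is a sketch only: the variable names `blockstore_has_hits` and `blockstore_has_misses` and the `countingHas` wrapper are hypothetical and do not exist in ipfs.

```go
package main

import "expvar"

// Hypothetical counter names, published alongside other expvars.
var (
	hasHits   = expvar.NewInt("blockstore_has_hits")
	hasMisses = expvar.NewInt("blockstore_has_misses")
)

// countingHas wraps a Has function and records whether each successful
// call returned true or false. Calls that fail with an error are not counted.
func countingHas(has func(key string) (bool, error)) func(key string) (bool, error) {
	return func(key string) (bool, error) {
		ok, err := has(key)
		if err != nil {
			return ok, err
		}
		if ok {
			hasHits.Add(1)
		} else {
			hasMisses.Add(1)
		}
		return ok, nil
	}
}
```

The ratio of misses to total calls would indicate how much a bloom filter in front of Has could save.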

@whyrusleeping whyrusleeping added this to the Ipfs 0.4.3 milestone Jun 21, 2016
@Kubuxu Kubuxu added the status/in-progress In progress label Jun 23, 2016
@Kubuxu Kubuxu removed their assignment Aug 26, 2016
3 participants