
Ipfs Performance Tweaks #404

Closed
wants to merge 1 commit into from

Conversation

whyrusleeping
Copy link
Member

This PR will be the staging ground for IPFS performance fixes. It will be nice to have all of these changes in a single place so we can easily test the performance of this branch against master.
My list of hopeful features is:

  • Blockstore agent
    • A blockstore agent would act to serialize access to the datastore, avoiding the lock death we've seen from leveldb
  • make handleOutgoingMessage not a goroutine
    • Or gate the number of goroutines with a semaphore
  • make an agent for the muxer's bandwidth accounting
    • instead of locking around the bandwidth variables, send updates over a channel to a receiver loop that keeps track of them (a minimal sketch follows this list)
  • Batch up bitswap wantlist sends
    • send wanted blocks in order, N blocks at a time; this will make streaming more likely to happen
    • example: if 100 blocks are requested, only add 10 to the wantlist at a time, so that blocks arrive in roughly file order and we can stream. Otherwise we risk receiving the blocks in (worst case) reverse order and being unable to stream.
  • msgio fixes
    • sink Pool into Reader (no max size, preallocated buffers issue)
    • fix func (rw *ReadWriter_) Close()
    • r&w threadsafety
  • take a look at levelDB's performance
    • Compare to other K/V stores.
  • profile for number of active goroutines at any given time
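
For the bandwidth-accounting item, here is a minimal sketch of the channel-based agent idea, assuming hypothetical names (bwAgent, Record, Totals) rather than the actual muxer code: updates go over a channel to a single receiver loop that owns the counters, so nothing needs to lock around the bandwidth variables.

```go
package main

import "fmt"

// bwUpdate carries one accounting event (bytes received / sent).
type bwUpdate struct {
	in, out uint64
}

// bwAgent owns the counters; only its loop goroutine ever touches them.
type bwAgent struct {
	updates chan bwUpdate
	queries chan chan [2]uint64
}

func newBWAgent() *bwAgent {
	a := &bwAgent{
		updates: make(chan bwUpdate, 64), // small buffer so callers rarely block
		queries: make(chan chan [2]uint64),
	}
	go a.loop()
	return a
}

// loop is the receiver loop: it applies updates and answers queries, so no
// mutex is needed around totalIn/totalOut.
func (a *bwAgent) loop() {
	var totalIn, totalOut uint64
	for {
		select {
		case u := <-a.updates:
			totalIn += u.in
			totalOut += u.out
		case resp := <-a.queries:
			resp <- [2]uint64{totalIn, totalOut}
		}
	}
}

// Record would be called from the muxer's read/write paths.
func (a *bwAgent) Record(in, out uint64) { a.updates <- bwUpdate{in, out} }

// Totals asks the loop for the counters it has accumulated so far.
func (a *bwAgent) Totals() (in, out uint64) {
	resp := make(chan [2]uint64)
	a.queries <- resp
	t := <-resp
	return t[0], t[1]
}

func main() {
	agent := newBWAgent()
	agent.Record(1024, 512)
	in, out := agent.Totals()
	fmt.Println("in:", in, "out:", out)
}
```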

@whyrusleeping whyrusleeping added the status/in-progress In progress label Dec 5, 2014
@jbenet
Copy link
Member

jbenet commented Dec 5, 2014

A blockservice agent would act to serialize access to the datastore, avoiding the lock death we've seen from leveldb

Are the leveldb locks actually more expensive than what this agent would do (either sync.Mutex or channels)? How granular are the locks? Full serialization is not what we want; we want to do as much as we possibly can in parallel.

make an agent for the muxer's bandwidth accounting

What does this one mean?

Batch up bitswap wantlist sends

Can you expand on this one too? It could mean a few different approaches, and I want to make sure we're on the same page.

msgio fixes:

  • sink Pool into Reader (no max size, preallocated buffers issue)
  • fix func (rw *ReadWriter_) Close()
  • r&w threadsafety

take a look at levelDB's performance

vague.

profile for number of active goroutines at any given time

+1

@whyrusleeping
Copy link
Member Author

My goal with the blockstore agent isn't so much to serialize the disk writes, but to avoid the goroutine buildup; at one point I had 3000 goroutines trying to write to the datastore.

@jbenet
Copy link
Member

jbenet commented Dec 5, 2014

My goal with the blockstore agent isn't so much to serialize the disk writes, but to avoid the goroutine buildup; at one point I had 3000 goroutines trying to write to the datastore.

What do you have in mind? The goroutines are created by clients, and we need to make sure the writes complete (consistency). Not sure we're talking about the same things.

@whyrusleeping
Copy link
Member Author

Hrm... thinking about it more, maybe that wouldn't help anything.

@btc
Copy link
Contributor

btc commented Dec 6, 2014

Before we begin a large performance effort, what specific concrete performance issues are we currently facing?

We may want to consider looking at network reliability (i.e. reachability, message delivery) before setting out to increase throughput and reduce latency.

send wanted blocks in order, N blocks at a time, this will make streaming more likely to happen

  1. This has a non-trivial adverse effect on latency.
  2. The local node has no actual control over the behavior of the network. With the proposed behavior, the contract provided to the client is no stronger than the current behavior.

Would this be more reliably addressed by re-ordering the blocks locally as a stage in the response pipeline?
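
To make the re-ordering suggestion concrete, a rough sketch (hypothetical types, not the actual bitswap pipeline) of a stage that buffers out-of-order blocks and releases them in sequence:

```go
package main

import "fmt"

// indexedBlock pairs a block with its position in the file.
type indexedBlock struct {
	index int
	data  []byte
}

// reorder reads blocks in whatever order the network delivers them and emits
// them strictly in index order, holding gaps in a map until they are filled.
func reorder(in <-chan indexedBlock, out chan<- []byte) {
	pending := make(map[int][]byte)
	next := 0
	for blk := range in {
		pending[blk.index] = blk.data
		for data, ok := pending[next]; ok; data, ok = pending[next] {
			out <- data
			delete(pending, next)
			next++
		}
	}
	close(out)
}

func main() {
	in := make(chan indexedBlock)
	out := make(chan []byte)
	go reorder(in, out)
	go func() {
		for _, i := range []int{2, 0, 1} { // simulate out-of-order arrival
			in <- indexedBlock{index: i, data: []byte{byte('a' + i)}}
		}
		close(in)
	}()
	for data := range out {
		fmt.Printf("%s", data) // prints "abc"
	}
	fmt.Println()
}
```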

My goal with the blockstore agent isn't so much to serialize the disk writes, but to avoid the goroutine buildup; at one point I had 3000 goroutines trying to write to the datastore.

Async disk writes, as @jbenet mentioned, inhibit consistency.

The root cause of this was overzealous goroutine generation in the network layer.

Don't we just need to apply backpressure when producers are working faster than consumers?
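
As a generic illustration of that point (not go-ipfs code): rather than spawning an unbounded goroutine per message, producers block on a bounded work queue drained by a fixed number of consumers, so the write path slows the network path down instead of piling up goroutines.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	work := make(chan []byte, 16) // bounded queue: sends block once it is full
	var wg sync.WaitGroup

	// A fixed pool of consumers, standing in for datastore writers.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for blk := range work {
				time.Sleep(time.Millisecond) // pretend this is a slow disk write
				_ = blk
			}
		}()
	}

	// The producer, standing in for the network layer, is throttled for free:
	// each send blocks until a consumer makes room in the queue.
	for i := 0; i < 100; i++ {
		work <- []byte{byte(i)}
	}
	close(work)
	wg.Wait()
	fmt.Println("all writes drained")
}
```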

take a look at levelDB's performance

We recently addressed LevelDB's write compaction issues. Afaik, this is no longer a bottleneck. Has something changed?

@whyrusleeping
Copy link
Member Author

We are currently completely unable to transfer a file larger than around 50MB due to quickly running out of memory.

Re: bitswap request batching: if done properly, it should significantly improve latency. Currently, no order is implied in the requesting of a block, so the time until you have usable data is very unpredictable, which in my opinion is very bad. If your bandwidth is at most X blocks per second, it won't make a difference to the overall speed of the operation if you only request X blocks at a time; I would prefer to receive blocks towards the front of the file first so we can stream, as opposed to random blocks throughout the file (rough sketch below).
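
A rough sketch of that windowed requesting, using hypothetical helpers rather than the real bitswap API: keep at most `window` keys outstanding, always taken from the front of the remaining list, and top the window up as blocks arrive.

```go
package main

import "fmt"

// fetchWindowed requests keys in order, `window` at a time. `want` stands in
// for adding a key to the wantlist; `received` delivers keys whose blocks
// have come back.
func fetchWindowed(keys []string, window int, want func(string), received <-chan string) {
	next, outstanding := 0, 0
	for next < len(keys) && outstanding < window {
		want(keys[next])
		next++
		outstanding++
	}
	for outstanding > 0 {
		<-received // a block we asked for has arrived
		outstanding--
		if next < len(keys) {
			want(keys[next]) // keep the window full, still in file order
			next++
			outstanding++
		}
	}
}

func main() {
	keys := []string{"k0", "k1", "k2", "k3", "k4"}
	received := make(chan string, len(keys))
	want := func(k string) {
		fmt.Println("want:", k)
		received <- k // simulate the block arriving immediately
	}
	fetchWindowed(keys, 2, want, received)
}
```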

As for the levelDB issues, I don't believe that was ever actually addressed; I think we just stopped hitting it so hard. I think what I'm seeing is network speeds greater than levelDB write speeds causing a chokepoint.

@btc
Copy link
Contributor

btc commented Dec 6, 2014

As for the levelDB issues, I don't believe that was ever actually addressed,

Yeah they were. The solution ended up being pretty simple. The culprit was unnecessary snappy compression during write compaction.

ff490a6

I think what I'm seeing is network speeds greater than levelDB write speeds causing a chokepoint

It's that producers on the network side are spewing goroutined requests up to bitswap faster than bitswap can retrieve from disk. This is the fault of the network layer for not responding to back-pressure.

@whyrusleeping
Copy link
Member Author

How was the compaction issue addressed?

@btc
Copy link
Contributor

btc commented Dec 6, 2014

How was the compaction issue addressed?

Compaction was a red herring. Compression made compaction take too long, and when compaction took too long, writes backed up.

We blamed it on compaction because it was an easy target and the internet seemed to back up that hypothesis, but the root cause was expensive compression during compaction.
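
For reference, the knob involved is goleveldb's compression option; here is a minimal sketch of opening a store with snappy compression turned off (this only shows the mechanism, not the contents of ff490a6):

```go
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
	// Disable snappy compression so write compaction does not pay for it.
	db, err := leveldb.OpenFile("/tmp/example.ldb", &opt.Options{
		Compression: opt.NoCompression,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.Put([]byte("key"), []byte("value"), nil); err != nil {
		log.Fatal(err)
	}
}
```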

We are currently completely unable to transfer a file larger than around 50MB due to quickly running out of memory.

What's our best guess as to why this is happening?

@btc
Copy link
Contributor

btc commented Dec 6, 2014

no order is implied in the requesting of a block

Keys at the front/head of the wantlist are deemed higher priority; priority is expressed through the order of the wantlist.

@whyrusleeping
Copy link
Member Author

So, how was the "not compaction" issue fixed?

@btc
Copy link
Contributor

btc commented Dec 6, 2014

So, how was the "not compaction" issue fixed?

I linked to it in a previous comment. Didn't realize you didn't see the link. ff490a6

@whyrusleeping
Copy link
Member Author

Ah, I totally missed that the link was the fix. As for the wantlist being high priority at the front, our wantlist is a map, which has random order.

@btc
Copy link
Contributor

btc commented Dec 6, 2014

The wantlist should be a hybrid that uses both a slice and a map: the map to prevent duplicates, the slice to preserve ordering. This is done in the BitSwapMessage:

https://github.com/jbenet/go-ipfs/blob/master/exchange/bitswap/message/message.go#L46

our wantlist is a map, which has random order.

Oops, this is a bug.
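
A sketch of that slice-plus-map hybrid with illustrative names (the real structure is the one linked above in message.go): the map de-duplicates, the slice preserves insertion order so priority survives.

```go
package main

import "fmt"

// Wantlist keeps a set for de-duplication and a slice for ordering.
type Wantlist struct {
	set  map[string]struct{} // prevents duplicate entries
	keys []string            // preserves the order keys were added in
}

func NewWantlist() *Wantlist {
	return &Wantlist{set: make(map[string]struct{})}
}

// Add appends a key only if it has not been seen before.
func (w *Wantlist) Add(k string) {
	if _, ok := w.set[k]; ok {
		return
	}
	w.set[k] = struct{}{}
	w.keys = append(w.keys, k)
}

// Keys returns the wanted keys in insertion order, highest priority first.
func (w *Wantlist) Keys() []string {
	return append([]string(nil), w.keys...)
}

func main() {
	w := NewWantlist()
	for _, k := range []string{"block-a", "block-b", "block-a", "block-c"} {
		w.Add(k)
	}
	fmt.Println(w.Keys()) // [block-a block-b block-c]
}
```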

@jbenet
Copy link
Member

jbenet commented Dec 6, 2014

@whyrusleeping wrote

We are currently completely unable to transfer a file larger than around 50MB due to quickly running out of memory.

Is this recent? I've done that many times before. It's 1GB+ that has been failing for me.

@maybebtc wrote

This is the fault of the network layer for not responding to back-pressure.

This is an important issue. It was just addressed, though, yeah?


FWIW, I think we all agree here that we want to make sure to measure things correctly and outline the perf issues before optimizing blindly. I think @whyrusleeping has good instincts he's following, and @maybebtc is right to want measurements before diving into something. Particularly since we have very little time to release this (I'll write at length about this tomorrow, but we're shipping this before the end of the year). Let's make sure we have concrete, measured bottlenecks before jumping into things (like memory usage and concurrency issues).

@whyrusleeping
Copy link
Member Author

@maybebtc for the wantlist, we should probably use that same wantlist object for the wantlists in ledger.

@jbenet I've been able to get a file around 50MB through over WAN, but above 100MB things fail. I think most of the failures will be fixed with the msgio changes, but I'm not super confident in that yet. We will see once that lands.

@btc
Copy link
Contributor

btc commented Dec 6, 2014

@maybebtc for the wantlist, we should probably use that same wantlist object for the wantlists in ledger.

Indeed. Low-pri, yeah.

@jbenet I've been able to get a file around 50MB through over WAN, but above 100MB things fail. I think most of the failures will be fixed with the msgio changes, but I'm not super confident in that yet. We will see once that lands.

A little over a month ago, I transferred 1.5GB to cryptix. This is an interesting regression. We may want to create an integration test that performs a large file transfer.

@jbenet jbenet self-assigned this Dec 6, 2014
@jbenet jbenet mentioned this pull request Dec 6, 2014
@whyrusleeping
Copy link
Member Author

Closing; the concerns have mostly been addressed and this PR is no longer needed.

@whyrusleeping whyrusleeping removed the status/in-progress In progress label Dec 7, 2014
@Kubuxu Kubuxu deleted the perf/ipfs branch February 27, 2017 20:40