Dag import functionality only ( silent / no CLI progress ) #7038
This PR is now retargeted against the export branch to make diff reading easier. Sharness tests are complete, with a few more "pathological cases" pending for a subsequent PR.
Force-pushed from 29a4ef8 to 5232d39.
:scratchhead:
Locally the full sharness passes...
One needs to run
Sharness is now properly fixed. I ended up running into another known mis-design along the way: @Stebalien ready for final review, I will re-add the progress stuff once you are happy with the tests / code / etc. I would also like to hear from @mikeal if he has strong(er) objections to the current general design of the importer: #7038 (comment)
core/commands/dag/dag.go
Outdated
if 0 == blockCount%200 {
	// work around https://github.com/ipfs/go-ds-flatfs/issues/36 for the time being
	// batch up-to 200 at a time ( fly under MacOS's default of 256 )
	if err := batch.Commit(); err != nil {
Did you actually run into this problem? ipfs add has the same behavior.
Looking at the code, this is probably because we usually test with the daemon (which raises the file descriptor limit to 8192). Let's not hack a fix in here.
Also note: we can reduce the batch size when constructing the batch.
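A minimal sketch of the idea, assuming a batch whose size is fixed at construction time and which commits itself when full. The `toyBatch` type, its `maxNodes` field, and its methods are hypothetical stand-ins for illustration, not the go-ipld-format API:

```go
package main

import "fmt"

// toyBatch is a hypothetical stand-in for a block batch.
// It is NOT the go-ipld-format type; it only models the size cap.
type toyBatch struct {
	maxNodes int      // cap chosen when the batch is constructed
	pending  []string // blocks buffered since the last commit
	commits  int      // number of commits performed so far
}

// Add buffers one block and commits automatically once the cap is reached.
func (b *toyBatch) Add(cid string) {
	b.pending = append(b.pending, cid)
	if len(b.pending) >= b.maxNodes {
		b.Commit()
	}
}

// Commit writes out whatever is buffered; an empty batch is a no-op.
func (b *toyBatch) Commit() {
	if len(b.pending) == 0 {
		return
	}
	b.commits++
	b.pending = b.pending[:0]
}

func main() {
	b := &toyBatch{maxNodes: 200}
	for i := 0; i < 450; i++ {
		b.Add(fmt.Sprintf("block-%d", i))
	}
	b.Commit() // flush the final partial batch
	fmt.Println("commits:", b.commits)
}
```

With the cap living in the batch, the caller no longer needs a `blockCount%200` check in the import loop.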
Yes, I have run into it. Part of the sharness tests runs without a daemon ( as would be seen in the wild in many cases ). If you remove this block, that test will fail.
I didn't realize the batch size can be controlled, will push a fix momentarily.
Welp, the batch size can be controlled, and is currently defaulting to 128. But the batch application is carried out async with ParallelBatchCommits concurrent jobs. Combined together they blow through the ulimit, since blocks come in at cid-decoding-speed.
Since this is a global variable, the only reasonable way to flip it would be to muck with atomic, and that doesn't sound right.
How should we fix this? Add an option-defaulting-to-0 to ignore the global value of ParallelBatchCommits? Not using a batch at all?
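To make the interaction concrete, here is a rough back-of-the-envelope sketch. It assumes that every block in an in-flight batch can hold one file descriptor at the same time (the flatfs behaviour the workaround refers to); the function name and the sample values for the number of parallel commits are illustrative only:

```go
package main

import "fmt"

// worstCaseInFlight returns an upper bound on the number of blocks being
// written at once when up to `parallel` batches of `batchSize` blocks
// are committed concurrently.
func worstCaseInFlight(batchSize, parallel int) int {
	return batchSize * parallel
}

func main() {
	const fdLimit = 256 // the macOS default mentioned in the workaround comment
	for _, parallel := range []int{1, 2, 8} {
		n := worstCaseInFlight(128, parallel)
		fmt.Printf("parallel=%d in-flight=%d over-limit=%v\n", parallel, n, n > fdLimit)
	}
}
```

Under that assumption a batch size of 128 stays under the limit only while very few commits run in parallel, which is why capping the batch size alone does not help.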
My real concern here is that these are the default limits we're using for ipfs add as well. Given that, we should consider reducing the default batch size (or just fixing flatfs).
I know why...
- We flush when we finish adding each individual file.
- The default chunk size is 256KiB.
- The maximum amount of data we can buffer is 8MiB * NCPU * 2.
That means we can end up with (8192*NCPU*2)/256 outstanding blocks (usually ~256).
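The estimate above can be checked with a few lines of arithmetic. This sketch takes 8 MiB as 8192 KiB and divides by the 256 KiB chunk size; the constant and function names are made up for illustration:

```go
package main

import "fmt"

const (
	bufferKiB = 8 * 1024 // 8 MiB buffer unit, expressed in KiB
	chunkKiB  = 256      // default chunk size in KiB
)

// outstandingBlocks estimates how many chunk-sized blocks fit in the
// maximum buffered data of 8 MiB * NCPU * 2.
func outstandingBlocks(ncpu int) int {
	return bufferKiB * ncpu * 2 / chunkKiB
}

func main() {
	for _, ncpu := range []int{2, 4, 8} {
		fmt.Printf("NCPU=%d -> %d outstanding blocks\n", ncpu, outstandingBlocks(ncpu))
	}
}
```

The "usually ~256" figure corresponds to a 4-core machine; the bound grows linearly with the core count.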
Regardless, we should fix the batch: ipfs/go-ipld-format#56.
Then we should benchmark and see how this affects add performance on flatfs.
Do we have prior art for benchmarking? A typical workstation OS with an SSD will not display a difference for obvious reasons.
Unfortunately, no. However, I seem to remember a pretty dramatic difference when messing with these parameters last time, even on an SSD.
That's ok, I'll improvise. Finishing up something else at the moment, will likely not look into this until way later today. I mean tomorrow
Ok, sharness passes now with no ulimit fixes and no workarounds
Had to pull in ipfs/go-merkledag#53 and ipfs/go-ipld-format#55 in order to benefit from ipfs/go-ipld-format#56
Initial testing looks good, will have the final number tomorrow morning, day got too long
I've rebased feat/carfile-export-only on master. I was planning on then rebasing this branch on feat/carfile-export-only so I could review the changes, but they appear to have divergent histories.
I've released a new go-ipld-format that reverts the ErrNotFound changes: https://github.com/ipfs/go-ipld-format/releases/tag/v0.2.0.
Export was merged into it, hence git getting confused. I generally avoid force-pushes during review exactly for this reason, but no biggie. Something is wrong with sharness locally, which is why I haven't pushed yet. Any moment now...
Force-pushed from acf03d4 to ec08abe.
	ret.PinErrorMsg = err.Error()
} else if err := node.Pinning.Pin(req.Context, nd, true); err != nil {
	ret.PinErrorMsg = err.Error()
} else if err := node.Pinning.Flush(req.Context); err != nil {
I'd prefer to flush at most once at the end, but let's leave it this way for now. We'll have to think about the safety implications.
Actually you told me I need to pin+flush like that, originally I had it at the end... :)
But yeah, agreed we should punt, knowing more dark corners of the pin infra now. I'll open an issue in a bit referencing this.
I did? I wonder why I said that...
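For reference, the two orderings discussed in this thread can be sketched as follows. `countingPinner` is a hypothetical stand-in that only counts calls; it is not the go-ipfs pinner interface, and the sketch says nothing about the safety implications mentioned above:

```go
package main

import "fmt"

// countingPinner is a hypothetical stand-in for the node's pinner.
type countingPinner struct {
	pins, flushes int
}

func (p *countingPinner) Pin(root string) error { p.pins++; return nil }
func (p *countingPinner) Flush() error          { p.flushes++; return nil }

// pinAndFlushEach mirrors the hunk above: pin a root, then flush right away.
func pinAndFlushEach(p *countingPinner, roots []string) error {
	for _, r := range roots {
		if err := p.Pin(r); err != nil {
			return err
		}
		if err := p.Flush(); err != nil {
			return err
		}
	}
	return nil
}

// pinThenFlushOnce pins every root first and flushes a single time at the end.
func pinThenFlushOnce(p *countingPinner, roots []string) error {
	for _, r := range roots {
		if err := p.Pin(r); err != nil {
			return err
		}
	}
	return p.Flush()
}

func main() {
	roots := []string{"rootA", "rootB", "rootC"}

	a := &countingPinner{}
	_ = pinAndFlushEach(a, roots)
	b := &countingPinner{}
	_ = pinThenFlushOnce(b, roots)

	fmt.Println("flush per root:", a.flushes, "flush once:", b.flushes)
}
```

The per-root variant persists pin state after each root, while the single-flush variant does less work but leaves every pin unpersisted until the end.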
LGTM except for the "root seen" flag. It's much easier to remove it now and add it (or something like it) back later. IMO, we'd be better off with
Force-pushed from ec08abe to a9c8a23.
This still works over "loosely defined" .car files. Please refer to the sharness tests for extra info. We can tighten this up if the sentiment is "Postel was wrong".
Force-pushed from a9c8a23 to f1ecf33.
Force-pushed to lose the 3mb of testdata from the commit chain. Addressing the other notes in separate commits shortly.
@Stebalien the only new thing to review is eff4223. Everything else pushed is sharness cruft.
🚀