Implement unixfs sharding #3042
Note: tests won't pass yet as I haven't vendored some of the new libs. I haven't quite decided between willf/bitset and just using the
74a6117 to 259aeb4
Going with the stdlib: fewer imports, and the stdlib is proven to be pretty stable. It's slightly less efficient on some operations, but we can optimize later if that becomes an issue.
@whyrusleeping, I am afraid I do not know anything about sharding, so I will have to first learn about that before I am of any use. Doing a quick search I discovered ipfs/notes#76, is there anything else that would help with context for this pull request?
unixfs/hamt/hamt.go (Outdated)
)

const (
	HashMurmur3 uint64 = iota
remember that this is configurable on the object
it will likely use multihash if this proposal goes through: multiformats/multihash#30
Yeah, I'd love to use an 'official' multihash table for this.
@whyrusleeping this is a very big PR that will be critical to review and merge in stages. If this has any bugs, it will be serious trouble and will take weeks of work to find and debug. Please:
@kevina if you want to review the algorithm for the HAMT data structure and use that to check my implementation, that would be very helpful.
@jbenet I moved the HAMT code into its own repo and added a larger test to verify that operation order doesn't change the resultant structure: https://github.com/whyrusleeping/unixfs-hamt the
918e24b to 6c31225
Thanks, I will review this year
d5d3b21 to 5ae869f
c8f0fa0 to 10fb7a7
@whyrusleeping @jbenet mentioned that it would be better to rip out all of unixfs into its own repo, and this might actually be extremely beneficial for pointing several people at something isolated they can review fully. Also, I think the best review I can do is to take this implementation, write it in JavaScript, and possibly write the spec for it.
@diasdavid I had it pulled out at one point, but it was really just for the sake of 'putting the code somewhere else to look at it'. Actually extracting this code is difficult because it would require extracting a large number of other go-ipfs packages.
4f84cfa to 07683a7
Rebased on latest master (1a365a8). @diasdavid @dignifiedquire The actual HAMT code is isolated in this single file: https://github.com/ipfs/go-ipfs/blob/07683a7d9a6e83c859c84e75b2eb47d2aee79aa7/unixfs/hamt/hamt.go All the review I'm looking for is just that file. The rest of the changes are just integration of that up through the stack, and I'm not nearly as concerned about that part of the changeset.
unixfs/hamt/hamt.go (Outdated)
	mask := new(big.Int).Sub(new(big.Int).Exp(big.NewInt(2), big.NewInt(int64(bp)), nil), big.NewInt(1))
	mask.And(mask, ds.bitfield)

	return popCount(mask)
I'm not entirely clear on how this function works or what exactly its purpose is. In turn, I'm having a hard time verifying that the bit operations do what they are supposed to do. It would be great to have a comment about what the function does and an explanation of why the bit ops do what they do.
Not the popCount, that one is clear, but rather the whole indexForBitPos.
License: MIT Signed-off-by: Jeromy <[email protected]>
14e7424 to e876434
Wow, rebased. Is this coming in soon (TM)?
774715f to f72c37e
@pgte Anything else we need to finish before this goes forward? I think we've agreed on not setting filesizes on the shards.
@whyrusleeping Correct, I don't think we need anything else.
Apart from one removed test: #3042 (comment)
f72c37e to c4c6653
mfs/dir.go (Outdated)
	var out []NodeListing
	err := d.ForEachEntry(context.TODO(), func(nl NodeListing) error {
I shouldn't have missed that.
07da7ad to 65b9716
Choo Choo! Here comes the merge train! 🚋🚋🚋🚋🚋🚋🚋🚋
This PR implements a hash array mapped trie using merkledag nodes as a base layer for sharded unixfs directories.
This changeset is broken up into three distinct parts:
- […] math/big.Int
- […] Node object to using a dirbuilder
- ipfs add and friends can now create sharded directories.

Thorough review will be very appreciated!
cc @Kubuxu @jbenet @kevina @diasdavid @dignifiedquire
also cc @haadcode as you might like to review the datastructure logic
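For reviewers new to the data structure named in the summary, the following is a generic sketch of how a hash array mapped trie picks a child slot at each level: the entry name is hashed once, and each level of the trie consumes a fixed number of bits from that hash. Everything specific here is an assumption for illustration and not taken from this PR: FNV-1a stands in for the hash function because it is in the stdlib (the quoted code names `HashMurmur3`), the fanout of 256 (8 bits per level) is arbitrary, and low-order bits are consumed first.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bitsPerLevel is an assumed value: 8 bits per level gives 256 slots per shard.
const bitsPerLevel = 8

// hashKey hashes an entry name. FNV-1a is a stand-in chosen because it is in
// the standard library; it is not the hash used by the PR.
func hashKey(name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(name))
	return h.Sum64()
}

// slotAtLevel returns the child slot that the hash selects at the given depth,
// consuming bitsPerLevel bits per level starting from the low-order end.
func slotAtLevel(hash uint64, level int) int {
	shift := uint(level * bitsPerLevel)
	return int((hash >> shift) & (1<<bitsPerLevel - 1))
}

func main() {
	h := hashKey("readme.txt")
	for level := 0; level < 3; level++ {
		fmt.Printf("level %d -> slot %d\n", level, slotAtLevel(h, level))
	}
}
```

Because the slot sequence depends only on the name's hash, the same set of names should yield the same trie shape regardless of insertion order, which is the property the larger test mentioned earlier in the thread is meant to verify.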