Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for using CidV1 in 'ipfs add' #3743

Merged
merged 7 commits into from
Apr 27, 2017
Merged

Conversation

kevina
Copy link
Contributor

@kevina kevina commented Mar 3, 2017

Closed #3699

License: MIT
Signed-off-by: Kevin Atkinson [email protected]

@kevina kevina added the status/in-progress In progress label Mar 3, 2017
@kevina
Copy link
Contributor Author

kevina commented Mar 3, 2017

@whyrusleeping I am not sure I am happy with this

What I described in #3699 (comment) was only sufficient for adding files with CidV1. It did not handle symbolic links or directories.

In order to handle directories I needed to touch the mfs package. I added a cidVer parameter to mfs.Mkdir and added a SetCidVersion method to the Directory class.

To make sure the root node was CidV1 I decided it was best to pass in the Cid version when creating the adder rather then setting it later.

The changes to the merkeldag package are not strictly necessary, but it seams the best way to set the Cid version of a ProtoNode.

@kevina
Copy link
Contributor Author

kevina commented Mar 3, 2017

@whyrusleeping one question: Do you think --cid-version=1 should imply --raw-leaves? I think it should and would be one less combination to test for.

@whyrusleeping
Copy link
Member

@kevina hrm... I think setting cid version to 1 could probably change the default value for raw leaves to true, but i think people should still be able to set raw leaves to false and cid version to 1. Even if we don't allow it through the command line people could manually construct graphs that were that way

@kevina kevina force-pushed the kevina/cidv1-add branch 2 times, most recently from 65ea7f3 to 33c65ba Compare March 3, 2017 22:52
@kevina kevina changed the title WIP: adder: add support for using CidV1 adder: add support for using CidV1 Mar 3, 2017
@kevina
Copy link
Contributor Author

kevina commented Mar 3, 2017

@whyrusleeping okay, i think this is ready for review

@@ -78,6 +79,7 @@ You can now refer to the added file in a gateway, like so:
cmds.StringOption(chunkerOptionName, "s", "Chunking algorithm to use."),
cmds.BoolOption(pinOptionName, "Pin this object when adding.").Default(true),
cmds.BoolOption(rawLeavesOptionName, "Use raw blocks for leaf nodes. (experimental)"),
cmds.IntOption(cidVersionOptionName, "Cid version. (experimental)").Default(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should mention enables raw blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but got stuck on how to state in concisely suggestion welcome.

Copy link
Member

@Kubuxu Kubuxu Mar 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CID Version. Non-zero value will change default of 'rawleaves' to true.

Copy link
Member

@Kubuxu Kubuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the fact that now every MFS API function will have to take cidversion argument. We will have to either live with it or think about it.

if err != nil {
res.SetError(err, cmds.ErrNormal)
return
}

outChan := make(chan interface{}, 8)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract this constant, maybe use the same as AllKeysChan or some other/new but I would prefer it extracted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not new code I just moved things around a little. We can do this in a separate p.r.

Copy link
Member

@Kubuxu Kubuxu Mar 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would have taken 10s. General rule many of us try to strive for is to leave code in better shape than we've found it.

)

// CidVersion is a a cid version number that is guaranteed to be valid
type CidVersion struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make it type CidVersion int. It will be simpler to use and more idiomatic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also there isn't a reason it couldn't be just a constant?

type CidVerision int

const (
CID0 CidVersion = 0
CID1 CidVersion = 1
)

You can still have methods on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you went the way you did because as a way of preventing creation of different CID versions but IMO it should be done with helper function when CID version is from User Interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because nothing prevents me from doing CidVersion{2} either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because nothing prevents me from doing CidVersion{2} either way.

Um, if you do that outside the merkeldag package you will get

implicit assignment of unexported field 'version' in merkledag.CidVersion literal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to provide some additional type safety so once you have a CidVersion you don't need to return an error if the version is invalid as it has already been checked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, missed that. Ok, but I would still be for having variables aliasing usable values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alias added, although not currently used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note, I am not against doing the way you suggested.

I just want to be able to assume that once we have a CidVersion we can guarantee we have something valid and don't want to have to return an error (instead we can panic). I think it will be reasonable to assume that the user (in this case the programmer) won't do a cast from an int to to CidVersion and will use the helper function.

@whyrusleeping thoughts.

@Kubuxu
Copy link
Member

Kubuxu commented Mar 5, 2017

Also the code is not formatter right: https://ci.ipfs.team/blue/organizations/jenkins/go-ipfs/detail/PR-3743/5/pipeline

@@ -78,6 +79,7 @@ You can now refer to the added file in a gateway, like so:
cmds.StringOption(chunkerOptionName, "s", "Chunking algorithm to use."),
cmds.BoolOption(pinOptionName, "Pin this object when adding.").Default(true),
cmds.BoolOption(rawLeavesOptionName, "Use raw blocks for leaf nodes. (experimental)"),
cmds.IntOption(cidVersionOptionName, "Cid version. (experimental)").Default(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps this option should be a global option. (on the ipfs command, same places as the timout option).

Otherwise we will have to add this in (and maintain it) for each command that wants to be able to select the output cid version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but thought it would be simpler just to add it for the adder for now.

For example, should every command check for this flag if it is applicable, and if it is not supported yet error out?

rawblks, rawblksFound, _ := req.Option(rawLeavesOptionName).Bool()
cidVer, _, _ := req.Option(cidVersionOptionName).Int()

if cidVer >= 1 && !rawblksFound {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably check that cidVer is either zero or one. No other values are supported right now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is already done in the call to NewAdder.

@kevina
Copy link
Contributor Author

kevina commented Mar 6, 2017

I don't like the fact that now every MFS API function will have to take cidversion argument. We will have to either live with it or think about it.

Neither do I. It is only the MkDir call that needs an cidversion argument. I added it the MkDir call because it can create multiple nodes that all will need to be set to the correct Cid version. There are two additional options.

  1. Create a new function MkDirWCid and have MkDir call it.

  2. Walk the tree created and fix the Cid version after the fact. I see this as inefficient both in terms of speed and amount of code, but will avoid having to modify the MkDir call.

@whyrusleeping thoughs?

@@ -67,8 +67,15 @@ type AddedObject struct {
Bytes int64 `json:",omitempty"`
}

func NewAdder(ctx context.Context, p pin.Pinner, bs bstore.GCBlockstore, ds dag.DAGService) (*Adder, error) {
mr, err := mfs.NewRoot(ctx, ds, unixfs.EmptyDirNode(), nil)
func NewAdder(ctx context.Context, p pin.Pinner, bs bstore.GCBlockstore, ds dag.DAGService, v int) (*Adder, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets not change the parameters here. Instead, lets just add a SetCidVersion method to the adder.

@@ -33,6 +35,9 @@ type DagBuilderParams struct {
// instead of using the unixfs TRaw type
RawLeaves bool

// Prefix specifies cid version and hashing function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CidVersion only specifies the cid version, not the hashing function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, fixed. It was a leftover comment from an earlier attempt.

@@ -139,7 +155,11 @@ func (db *DagBuilderHelper) GetNextDataNode() (*UnixfsNode, error) {
raw: true,
}, nil
} else {
blk := NewUnixfsBlock()
blk := &UnixfsNode{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no NewUnixfsBlock anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. It was only used in one place so I just inlined the code. I removed the function to make sure I cached all possible places where it was called to make sure the correct Cid version was always set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want I can turn it into a method like I did for NewUnixfsNode. But since it was only used in one place I decided it wasn't worth the effort.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... just to maintain continuity with the previous methods. the whole 'raw' thing here is realllly just legacy code to prevent hashes from changing at this point

mfs/ops.go Outdated
@@ -104,7 +105,7 @@ func PutNode(r *Root, path string, nd node.Node) error {

// Mkdir creates a directory at 'path' under the directory 'd', creating
// intermediary directories as needed if 'mkparents' is set to true
func Mkdir(r *Root, pth string, mkparents bool, flush bool) error {
func Mkdir(r *Root, pth string, mkparents bool, flush bool, cidVer dag.CidVersion) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets have this be a field on the Root object, that way we can avoid changing the mfs APIs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Done.

Copy link
Member

@whyrusleeping whyrusleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid changing the existing apis on these things.
Other than those comments though, this LGTM.

@kevina kevina force-pushed the kevina/cidv1-add branch 3 times, most recently from 48d361f to 2ba6d8a Compare March 8, 2017 01:48
@kevina
Copy link
Contributor Author

kevina commented Mar 8, 2017

@whyrusleeping okay, this should be ready now

@kevina
Copy link
Contributor Author

kevina commented Mar 17, 2017

@whyrusleeping target this for 0.4.8? Or do you want to try for a global option instead?

@whyrusleeping whyrusleeping added this to the Ipfs 0.4.8 milestone Mar 18, 2017
@@ -33,6 +35,9 @@ type DagBuilderParams struct {
// instead of using the unixfs TRaw type
RawLeaves bool

// CID Version to use
CidVersion dag.CidVersion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of using a struct for this, we should just use a cid.Prefix. That way we make it much easier for us to switch the hash function in the future

Copy link
Member

@whyrusleeping whyrusleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Construction looks good to me, Just want to switch to setting the whole cid prefix instead of just the version. Will make future changes of this sort much easier on us

@kevina
Copy link
Contributor Author

kevina commented Mar 18, 2017

@whyrusleeping okay, changed to setting the prefix instead of just the CID version

}

func (n *ProtoNode) SetPrefix(prefix cid.Prefix) {
n.Prefix = prefix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets make sure to set n.Prefix.Codec = cid.DagProtobuf, just for sanitys sake

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

@kevina kevina Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it turned out not to be such a good idea...

@kevina
Copy link
Contributor Author

kevina commented Mar 25, 2017

So commit bb09ffd (implement an HAMT for unixfs directory sharding) complicated things. It should have a fix by the end of Monday.

@whyrusleeping
Copy link
Member

@kevina alright, i'm gonna bump this to 0.4.9 then. Thanks!

@whyrusleeping whyrusleeping modified the milestones: Ipfs 0.4.9, Ipfs 0.4.8 Mar 25, 2017
@kevina kevina force-pushed the kevina/cidv1-add branch 2 times, most recently from cef50ea to 5adb2a3 Compare March 27, 2017 04:20
@kevina
Copy link
Contributor Author

kevina commented Mar 27, 2017

@whyrusleeping so setting the prefix for the MFS root after the fact became problematic. Have a look at the commit "Fixes" and let me know what you think. This commit will be squashed but I kept it separate to make it easier to see the new changes.

}

func (adder *Adder) SetMfsRoot(r *mfs.Root) {
adder.mr = r
adder.mroot = r
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not changing the naming of things, but not a big deal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it back. I did this as a trick to make sure I cached where all instance mr was used. I thought about naming the method mr() but that seamed a bit terse for a method name, even if it was only for internal use.

if d.dirnode != nil {
d.dirnode.SetPrefix(prefix)
}
// FIXME: Should we do this? -- kevina
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should probably set the shards prefix info too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will do. Have something in a few days.

@whyrusleeping
Copy link
Member

@kevina update here?

@kevina
Copy link
Contributor Author

kevina commented Mar 30, 2017

@whyrusleeping Have something by the end of tomorrow, if not today.

@kevina
Copy link
Contributor Author

kevina commented Mar 31, 2017

@whyrusleeping sorry for the delay, this should be good to go. I will squash the "Fixes" commit once you are okay with my changes.

Copy link
Member

@whyrusleeping whyrusleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just needs a rebase 👍

@whyrusleeping
Copy link
Member

Oh, and could you fix the codeclimate issues?

License: MIT
Signed-off-by: Kevin Atkinson <[email protected]>
License: MIT
Signed-off-by: Kevin Atkinson <[email protected]>
License: MIT
Signed-off-by: Kevin Atkinson <[email protected]>
@kevina
Copy link
Contributor Author

kevina commented Apr 26, 2017

Rebased but tests are failing again, will fix.

@kevina
Copy link
Contributor Author

kevina commented Apr 26, 2017

But codeclimate is at least happy.

@kevina
Copy link
Contributor Author

kevina commented Apr 26, 2017

Some how "make deps" in the shareness directory was not rebuilding things correctly for me. In any case we a green and this is now ready to merge.

@kevina kevina added RFM and removed status/in-progress In progress labels Apr 26, 2017
@whyrusleeping whyrusleeping merged commit e5529cd into master Apr 27, 2017
@whyrusleeping whyrusleeping deleted the kevina/cidv1-add branch April 27, 2017 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants