Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[add] --nocopy --trickle; copies data and ignores trickle #4052

Closed
djdv opened this issue Jul 8, 2017 · 15 comments
Closed

[add] --nocopy --trickle; copies data and ignores trickle #4052

djdv opened this issue Jul 8, 2017 · 15 comments
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws)

Comments

@djdv
Copy link
Contributor

djdv commented Jul 8, 2017

Version information:

go-ipfs version: 0.4.10-00573cb2c
Repo version: 5
System version: amd64/windows
Golang version: go1.8.3

Type:

Bug

Severity:

Low-Medium

Description:

On my machine if I try to use --nocopy with the trickledag format it will still copy file blocks, as well as output a hash for the standard dag format, but even this behaviour is inconsistent. These are the cases I've tried and their results with a log below, the sample data is just 25M from udevrandom.

Nothing has been done yet, file is added with --nocopy
Result: As expected, file data is not copied to the datastore, standard hash returned
File has been added via --nocopy already, it is added again with --nocopy and -t
Result: File data is not copied, but standard dag hash returned, not trickle
Repo is cleaned, file is added with --nocopy and -t
Result: File data is copied, and standard dag hash returned
Repo is cleaned, file added with -t
Result: As expected, file data is copied to the datastore, trickle hash returned

In addition to this please look at the log below to see the blocks directory changing size, even after everything is removed it seems to have grown is size a few kb. I'm not sure if this is expected or not.
I haven't tested this elsewhere yet. I originally caught this after adding a lot of files like this add --nocopy -r -w -Q --chunker=rabin --pin=false --cid-version=1 --hash=Blake2b-256, -t was only provided conditionally (if more than half the files provided were multimedia files) so the arguments would turn into add --nocopy -r -w -Q --chunker=rabin --pin=false --cid-version=1 --hash=Blake2b-256 -t, I don't think any of the other arguments there affect --nocopy like this.

>cd %HOMEPATH%\.ipfs

>ipfs init
...

>ipfs config --json Experimental.FilestoreEnabled true

>cksum random
2009307347 26214400 random

>du -sh blocks
68K	blocks

>ipfs add --pin=false --nocopy random
added QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr random

>du -sh blocks
77K	blocks

>ipfs get --output=validate QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate

>ipfs add --pin=false --nocopy -t random
added QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr random

>du -sh blocks
77K	blocks

>ipfs get --output=validate QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate

>ipfs repo gc
removed QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
...
removed zb2rhoTbL7rcs2J8enZ5R54gijwPdLGGHS3JtpFw74vDw8MyZ

>du -sh blocks
55K	blocks

>ipfs add --pin=false --nocopy -t random
added QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr random

>du -sh blocks
26M	blocks

>ipfs get --output=validate QmYt8DnHxMa2Jo9mcoYMD4rqjobHqhjwTWS4HUXtxuVtMr
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate

>ipfs repo gc
removed zb2rhanQbgX9Q4uzrtDC5MtsukDRLBPLPwR4XvYNaJFHpAvSt
...
removed zb2rhdQWSLuJLhMQ7kwEpRT3pGHeHcMbHjFBJo7DmfNGiWfSF

>du -sh blocks
67K	blocks

>ipfs add --pin=false -t random
added QmYDL6Rnkhhog6NFwykAtAD2uT5R62PVBB4F9t85vUX6NM random

>ipfs get --output=validate QmYDL6Rnkhhog6NFwykAtAD2uT5R62PVBB4F9t85vUX6NM
Saving file(s) to validate
>cksum validate
2009307347 26214400 validate

>ipfs repo gc
removed QmWLPNh6mVuqsxpWTiW7YNaGWbMjmxfUUweJ18xaiBfn1s
...
removed QmdtwfJv8g6FQxAnHFVPpNxiasitZGvGzoVtSuScswQZfG

>du -sh blocks
79K	blocks
@whyrusleeping
Copy link
Member

cc @kevina

@kevina
Copy link
Contributor

kevina commented Jul 8, 2017

@whyrusleeping My code never supported the trickle adder, so it is unlikely there is any support for it in IPFS.

@whyrusleeping whyrusleeping added kind/bug A bug in existing code (including security flaws) help wanted Seeking public contribution on this issue labels Jul 9, 2017
@whyrusleeping whyrusleeping added this to the Ipfs 0.4.12 milestone Aug 31, 2017
@Kubuxu Kubuxu modified the milestones: Ipfs 0.4.12, go-ipfs 0.4.13 Nov 6, 2017
@schomatis
Copy link
Contributor

@whyrusleeping Is this issue still relevant to work on it?

@whyrusleeping
Copy link
Member

@schomatis Yeah, I don't think this has been fixed yet. Go right ahead :)

@schomatis
Copy link
Contributor

@whyrusleeping It would seem that the posInfo structure that signals the creation of the FilestoreNode was added only to the balanced DAG but not to the trickle DAG.

I'll keep researching the issue to see what would imply adding it to the trickle importer.

@kevina
Copy link
Contributor

kevina commented Feb 21, 2018

@schomatis that sounds about right to me. It should be fairly easy to do.

@schomatis
Copy link
Contributor

@kevina If I'm following correctly the original PR I should replicate the SetPosInfo pattern in the trickle importer for Layout and fillTrickleRec, and also for FillNodeLayer (which I'm not seeing a counterpart in the balanced importer), am I missing something else? I'll prepare a patch and let you know.

@kevina
Copy link
Contributor

kevina commented Feb 21, 2018

@schomatis go ahead and create a p.r. but label it a work-in-progress (by prepending the title with "WIP:" or similar) and I will have a look.

@schomatis
Copy link
Contributor

@kevina Regarding the trickle DAG Append family of functions, what should be the value of the offset for SetPosInfo?

I don't see analogous functions for the balanced DAG and I don't know if the offset should be zero or should continue from the offset value of the base node (basen) because I'm not sure if a new file is being appended or if is a continuation of the same (previously added) file (or is that something that should be checked through the fullPath variable?)

@kevina
Copy link
Contributor

kevina commented Feb 23, 2018

@schomatis the offset is the offset from the start of the backing file, so it should likely be the offset of the base node. It might be helpful to look at the filestore code to see how PosInfo is used, in particular readDataObj (https://github.com/ipfs/go-ipfs/blob/master/filestore/fsrefstore.go#L152).

@schomatis
Copy link
Contributor

@kevina Thanks for the pointer, it helps me understand Filestore internals.

What I'm missing now is a use case for the Append function. From what I'm seeing in the code this function is related to the Files API / MFS and the unixfs components, but both those specs are under development (at this time empty), and although there are many discussions open about them in several issues it's hard for me to get a firm grasp on how they work (besides the intuitive concept that they allow to have a UNIX file system mounted on top of IPFS).

@whyrusleeping Maybe you could point me in the right direction to get more information about those subjects. My main concern now is to have a list of commands that would trigger an Append call to learn more about how it works. For now I'm experimenting with a basic ipfs files example I've found (I didn't find any in the official IPFS Examples, maybe after this issue is closed I could add one).

@schomatis
Copy link
Contributor

After reading the example cited above I run the following commands to trigger the Append call:

echo "Plain text file." > file
ipfs add --nocopy file
# added zb2rhdJ19xRwTk3QMZ1GG8vPEh7gZVXMpi6FP7AxanrfrocPR file

ipfs files cp /ipfs/zb2rhdJ19xRwTk3QMZ1GG8vPEh7gZVXMpi6FP7AxanrfrocPR /file
echo "Added content." | ipfs files write -o 100 /file
# 14:56:22.085 ERROR cmds/files: seekfail: expected protobuf dag node files.go:772
# Error: expected protobuf dag node

The error is due to a check in Append for a ProtoNode (where in this case the node is a RawNode), the check is not commented.

Thinking about this it makes sense this append operation fails because Filestore won't have a concrete file to back the appended data (except the stdin data received by the ipfs files write is backed up in a separate file, although it wouldn't appear so).

My question then is if Filestore should support the append operation or if the SetPosInfo should be added only to the builder function Layout, resting on the fact that Append will fail for --nocopy nodes (and the posInfo structure won't be needed). If this is the case there should be a refactoring of fillTrickleRec and FillNodeLayer that are used by both operations (build and append) for one to support Filestore and not the other.

@schomatis
Copy link
Contributor

Researching some more, the problem is not related to the way the input is passed to the write command, this also fails:

ipfs files write /file file
# Error: expected protobuf dag node

The problem is that the node passed to Append is the node already stored with Filestore as RawNode (and the check mentioned above will always fail).

@whyrusleeping
Copy link
Member

@schomatis I don't think we need to make the trickle append code support filestore. The Add code only uses Layout. I believe you are correct in saying that only ipfs files write uses Append. Looking at your other comments, it looks like it needs to learn how to append to a non-protobuf node. That might get sticky, but I think its probably okay to error out in that case for now.

@schomatis
Copy link
Contributor

Closing in favor of ipfs/boxo#389.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws)
Projects
None yet
Development

No branches or pull requests

5 participants