-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Files API root as best-effort pin #2872
Conversation
if !set.Has(k) { | ||
set.Add(k) | ||
child, err := ds.Get(ctx, k) | ||
if err == ErrNotFound { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets do something like
switch err {
case ErrNotFound:
continue
default:
return err
case nil:
break
}
@whyrusleeping see my line note on your second comment. There are two versions for you to chose from. Please do not merge this as is. I will squash this all down to one commit once we decide what works best. |
@kevina i think the latter option of passing the array of best effort root keys is the better of the two for now. My other thought was to enumerate the best effort set before calling gc, and pass that into the gc call as a 'extra keys that arent garbage' parameter. But i think that might be a little more difficult and uglier overall |
Closes ipfs#2697. Closes ipfs#2698. License: MIT Signed-off-by: Kevin Atkinson <[email protected]>
@whyrusleeping, okay, ready to merge once you are happy with it. |
test_expect_success "object not removed after gc" ' | ||
echo "hello world" | ipfs files write --create /hello.txt && | ||
ipfs repo gc && | ||
ipfs cat QmVib14uvPnCP73XaCDpwugRuwfTsVbGyWbatHAmLSdZUS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also add another test_expect_success
to make sure that /hello.txt
is still accessible through the files api here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whyrusleeping Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So... if this is "best effort", under what conditions should gc remove these, and can we make sure to test that here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All "best effort" means is not to abort if a child is unreadable. For the GC to remove the object will need to be detached somehow from the MFS Root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevina want to take this one on and add a few more tests?
- Test that the read test below outputs the right text
- Exercise a situation that uses the best effort pin:
- Add a directory containing a file with
--pin=false
- pin the hash directly
- run a gc so that file is no longer accessible
ipfs files cp <dirhash> /somewhere
(this should work just fine)ipfs repo gc
should succeed- re-add the file that was originally in that directory with
--pin=false
- run
ipfs repo gc
again, that file shouldnt be removed.
- Add a directory containing a file with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevina want to take this one on and add a few more tests?
@whyrusleeping See #3151
…pin. License: MIT Signed-off-by: <[email protected]>
This LGTM. @Kubuxu wanna take a look over too? |
SGTM |
@@ -69,7 +70,7 @@ func GC(ctx context.Context, bs bstore.GCBlockstore, pn pin.Pinner) (<-chan key. | |||
return output, nil | |||
} | |||
|
|||
func Descendants(ctx context.Context, ds dag.DAGService, set key.KeySet, roots []key.Key) error { | |||
func Descendants(ctx context.Context, ds dag.DAGService, set key.KeySet, roots []key.Key, bestEffort bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i strongly dislike how this is implemented. Adding bestEffort
here to this function ruins the whole abstraction. You can implement this "best effort" notion outside of this function, just in the GC function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbenet in this function (Descendants) and EnumerateChildren all that bestEffort means is to not abort if a child is unreadable. In order to collect the set of keys to keep on a "best effort" bases I needed a way to to collected all the children of the initial best effort set that are local on the node.
I am not entirely clear what abstraction I am ruining so I am not 100% sure how to fix this. I could create a new function to find the Descendants of a keySet that doesn't abort on an error, but to me it seamed better to extend the functionally of the existing function.
@whyrusleeping I could appreciate any insights you can offer.
|
So to be clear all "best effort" means (as agreed to by @whyrusleeping and @Kubuxu) is to simply keep all children of the best effort pin that are reachable from the local node. (In other words if a child in unavailable simply skip it, rather than aborting or attempting to download it). Nothing More. The GC will never remove "best effort" pins. |
(Thinking out loud) Oh wow, that means something totally different, then. "best effort" in computer science tends to mean "up to best abilities, but not unreliable". Specifically, "best effort" in networks. Applied to pins, i took it to mean "pin a subgraph while you can, but GC if you need to", sort of like a "gc these last" type of thing. What you describe ("pin the graph objects currently present in the node") It's a bit concerning, too because it has unpredictable guarantees and consequences, in a very different way than "unreliable (best effort)"-- because how much is pinned forever depends entirely on order of events, which is very hard to reason about.
As it is now, I think your "objects that are currently local" pins can be turned into a other tools, and use regular pin sets:
Thinking about it another way, let's rename "best effort" into "incremental". Because that's what it really is, right? it's an incremental pin that over time will pin more and more of a subgraph, as it's accessed and retrieved. This has different use cases than i mentioned above (eg "incrementally pin" all my local maps' area in a google maps like thing, or "incrementally pin" a dataset i'm exploring, or a dungeon in a dungeon crawler). Is this more what you mean? Would be very useful to me to hear what use cases you have in mind for this. |
@jbenet, please see #2698 for context. @whyrusleeping actually came up with the name "best effort" and I just used it. |
@jbenet this isnt a separate pin type. All this is is a way to make the files api space not break when you run a GC, but also not require that everything in the files API be local. |
This fixes #2697 and #2698.