Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow data transfers with more than 8192 received CIDs #118

Merged
merged 5 commits into from
Nov 17, 2020

Conversation

hannahhoward
Copy link
Collaborator

Goals

Fix filecoin-project/lotus#4856

Implementation

This is a first attempt to fix 4856 -- it solves the issue in a protocol non-blocking way. There are more ideal ways to solve the issue but they involve changes to graphsync and require changes on the network.

What this does is remove the ReceivedCids list from the stored state for the channel, and it puts them instead in an append only file.

This has the effect of not only removing the limit but significantly reducing io & memory consumption relating to loading and saving these CIDs. The list is simply left in the file except when its needed for a restart.

The files are placed in a directory, indexed by channel ID.

integrate cid lists as a migraiton to replace the previous cidlist stored in the state
use different tempdirs per peer, don't delete files for received cids lists
Copy link
Contributor

@dirkmc dirkmc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits but otherwise LGTM 👍

From a design perspective my only concern is whether we would need to change any lotus tooling to be aware of the new directory needed to store cid lists. I'm not sure if this is going to be an issue in practice, just flagging in case it's something we need to think about.

}

// CreateList initializes a new CID list with the given initial cids (or can be empty) for a data transfer channel
func (cl *cidLists) CreateList(chid datatransfer.ChannelID, initalCids []cid.Cid) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (cl *cidLists) CreateList(chid datatransfer.ChannelID, initalCids []cid.Cid) error {
func (cl *cidLists) CreateList(chid datatransfer.ChannelID, initialCids []cid.Cid) error {

}
}

// DeleteList deletes the lsit for the given data transfer channel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// DeleteList deletes the lsit for the given data transfer channel
// DeleteList deletes the list for the given data transfer channel


func TestCIDLists(t *testing.T) {

err := os.MkdirAll("_test", 0755)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe create _test in a tmp directory, then we don't need the .gitignore?

@@ -97,7 +96,8 @@ func (c channelState) Voucher() datatransfer.Voucher {

// ReceivedCids returns the cids received so far on this channel
func (c channelState) ReceivedCids() []cid.Cid {
return c.receivedCids
receivedCids, _ := c.channelCIDsReader(c.ChannelID())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return the error from ReceivedCids?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just cause it's an API change -- and there's an interface that expects no error. But I will log it.

return err
}
defer func() {
_ = f.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be worth returning the error on Close. It seems like the caller would want to know if the file couldn't be closed.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of a late review: I'm concerned about the perf of opening a file per CID fetched.

An alternative approach would be to just write to the datastore. That is, write /transferprefix/TRANSFER_ID/cids/CID. We can then extract/delete the list with a range query. However, I'm also not entirely sure about the performance here either.

return c.receivedCids
receivedCids, err := c.channelCIDsReader(c.ChannelID())
if err != nil {
log.Error(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider propagating this up and letting the caller handle it, but logging it and moving on is also quite reasonable as we have "no information" in this case.

}

// AppendList appends a single CID to the list for a given data transfer channel
func (cl *cidLists) AppendList(chid datatransfer.ChannelID, c cid.Cid) (err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this going to be really slow if we have to do this per cid? I'd suggest having some form of batching.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most files will be big, and the default lotus chunking is 1MB, so this should be usable

But it's still pretty far from ideal

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

File transfers over 8GB will fail in data transfers, or less depending on data chunking strategy
4 participants