-
Notifications
You must be signed in to change notification settings - Fork 36
blockstore: Adding Stat method to map from Cid to BlockSize #5
Conversation
I'm looking into the unit test failure... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about the test fail, code LGTM
This should probably return a special type Stat struct {
Size int // -1 means "don't know"
} |
@Stebalien Thank you, I will make this change. -I will rename Stat to GetSize |
If we're going to go with a method dedicated to returning the size, I'd just return I suggested using |
3e9f68b
to
1950860
Compare
@Stebalien Thank you for your suggestions. After giving it some thought I decided on a method dedicated to returning the size. Let me know if you have any questions. I'm happy to share more details and discuss further. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good in general but this'll fail for zero sized blocks (forgot we had those...). We could just memorize all zero-sized blocks (one per hash) but that's a decision I'd like to make later, if ever. Ideally, we should start using identity hashes for zero-length blocks so a zero-length block table wouldn't be very useful.
arc_cache.go
Outdated
return ErrNotFound | ||
} | ||
|
||
b.arc.Remove(k) // Invalidate cache before deleting. | ||
err := b.blockstore.DeleteBlock(k) | ||
switch err { | ||
case nil, ds.ErrNotFound, ErrNotFound: | ||
b.addCache(k, false) | ||
b.addCache(k, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we can actually have zero-sized blocks. We'll eventually "inline" these into CIDs but that's a work in progress. You'll probably need to use -1 to indicate a missing block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stebalien
Done. I pushed out these changes and squashed into a single commit.
1950860
to
2d6e992
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damn I'm bad about catching everything on the first review.
GetSize
should probably return ErrNotFound
when the block isn't found to match Get
.
It would also be nice to have a zero-sized block test (so we don't regress there).
blockstore.go
Outdated
func (bs *blockstore) GetSize(k *cid.Cid) (int, error) { | ||
maybeData, err := bs.datastore.Get(dshelp.CidToDsKey(k)) | ||
if err == ds.ErrNotFound { | ||
return -1, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that Get
returns bs.ErrNotFound
when a block isn't found, this should probably do the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking a look... I'll need to think about this.
Removing that check on Line 191: Triggers three test failures:
--- FAIL: TestPutManyAddsToBloom (0.00s)
bloom_cache_test.go:68: datastore: key not found
--- FAIL: TestGetAndDeleteFalseShortCircuit (0.00s)
arc_cache_test.go:41: get hit datastore
--- FAIL: TestHasRequestTriggersCache (0.00s)
arc_cache_test.go:41: has hit datastore
The two arc_cache_test failures are because:
- The test case assumes no errors when a block is not found via 'arccache.Has'. Removing the check in 'GetSize/Line 191' triggers this failure because arccache.Has has a dependency on GetSize:
func (b *arccache) Has(k *cid.Cid) (bool, error) {
blockSize, err := b.GetSize(k)
return blockSize > -1, err
}
The bloom_cache_test failure is due to similar reason:
- The test case assumes no errors when a block is not found via 'bloomcache.GetSize'.
blockSize, err = cachedbs.GetSize(block2.Cid())
if err != nil {
t.Fatal(err) //<--Test fails here bloom_cache_test.go:68: datastore: key not found
}
if blockSize > -1 || has {
t.Fatal("not added block is reported to be in blockstore")
}
This is what I need to think about a bit more...
Open Question:
Ideally, should 'Has' be dependent on 'GetSize'? or decoupled?
Open Question:
Ideally, should 'GetSize' and/or 'Has' return an error when a block is not found? The current implementation for 'Has' w/ or w/o my changes currently suppresses the error in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solution
I can pull that check out of 'GetSize' on line 191 and add it to 'arccache.Has'
- This would allow for 'Has' to return false instead of an error when a block isn't found.
- This would also allow 'GetSize' to return bs.ErrNotFound when a block isn't found
func (b *arccache) Has(k *cid.Cid) (bool, error) {
blockSize, err := b.GetSize(k)
if err == ds.ErrNotFound {
return false, nil
}
return blockSize > -1, err
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like the right way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stebalien This is done. I pushed out these changes and squashed into a single commit.
Performant way to map from Cid to BlockSize. License: MIT Signed-off-by: Jeromy <[email protected]>
2d6e992
to
d021381
Compare
@Stebalien This is done. I pushed out these changes and squashed into a single commit. |
Performant way to map from Cid to BlockSize.
Helps resolve issue: ipfs/kubo#4378 (comment)
@whyrusleeping @kevina @magik6k
License: MIT
Signed-off-by: Jeromy [email protected]