-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unixfs: integrate pb.Data
into FSNode
to avoid duplicating fields
#5098
unixfs: integrate pb.Data
into FSNode
to avoid duplicating fields
#5098
Conversation
schomatis
commented
Jun 8, 2018
•
edited
Loading
edited
My main question is exactly how to integrate the
|
I'd like to just embed it however, that would expose a lot of internal state. I believe we may want to eventually expose these datastructures as a library so that may not be the best idea. Are you planning on reusing the
|
@Stebalien This was an incomplete PoC PR to get an idea of what it would look like refactoring these structures together, but I'm seeing now that leaving it halfway done actually creates more noise than signal. I'll give it another iteration and I'll ping you back. Thanks for the feedback.
Yes, you're right, it would be better not to export the embedded format structure (leaving it as
Yes, this PR would remove all of the previous
Yes, that's the haflway (not) done part, |
Seems like we could just |
Agreed. I was doubting about doing that, I added it in |
To avoid duplicating fields and making the code easier to follow. Remove all of `FSNode` previous fields in favor on a single `pb.Data` structure that is not exported. Accessor methods are added only for the necessary internal fields. This takes up more memory, `pb.Data` is always created inside `FSNode` and it stays there instead of just being created and destroyed during the (un)marshal operations. The removed fields `Data`, `blocksizes` and `Type` had a direct counterpart in the embedded `pb.Data` structure, in contrast (only) the `subtotal` field doesn't have one, it was used as a temporary accumulator to track the `Filesize`, which is now being kept updated on every modification (to ensure the entire `FSNode` is always at a valid state), so `subtotal` could just be removed without the addition of any other field (this temporary accumulator was obscuring how `Filesize` was computed). To keep `Filesize` up to date a method was added (`UpdateFilesize()`) to adjust its value in the two places where the file size could be modified, when changing its data (in `SetData()`, accessor method added) and when adding or removing child nodes (in `AddBlockSize()` and `RemoveBlockSize()`). A constructor method was added (`NewFSNode()`) to initialize the required fields, like `Type` which is explicitly set, this deprecates the previous methodology of just calling `new(FSNode)` and relying in the default value of `pb.Data_DataType` (`Data_Raw`) to avoid an explicit assignment. Also, `Filesize` is initialized to avoid being left with a `nil` value before marshaling empty nodes, which would result in a different hash from previous versions, to be backwards compatible. Previous versions of `GetBytes()` always set the `Filesize` value, even though it is reflected as an `optional` field in the `.proto` file (this may be an inaccurate field rule). Without the duplicated fields the functions `GetBytes()` and `FSNodeFromBytes()` are now reduced to simple `Marshal()` and `Unmarshal()` operations respectively. License: MIT Signed-off-by: Lucas Molas <[email protected]>
@Stebalien Could you please take another look? |
pb.Data
into FSNode
to avoid duplicating fieldspb.Data
into FSNode
to avoid duplicating fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely prefer this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM