feat: creating files logs information for use in chisel.db #105
When creating files as part of the cut command, we will use a proxy that logs information about the hash, path, mode, etc. This will then be integrated into a report which will be part of chisel.db.
The PR looks nice. I left a few comments and suggestions. Let me know what you think.
Looks trimmed and much better now, thanks!
internal/slicer/slicer.go
Outdated
@@ -153,11 +153,13 @@ func Run(options *RunOptions) error {
 if reader == nil {
 	continue
 }
+fsCreator := fsutil.NewCreator()
Do we need a different fsCreator for every slice?
Yes. That is one of the bits that can no longer be seen in this trimmed down version of the PR. We need to create one for each because we want to report the files created for each one. If we used the same fsCreator we would not know which slice created which file (unless we resorted to emptying the map or some other hack).
This is looking good, thanks Alberto. Just a few details/questions.
internal/deb/extract.go
Outdated
@@ -25,6 +25,7 @@ type ExtractOptions struct {
 	TargetDir string
 	Extract   map[string][]ExtractInfo
 	Globbed   map[string][]string
+	FSCreator *fsutil.Creator
This might be just Creator, given the context, and although this is minor we can use a locally defined interface to allow for variation on the types, even more for testing purposes if required.
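For reference, a minimal sketch of what a locally defined interface could look like. Everything here is illustrative and not taken from the PR: `creator`, `fakeCreator` and `extract` are hypothetical names, and `CreateOptions` is a cut-down stand-in for `fsutil.CreateOptions` that models only the `Path` and `Link` fields visible in this thread.

```go
package main

import "fmt"

// CreateOptions is a simplified stand-in for fsutil.CreateOptions;
// only the two fields visible in this thread are modelled.
type CreateOptions struct {
	Path string
	Link string
}

// creator is a hypothetical locally defined interface. The extract code
// would depend on it instead of on the concrete *fsutil.Creator.
type creator interface {
	Create(o *CreateOptions) error
}

// ExtractOptions mirrors the shape discussed above, with the field
// named Creator and typed as the local interface.
type ExtractOptions struct {
	TargetDir string
	Creator   creator
}

// fakeCreator is a test double that records the requested paths
// without touching the filesystem.
type fakeCreator struct {
	paths []string
}

func (f *fakeCreator) Create(o *CreateOptions) error {
	f.paths = append(f.paths, o.Path)
	return nil
}

// extract is a hypothetical consumer that only needs the interface.
func extract(paths []string, options *ExtractOptions) error {
	for _, p := range paths {
		err := options.Creator.Create(&CreateOptions{Path: options.TargetDir + p})
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fake := &fakeCreator{}
	err := extract([]string{"/etc/passwd", "/usr/bin/ls"}, &ExtractOptions{
		TargetDir: "/tmp/root",
		Creator:   fake,
	})
	fmt.Println(fake.paths, err)
}
```

The trade-off is the one raised in the reply: the interface only pays off once a second implementation (such as a test double) actually exists.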
Agree on the interface but can we do it when the need arises? Right now the tests depend on the default creator and the tree dump function that inspects the filesystem.
Just one trivial comment and a question about the organization, but otherwise this seems ready for merging.
internal/deb/extract.go
Outdated
@@ -58,8 +59,12 @@ func Extract(pkgReader io.Reader, options *ExtractOptions) (err error) {
 if err != nil {
 	return err
 }
+validOpts := *options
+if validOpts.Creator == nil {
+	validOpts.Creator = fsutil.NewCreator()
This does not seem great as it's doing a shallow copy and we have no idea of the consequences. If there's intended mutation going on to the values of the options, those changes won't propagate back to the caller anymore. If, on the other hand, the goal is to not mutate, this won't help in cases where the fields are pointers.
There are multiple options, but the easiest seems to be simply forcing the call site to provide the creator. We don't have many call sites anyway, and tests are already doing this in a custom way I believe.
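To make the shallow-copy concern concrete, here is a small self-contained sketch. `Options` and `mutateCopy` are hypothetical stand-ins, not code from the PR; the struct has one plain field and one map field, like `TargetDir` and `Globbed` in `ExtractOptions`.

```go
package main

import "fmt"

// Options is a stand-in for deb.ExtractOptions with one scalar field
// and one map field, matching the shapes shown in the diff above.
type Options struct {
	TargetDir string
	Globbed   map[string][]string
}

// mutateCopy mimics the `validOpts := *options` pattern: it copies the
// struct by value and then writes to both kinds of field.
func mutateCopy(options *Options) {
	validOpts := *options
	// Assigning a scalar field changes only the local copy.
	validOpts.TargetDir = "/changed"
	// The map header was copied, but it still refers to the caller's
	// map, so this write is visible outside the function.
	validOpts.Globbed["/etc/*"] = []string{"/etc/passwd"}
}

func main() {
	opts := &Options{TargetDir: "/original", Globbed: map[string][]string{}}
	mutateCopy(opts)
	fmt.Println(opts.TargetDir)    // the caller's scalar field is untouched
	fmt.Println(len(opts.Globbed)) // the caller's map gained an entry
}
```

So the copy neither carries intended mutations back out (scalar fields) nor protects against unintended ones (maps, slices, pointers), which is why requiring the creator at the call site is the simpler contract.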
I agree with you. I added it because of your comment about the panic but I agree that, because it is an internal function, we can just add it in each call site.
Yeah, it's a bit of a bind because it'd be fine to either 1) Use a default global value that doesn't increase in memory size over time, or 2) Use a local value that is cheap to create. I'm fine with either, but would prefer to not be creating maps, populating maps, and then destroying (GCing) them on every iteration of a loop. Seems wasteful for no greater good.
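One possible shape for option 2, sketched under assumptions: the map is allocated lazily, so a creator that never records anything costs only a small struct. The `Entry` type, the string key and the `record` helper are hypothetical; the thread only tells us that `Created` is a map that gets ranged over.

```go
package main

import "fmt"

// Entry is a hypothetical record of one created file.
type Entry struct {
	Path string
}

// Creator keeps track of what it created. Created stays nil until the
// first entry is recorded, so an unused Creator allocates no map.
type Creator struct {
	Created map[string]Entry
}

// NewCreator returns a Creator with no map allocated yet.
func NewCreator() *Creator {
	return &Creator{}
}

// record stores an entry, allocating the map on first use. Writing to
// a nil map panics in Go, hence the explicit check.
func (c *Creator) record(e Entry) {
	if c.Created == nil {
		c.Created = make(map[string]Entry)
	}
	c.Created[e.Path] = e
}

func main() {
	c := NewCreator()
	// Ranging over and taking len of a nil map is safe.
	fmt.Println(len(c.Created))
	c.record(Entry{Path: "/etc/passwd"})
	fmt.Println(len(c.Created))
}
```

This does not remove the per-iteration struct allocation in a tight loop; it only avoids creating and discarding a map when nothing is recorded.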
internal/fsutil/create.go
Outdated
@@ -84,3 +123,20 @@ func createSymlink(o *CreateOptions) error {
 	}
 	return os.Symlink(o.Link, o.Path)
 }
+
+// readerProxy implements the io.Reader interface proxying the calls to its inner io.Reader. On each read, the proxy
+// calculates the file size and hash.
s/calculates/keeps track of/
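The wording matters because the proxy accumulates state across reads rather than computing anything in one step. A rough sketch of such a proxy follows; it is not the PR's implementation. The field names, the `digest` helper and the choice of SHA-256 are assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"strings"
)

// readerProxy implements io.Reader by proxying calls to its inner
// io.Reader. On each read it keeps track of the size and hash so far.
// SHA-256 is an assumption; the thread does not name the algorithm.
type readerProxy struct {
	inner io.Reader
	h     hash.Hash
	size  int64
}

func newReaderProxy(r io.Reader) *readerProxy {
	return &readerProxy{inner: r, h: sha256.New()}
}

func (rp *readerProxy) Read(p []byte) (int, error) {
	n, err := rp.inner.Read(p)
	// Only the n bytes actually read are hashed and counted.
	// hash.Hash.Write is documented to never return an error.
	rp.h.Write(p[:n])
	rp.size += int64(n)
	return n, err
}

// digest is a hypothetical helper that drains a string through the
// proxy and returns the tracked size and hex-encoded hash.
func digest(s string) (int64, string, error) {
	rp := newReaderProxy(strings.NewReader(s))
	_, err := io.Copy(io.Discard, rp)
	return rp.size, hex.EncodeToString(rp.h.Sum(nil)), err
}

func main() {
	size, sum, err := digest("hello")
	fmt.Println(size, sum, err)
}
```

The size and hash are only meaningful once the consumer has read the stream to the end, which is another reason "keeps track of" describes it better than "calculates".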
@@ -137,7 +137,7 @@ func extractTar(dataReader io.Reader, targetDir string) error {

 //debugf("Extracting header: %#v", tarHeader)

-err = fsutil.Create(&fsutil.CreateOptions{
+err = fsutil.NewCreator().Create(&fsutil.CreateOptions{
This is in a loop, creating and discarding the creator and its map on every iteration, not even keeping track of it. Is the idea here that the follow up PR will unify all of these calls and track these properly?
The idea is that you have one creator per slice to know what files were created for each one. The other option would be clearing the creator.Created map on each iteration to know which files come from the new slice, which I think is more convoluted to follow. From the report PR:
<loop over slice>
	creator := fsutil.NewCreator()
	err := deb.Extract(reader, &deb.ExtractOptions{
		Package:   slice.Package,
		Extract:   extract[slice.Package],
		TargetDir: targetDir,
		Globbed:   globbedPaths,
		Creator:   creator,
	})
	for _, entry := range creator.Created {
		report.AddEntry(slice, entry)
	}
<end of loop>
That seems fine. What's a bit awkward is the case above where we have a tight loop around individual file extraction using a new creator every time.
I will move that to the outermost possible scope, you are right.
@@ -153,11 +153,13 @@ func Run(options *RunOptions) error {
 if reader == nil {
 	continue
 }
+creator := fsutil.NewCreator()
Same here.
@@ -211,7 +213,8 @@ func Run(options *RunOptions) error {
 	return fmt.Errorf("internal error: cannot extract path of kind %q", pathInfo.Kind)
 }

-err := fsutil.Create(&fsutil.CreateOptions{
+creator := fsutil.NewCreator()
+err := creator.Create(&fsutil.CreateOptions{
... and here.
Please see the replies above as it may be relevant for upcoming PRs, but this seems fine as it is for now. Thanks!
When using the fsutil.Creator, it will store information about the files and directories created. This will then be integrated into a report which will be part of chisel.db in a separate PR.