-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Windows] Handle paths on Windows #3322
Conversation
16e6f90
to
69f46a5
Compare
a916dba
to
36b3177
Compare
util/system/path_unix.go
Outdated
import ( | ||
"path/filepath" | ||
) | ||
|
||
// CheckSystemDriveAndRemoveDriveLetter verifies that a path, if it includes a drive letter, | ||
// is the system drive. This is a no-op on Linux. | ||
func CheckSystemDriveAndRemoveDriveLetter(path string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this function, and it looks like code was forker from moby/moby, e.g. this one came from https://github.com/moby/moby/blob/487fa4a469b4fc77e1706f14acbd97c06383e754/pkg/system/path.go#L33
Should we look at unifying these packages to prevent maintaining two distinct implementations (and at risk of diverging?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me! Any suggestions in regards to which package this would best be placed in?
Edit: I have a feeling there are several functions that are probably copied over from docker. As we add Windows support to buildkit I think it would be useful to move those as well (if/as we find them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have to give it a look; my eye just dropped on this one, and the name of this function looked very familiar 😂
I also may be ruffling a hornets nest here 🙈
Some quick background (might look more in-depth later);
- I know code has been forked / copied from docker to BuildKit, some of that related to there being a circular dependency between docker and buildkit (which can be "fun"), and packages in the docker repository not always being "easy" to consume (there's various packages that had dependencies on too many other things, which could cause dependency hell).
- While understandable; it's also slightly risky; implementations may start to diverge, and (due to the circular dependency) in some cases leads to two implementations both being used in the same codebase. If both implementations have the same (similar) requirements, we should definitely try to avoid that.
- I've been untangling some of that in the docker repository (still some work in progress), so that packages that have known consumers had no (or very little) dependencies.
- In some cases, we decided to move those packages to be their own repository and module (e.g., https://github.com/moby/sys, which provides some modules, or https://github.com/moby/term)
- A separate repo (and/or module) is definitely the easiest way to consume them, but not all packages may make sense to be a "project on their own"); so that would have to be looked at.
- If it doesn't make sense as an independent module, we can still look what the best home is for it (which could be either inside the BuildKit repository, or in the Docker repository).
- We're in the process of migrating docker to be a proper go module (which will solve some things, but may complicate some others, e.g. having a "go.mod" can force some things and be slightly less flexible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about adding a new module in the https://github.com/moby/sys called something like path
, or util
. It seems that both moby and buildkit have the system
package copied over. There are probably a lot more of these small packages/functions re-implemented in various places.
Should we do this as part of this PR or should we open an issue and move it as part of a different change? What do you think would be the better way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the time being, I think this can live in buildkitd
. I will be happy to move this somewhere else when we decide where the best place for it would be. I am still new to both the buildkitd
and moby
code base, so I am not sure where that place would be 😄 .
client/llb/fileop.go
Outdated
@@ -321,9 +327,13 @@ type fileActionMkfile struct { | |||
} | |||
|
|||
func (a *fileActionMkfile) toProtoAction(ctx context.Context, parent string, base pb.InputIndex) (pb.IsFileAction, error) { | |||
normalizedPath, err := normalizePath(parent, a.file, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you are repeating the normalization of the path over and over would it make sense to move the error handler as well to the function rather than copy and pasting the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to skip needing all the path parsing and handeling would be to use GetFullPathNameW.
It does, however, expand relative paths bases on the process's current working directory, and I dont think you would want to have to make temp-dirs or change the working directory for all go-routines in the current process, so you may still need to do some processing to identify and rectify relative paths.
func GetFullPathName(p string) (string, error) {
p0, err := syscall.UTF16PtrFromString(p)
if err != nil {
return "", err
}
n, err := windows.GetFullPathName(p0, 0, nil, nil)
if n == 0 {
return "", err
}
p1 := make([]uint16, n)
_, err = windows.GetFullPathName(p0, n, &p1[0], nil)
if err != nil {
return "", err
}
return syscall.UTF16ToString(p1), nil
}
488e4aa
to
c379286
Compare
Most of the path parsing in buildkit is a "translation layer" that converts what the container sees, to the corresponding physical path on the host. So we end up stripping away drive letters and joining what is left to what we know is the filesystem root as seen by the host. There's even some handling for symlinks and their targets. I am not sure if |
Hey, I wonder is there still blocking on this pull request? |
Not as far as I am concerned. Most of this PR is just replacing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not taking a look at this a bit sooner, I've had it on my backlog for weeks 👀
Aside from the inline comments, I'm curious about general use of the path
package - is this something we need to avoid using in the future when manipulating file paths? If so, I'm guilty of a few instances (especially around some of the attestations components). Is it worth attempting to deny-list this package (or some of it's functions) at the linter level, to prevent regression?
return errors.Wrap(err, "removing drive letter") | ||
} | ||
|
||
d.state = d.state.Dir(withoutDriveLetter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This scans odd to me. I'm not sure why we have two separate ways for setting the path - the path in the state and the path in the config should be equivalent (this is also in the original).
I think we should restructure this so that we pass the result of NormalizeWorkdir
directly into d.state.Dir
(I think this is equivalent in behavior as long as NormalizeWorkdir
always returns an absolute path?). Then we don't have two separate ways of computing the working directory.
If we're doing that, I think we can also push down the call to CheckSystemDriveAndRemoveDriveLetter
into NormalizeWorkdir
? Ideally, it would be nice to have as little windows-specific stuff in the dockerfile frontend code, since it then makes it harder for other frontends to also support windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. And would address @tonistiigi 's comment about basing this on the dispatchstate
platform. We could decide whether or not to strip drive letters based on the target platform. Will change shortly.
@@ -1203,7 +1215,10 @@ func dispatchCopy(d *dispatchState, cfg copyConfig) error { | |||
commitMessage.WriteString(" <<" + src.Path) | |||
|
|||
data := src.Data | |||
f := src.Path | |||
f, err := system.CheckSystemDriveAndRemoveDriveLetter(src.Path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of propagating this call throughout dispatchCopy
, WDYT about pushing it down into llb.Copy
(or fileActionCopy
)? I'm not sure, I'm not as familiar with that code, but to me it seems like then we'd avoid having to copy this block everywhere, and other frontends would also automatically get the path fixes for Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will look into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of places where the source folder passed to llb.Copy()
is wrapped in filepath.Join("/")
:
buildkit/frontend/dockerfile/dockerfile2llb/convert.go
Lines 1148 to 1153 in 6d9c24d
if a == nil { | |
a = llb.Copy(cfg.source, filepath.Join("/", src), dest, opts...) | |
} else { | |
a = a.Copy(cfg.source, filepath.Join("/", src), dest, opts...) | |
} | |
} |
If this path is a Windows path, this ends up becoming something like: \C:\test
. If the requirement is to make sure the path is absolute before it gets passed to llb.Copy()
here, the assumption being that it is possible this path may be relative in other places, we need to make sure the path is properly converted before we pass it in. Maybe a new function called ToAbolute(path string, platform string)
, that does the right thing based on target platform? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
friendly ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed I had some pending reviews I had forgotten to post 😬
client/llb/meta.go
Outdated
@@ -66,17 +68,22 @@ func dirf(value string, replace bool, v ...interface{}) StateOption { | |||
} | |||
return func(s State) State { | |||
return s.withValue(keyDir, func(ctx context.Context, c *Constraints) (interface{}, error) { | |||
if !path.IsAbs(value) { | |||
if !system.IsAbs(value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Client code should never check for the os of the client itself. A windows client can build linux images and vice versa.
For the abs check, either both os should be checked or detected from the value. Or from the state platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! In this case, the s.GetPlatform(ctx)
function should return the proper platform I suppose. Based on that I could decide how to handle the actual path.
wd := c.Path | ||
if !path.IsAbs(c.Path) { | ||
wd = path.Join("/", d.image.Config.WorkingDir, wd) | ||
withoutDriveLetter, err := system.CheckSystemDriveAndRemoveDriveLetter(c.Path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this logic should be based on the dispatchstate platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I attempted to parametrize the target platform as much as I could. Let me know if the latest commit is what you had in mind.
No worries! Had that happen to me a couple of times 😄. Will address the comments ASAP. |
Disclaimer: Really sorry for the wall of text 😄.
It depends. There are places, such as, for example (but not limited to) some public APIs of fsutil, where we need to have unix style paths. In some cases, unix style paths are acceptable even on Windows, as the nice folks over at MSFT have worked hard to improve compatibility with other platforms. You can see an example of the various ways to type in a path in the PR description. There are however places where unix style paths are not allowed on Windows. One such example is the That being said, the We strip the drive letter in a bunch of places, because we want to join that path with the root path of the container FS as seen by the host. Coincidentally, the path For example, if I am on D: Then the path Luckily, we only need to care about But we still need to care about how we handle them, and we need to be able to properly identify an absolute path (as seen by the container, which usually includes a drive letter and which is supplied by the user in a Dockerfile/llb). This won't be doable with standard packages alone. The reason for that is that on Windows, like we've seen, paths are a whole maze of different implementations. As such, in the Go standard library,
That is quite alright. Once we identify a sane way to handle paths, any incongruity can be changed, proper comments and tests can be added to prevent regressions. But as detailed above, getting it right is a bit of a can of worms 😄.
I can't say for sure at this stage. We'll have to see where this PR ends up and decide from there. |
2e0581f
to
7abc3ed
Compare
a05bcb2
to
4d2c75e
Compare
5ea471d
to
6da95e1
Compare
This is interesting. The following failure indicates an exit code a bit foreign to any Linux I am aware of: https://github.com/moby/buildkit/actions/runs/4664892558/jobs/8257778803?pr=3322#step:7:1404 This looks like it's maybe running under WSL? |
Not to impede progress but |
Indeed! The code paths here, mostly deal with transforming paths as seen by the container into paths to the mounted container root FS on the host. It also deals with the When COPY --from=someStage C:/myFile.txt C:/some/path/myFile.txt Which will translate to something like: cp C:\Windows\temp\buildkit-temp1\myFile.txt C:\Windows\temp\buildkit-temp2\some\path\myFile.txt Where the source layer is mounted in The Do you see any reason to |
Makes sense, thanks for the background on that.
Not at this time 👍 |
b52088e
to
c93bead
Compare
Would it help if I attempt to split this PR into smaller PRs? |
c93bead
to
d3c9ca6
Compare
Please take another look when you have some cycles. |
0c07313
to
79dc7d4
Compare
Needs rebase. Is it still a WIP? Some commits look a bit odd to me to clearly understand the scope:
|
Hi @crazy-max
Will do!
Not a WiP anymore. It is, however awaiting another look after the last set of changes.
This PR grew as the conversation evolved. Commits should be squashed, as subsequent commits changed code from previous commits. One example is I can attempt to split this in several PRs. It will remove some of the confusion and narrow the scope a bit. I will propose a new set of PRs to replace this one in the next couple of days. |
This change properly handles paths on Windows, in the context of the WORKDIR, COPY and ADD stanzas. Signed-off-by: Gabriel Adrian Samfira <[email protected]>
Signed-off-by: Gabriel Adrian Samfira <[email protected]>
Signed-off-by: Gabriel Adrian Samfira <[email protected]>
Signed-off-by: Gabriel Adrian Samfira <[email protected]>
Signed-off-by: Gabriel Adrian Samfira <[email protected]>
Signed-off-by: Gabriel Adrian Samfira <[email protected]>
Co-authored-by: Justin Chadwell <[email protected]> Signed-off-by: Gabriel Adrian Samfira <[email protected]>
79dc7d4
to
7854e86
Compare
@crazy-max I opened:
Converted this to draft to avoid confusion, but leaving it as it contains a longer discussion related to changes that were split. Looking forward to any input you may have. |
Closing this, as the work has been merged in other branches. |
This change properly handles paths paths on Windows, in the context of the
ADD
,COPY
andWORKDIR
stanzas. The Windows shell oddly handles a surprising array of path formats:This change makes sure that regardless of the paths used in a
Dockerfile
, buildkit will be able to properly clean it, and use it to set theCwd
in the image spec and properly copy/add files inside the image.This should bring us one step closer to fixing #616