Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make chmod hermetic #2024

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mattyclarkson
Copy link
Contributor

Removes the need for system binaries to make the Python interpreter library directories/files read-only.

Fixes #2016

@mattyclarkson mattyclarkson force-pushed the chmod-touch-id branch 2 times, most recently from 5830642 to 7a71c56 Compare July 1, 2024 15:21
@mattyclarkson mattyclarkson changed the title fix: make Python repositories chmod/id/touch` hermetic fix: make Python repositories chmod/id/touch hermetic Jul 1, 2024
python/repositories.bzl Outdated Show resolved Hide resolved
python/repositories.bzl Outdated Show resolved Hide resolved
python/private/touch.py Outdated Show resolved Hide resolved
@@ -193,15 +205,17 @@ def _python_repository_impl(rctx):
if "windows" not in platform:
lib_dir = "lib" if "windows" not in platform else "Lib"

python = _get_host_python_interpreter(rctx)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the python_bin var that is defined below instead of using the host_python. And we should do the read-only fix only if the host platform the same as the platform of the toolchain.

Could we recursively change this for windows as well? Right now we are calling this only if it is not Windows, but do the scripts work on Windows?

Copy link
Contributor Author

@mattyclarkson mattyclarkson Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.chmod says:

Although Windows supports chmod(), you can only set the file’s read-only flag with it (via the stat.S_IWRITE and stat.S_IREAD constants or a corresponding integer value). All other bits are ignored.

So, yes, it would remove write access on Windows.

os.getuid is not available on Windows but I can find a fallback for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The chmod.py does set the folders and files to read-only on Windows, but does not prevent new files being created in the directories. So the same behaviour as Unix is not present. We would have to look into Windows permissions to restrict that.

The getuid could return the Window Security ID for the user but that doesn't really help much. I've removed the touch check by checking platform.

We now run the scripts on Windows, but they don't really do much and actually fail to prevent .pyc files being written. Which is no different to before this patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can prevent writing files into directories on Windows with security permissions. That could be done by calling out to icacls.exe (in leiu of the win32security dependency). That will likely take significant extra testing as I'm not a Windows expert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aignas would you like me to continue figuring out truely read-only directories on Windows or does this PR move the needle enough to be considered for merge? I'd be happy to raise follow up issues/PRs, if necessary.

@mattyclarkson mattyclarkson force-pushed the chmod-touch-id branch 2 times, most recently from e9b91f5 to 126abef Compare July 2, 2024 09:27
@aignas
Copy link
Collaborator

aignas commented Jul 17, 2024

@rickeylev, what do you think about this? How do we move with this forward? Should we just start ignoring the pyc files in the repo and not bother with making it read-only?. The making of the external repos read-only is something that we don't do in the whl_library (maybe we should) and I am wondering what is the overall strategy here...

I am thinking that having the chmod, id and touch be hermetic might be ignoring the overall picture of what we are trying to do here.

@rickeylev
Copy link
Contributor

what is the overall strategy here

So, correctness wise, there are two things we're constrained by:

  1. The input files (both content and the list of them) have to be stable.
  2. The downloaded python binary may not be runnable by the host that downloaded it, such as in a cross-build with RBE situation.

(1) can be accomplished by any of:

  • a. Making everything read only.
  • b. Generating all the necessary files at repo install time.
  • c. Excluding files that might be created at runtime

I think we should do (a) regardless -- its just defense in depth. For the case of Windows (or other failures) we should just be OK with leaving things writable. It's the best we can do, and users will just bypass it entirely with ignore_root_error=True anyways (they are given the choice between "guaranteed pedantic failure" and "probably works, but flaky"; they'll always choose the latter).

I think (b) is better than (c), but it's out of scope for this PR and we already have (c) in place.

(2) throws a wrench into what this PR is doing -- using the downloaded runtime itself to run commands to make things read only. So I'm not really keen on that overall. I don't mind having an alternative implementation of chmod for OP's case where coreutils aren't available. Ideally, we'd either have a repo-phase reference to a host-compatible Python, or a repo-phase reference to a host-compatible chmod tool.

So, more concretely, I ask for the following changes:

  • Use the system chmod if available, otherwise use the downloaded-python's chmod. If neither is available, oh well; continue on.
  • Remove the fail() code paths. Just chmod and trust that things are OK.

mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
@mattyclarkson mattyclarkson changed the title fix: make Python repositories chmod/id/touch hermetic Make chmod hermetic Aug 1, 2024
@mattyclarkson mattyclarkson marked this pull request as draft August 1, 2024 11:35
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
mattyclarkson added a commit to mattyclarkson/rules_python that referenced this pull request Aug 1, 2024
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
Uses `uutils/coreutils` on most platforms, falls backto host `chmod` otherwise.

Removes failure checking[1] as we now "trust" that things are OK.

[1]: bazelbuild#2024 (comment)
@mattyclarkson mattyclarkson marked this pull request as ready for review August 1, 2024 13:14
@mattyclarkson
Copy link
Contributor Author

I've updated to use uutils/coreutils and removed the fail paths.

@@ -106,6 +106,71 @@ def is_standalone_interpreter(rctx, python_interpreter_path, *, logger = None):
logger = logger,
).return_code == 0

def _chmod(rctx, platform = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speculation, that needs to be measured: The way this is implemented it would fetch and extract the chmod binary for each python version. It would use the bazel repository cache, but having the chmod in a separate repository might be better from the performance point of view?

In general, using the chmod that is hermetic is an interesting idea, but I wonder if we should use the system chmod if it exists and if it doesn't, download the hermetic chmod. That would also satisfy the ask to support your usecase where system chmod is not available. All of this could be done in a separate host_chmod repository, we could even switch the behaviour to default to the hermetic chmod if an env var is set to a particular value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hermetic chmod/touch/id in python_repository
3 participants