Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hermetic chmod/touch/id in python_repository #2016

Open
mattyclarkson opened this issue Jun 26, 2024 · 5 comments · May be fixed by #2024
Open

Hermetic chmod/touch/id in python_repository #2016

mattyclarkson opened this issue Jun 26, 2024 · 5 comments · May be fixed by #2024

Comments

@mattyclarkson
Copy link
Contributor

mattyclarkson commented Jun 26, 2024

🚀 feature request

Relevant Rules

python_repository

Description

We currently call out the to system chmod, touch and id binaries.

We could do the equivalent functionality with Python.

Describe the solution you'd like

The Python interpreter is downloaded in python_repository. If we provide a script as a private attribute to the repository rule, we can perform the readonly permissions change with the downloaded Python interpreter.

Describe alternatives you've considered

None. Python has the functionality to perform the chmod/touch/id.

@aignas
Copy link
Collaborator

aignas commented Jun 27, 2024

I am not sure we can do it:

  1. The chmod is there to make the filesystem read-only to avoid having pyc files in the toolchain and we run the python interpreter we have just downlaoded in the said repository, then we will create pyc files, which make the builds non-deterministic.
  2. Touch is just a test that checks that we have succeeded in creating a read-only directory, we could use Python for that, maybe? However, I think it would be better to change logic to attempt to delete a file using repository_ctx.delete.
  3. Maybe it is possible to use Python here.

What is the problem here that you are trying to solve? I'm thinking that we are missing context here (see xy problem)

@mattyclarkson
Copy link
Contributor Author

mattyclarkson commented Jun 27, 2024

then we will create pyc files

For the chmod step, I thought we could use -B or PYTHONDONTWRITEBYTECODE to prevent writing the .pyc files.

I think it would be better to change logic to attempt to delete a file using repository_ctx.delete.

Agreed. That would be a good solution.

What is the problem here that you are trying to solve?

Apologies, I should have been more explicit in the description.

In our organisation, we run our Bazel inside a container that has zero core utilites installed to prove that our builds are truely hermetic. We resolve our core utilties through rules_coreutils. This helps us avoid version differences in core utilities or the GNU/BSD differences. Unfortunately, for rules_python it means that we hit the chmod/id/touch missing binaries.

In this case, it is entirely nitpicky: the usage is POSIX compliant. However, it seems that we could get away with doing it with Python, remove the dependency and make the download hermetic with respect to the host system.

@aignas
Copy link
Collaborator

aignas commented Jul 1, 2024

One reservation that I have is that it suffers from the same problem - how do you handle downloading for the platform that is not the host platform. Then you need a reference to the host interpreter. That said, only the host interpreter needs this special treatment, because the other ones just get packaged up without being used in any way.

I personally think that this could be an interesting approach to dropping some system deps.

PRs welcome. :)

@mattyclarkson mattyclarkson linked a pull request Jul 1, 2024 that will close this issue
@rickeylev
Copy link
Contributor

re: -B flag and writing bytecode

It's fine for the pyc to be generated at repo-rule time, for a few reasons:

  1. As far as loading/analysis phase is concerned, they're just source files, so they're static and unchanging. The non-determinism concern is if they're generated at build time.
  2. The globs for the runtime's files ignore pyc files. This prevents them from being included in build analysis (the runtime might still use them if it isn't e.g. sandboxed)
  3. Since the originating .py files don't change, Python will use the existing pyc and not generate a new one.

We could probably precompile the whole stdlib using //tools/precompiler at repo install time and then we could drop the .pyc exlcusion from the globs

@mattyclarkson
Copy link
Contributor Author

We could probably precompile the whole stdlib using //tools/precompiler at repo install time and then we could drop the .pyc exlcusion from the globs

Agree. Would we be happy with a follow up issue/PR? Does making the current setup hermetic satisfy this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants