Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New experimental_shell_command #12878

Merged
merged 19 commits into from
Sep 23, 2021
Merged

New experimental_shell_command #12878

merged 19 commits into from
Sep 23, 2021

Conversation

kaos
Copy link
Member

@kaos kaos commented Sep 14, 2021

The new experimental_shell_command added to the shell backend allows running arbitrary commands during pants execution.

This target is for introducing side effects in the build, either as new or modified files, calling out to external services or managing some other state. It remains important to ensure idempotency however, as the command may be cancelled or retried on the sole discretion of Pants.

For those familiar with Bazel, the experimental_shell_command has similarities with the Bazel genrule.

Fixes #3734

Example BUILD file usage:

shell_library(name="build-tools")
experimental_shell_command(
  command="./build-util.sh -o output do-things",
  tools=["bash", "env", "cat", "curl", "tar"],
  outputs=["output/"],
  dependencies=[":build-tools"],
)

The dependencies will pull in scripts from shell_library, arbitrary files from files and other experimental_shell_command targets, the outputs lists directories and files to capture, which may be included by consuming targets, and tools lists all required executables that command may be using.

The [shell-setup].executable_search_paths option is used when finding the specified tools.

Signed-off-by: Andreas Stenius <[email protected]>

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@kaos
Copy link
Member Author

kaos commented Sep 15, 2021

@stuhood any thoughts on the direction of this?

@stuhood
Copy link
Member

stuhood commented Sep 15, 2021

Sorry for the delayed review! Excited about it, but haven't taken a look yet. Will do today.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! This looks like the right track. Allowing gen_rule outputs to be consumed by other targets as Files is probably the most critical bit to implement.

src/python/pants/core/util_rules/gen_rule.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule_test.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
@stuhood
Copy link
Member

stuhood commented Sep 16, 2021

Also curious whether @Eric-Arellano thinks this might need/want to integrate with shell_library.

@kaos kaos marked this pull request as ready for review September 17, 2021 12:38
Copy link
Member Author

@kaos kaos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy to rename gen_rule => shell or something like that. Apart from that, I feel this is getting pretty close to done.

I don't think we have to consider the difference in glob behaviour between git style and shell style too much. As long as it is clear that the command line is parsed by a shell, it shouldn't be any surprises there.

We could potentially however provide more data in the environment, regarding inputs and outputs etc.

src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule_test.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule_test.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule_test.py Outdated Show resolved Hide resolved
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really great: thanks a lot!

I think that this is ready to land with an experimental_ prefixed name.

src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/gen_rule_test.py Outdated Show resolved Hide resolved
@Eric-Arellano
Copy link
Contributor

Also curious whether @Eric-Arellano thinks this might need/want to integrate with shell_library.

What might that look like? I think it makes sense to be separate. Fwict, shell_library describes shell source code checked in to disk. Iiuc, this new target does not require anything on-disk necessarily.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I looked mostly at the modeling of the target, not the actual rule implementation

src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved
src/python/pants/core/target_types.py Outdated Show resolved Hide resolved

class GenerateFilesFromGenRuleRequest(GenerateSourcesRequest):
input = GenRuleSources
output = FilesSources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that Files are not (currently) included when building a pex_binary and python_awslambda. They're mostly helpful for tests and for archive. Is that okay?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I'd like the output to be as generic as possible, is plains Sources better, then?
Or rather, I want to make sure that whatever is produced by the shell_command (if we call it that) may be used as sources by another target.

Not sure if simply depending on a shell_command, from say a python_library is enough, or if you should include the output files from shell_command in the python_library sources field? But that may complain that the "glob doesn't match" any files, until the shell_command has been executed, if that is an issue?

@kaos
Copy link
Member Author

kaos commented Sep 21, 2021

Seing now this as a (experimental_)shell_command, maybe it ought to live in the shell backend?

@kaos
Copy link
Member Author

kaos commented Sep 21, 2021

I just realize that I really doesn't like to use $tool instead of just tool. That puts a lot of constraints on the script to run, and also complicates if you want to run it outside of a pants execution.

Just take the leading shebang, for example (is what prompted me down this lane to begin with):

#!/usr/bin/env bash
...

This won't work, but should instead be just

#!$bash
...

Which doesn't work either, as that line is not expanded.

So, I propose that we setup a .tools/bin/ folder, and symlink all requested tools in there, and list that in the search path PATH=/path/to/sandbox/.tools/bin so that ordinary env works, etc.

Edit: hmm... can I create symlinks in a Digest? otherwise, I guess this doesn't work, or?

Signed-off-by: Andreas Stenius <[email protected]>

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
Signed-off-by: Andreas Stenius <[email protected]>

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
@kaos
Copy link
Member Author

kaos commented Sep 21, 2021

  • I moved the whole implementation of experimental_shell_command to the shell backend.
  • Added a new field log_output, anticipating that it will be good to be able to print something informative from the command, in case it is only side-effecting without any outputs captured, at least.
  • Sets up a .bin dir with symlinks to all tools specified for use, and points PATH at that .bin dir, so we can use cd, cat etc without having to use parameter expansion from the environment on them.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really fantastic: thanks a lot @kaos!

)

command_env = {
"TOOLS": " ".join(tools),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unlikely, but tools might have spaces in their paths. Should shlex.escape them before joining them.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome, great work!

@@ -33,22 +33,22 @@ def rules():
*package.rules(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the ordering :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was hurting my eyes, and C-space C-s ] C-p M-x so-li enter is a mere 2.5 second fix :P

help = dedent(
"""\
Execute any external tool for its side effects.
This may be retried and/or cancelled, so ensure that it is idempotent.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, should probably move below the example. This is a nuanced detail that doesn't explain what this target is, only a limitation when using it. The example is really helpful to figure out what it is.

Might also be worth adding a paragraph below the example explaining how you can depend on shell_library targets to run its script, or you can directly invoke Bash commands like touch my_file.ext.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Another key detail missing from this help, you must add this target to the dependencies of each consumer, such as your python_tests. When relevant, Pants will run your command and insert the output_files into that build's context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might also be worth adding a paragraph below the example explaining how you can depend on shell_library targets to run its script

Oh, wait, I don't think I fully understand what you mean here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, could we get around this limitation, at least to some extent, if we implement the GenerateTargetsRequest union for shell_command as well (and generate file targets for any output files)?

Copy link
Member

@stuhood stuhood Sep 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, only being able to bring in all of the output files of the script is totally fine for now. And files is almost certainly the right output type.

GenerateTargetsRequest wouldn't be able to fully determine which files existed without actually running the script, because directory/ entries include the children of the directory.

(@Eric-Arellano : Which reminds me: we probably ought to document an expectation that target generation runs quickly, and/or is stably cacheable, since it will run for all graph introspection)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait, I don't think I fully understand what you mean here.

What I mean is that if you set your command to be ./my_script.sh, you should add a shell_library target which includes my_script.sh in its sources and include that in the dependencies field.


class GenerateFilesFromShellCommandRequest(GenerateSourcesRequest):
input = ShellCommandSources
output = FilesSources
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To respond to the earlier thread: I think this is the right choice for now to start, rather than Sources.

Note also that callers must set enable_codegen=True for this codegen to happen.

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@stuhood
Copy link
Member

stuhood commented Sep 22, 2021

With a fresh description and title, I think that this is probably ready to land! Thanks again.

Comment on lines +251 to +252
"Execute any external tool for its side effects.\n"
+ dedent(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay thanks for doing that formatting :) This is great because Pants wraps lines based on your terminal width, but we need to use implicit string concatenation for that to work properly.

(The docs also work better when using implicit string concatenation so that the browser can wrap for you.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I remember you told me this once before ;)

@kaos kaos changed the title gen_rule poc. New experimental_shell_command Sep 22, 2021
@kaos
Copy link
Member Author

kaos commented Sep 22, 2021

Right, I forgot there is one more thing I'd like to include here before I feel it is ready to go.. that is I'd like to use the executable-search-paths option from the shell-setup subsystem, rather than using a hard coded list of the most common search paths.

Signed-off-by: Andreas Stenius <[email protected]>

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
Comment on lines 230 to 231
"and these tools must be found on the paths provided by "
"[shell-setup].executable-search-paths (which defaults to the system PATH)."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it. Great detail!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am so excited for this!

Random nit: Don't option names in config have to use underscores? i.e., executable_search_paths? We let them be with dashes on the cmd line (--shell-setup-executable-search-paths) because --shell-setup-executable_search_paths looks awful, but I think we at least encourage underscores in option names (but dashes in scope names, so it's still shell-setup).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah true, I think it needs to be underscores for the option part

stuhood and others added 2 commits September 22, 2021 12:47
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@kaos kaos enabled auto-merge (squash) September 22, 2021 20:14
@kaos
Copy link
Member Author

kaos commented Sep 22, 2021

I copied the PR description in as merge commit message, kept author info. I assume you can disable, and fixup if that is not what you want? :)

@Eric-Arellano
Copy link
Contributor

I copied the PR description in as merge commit message, kept author info. I assume you can disable, and fixup if that is not what you want? :)

That sounds good! The most important part is the PR title being useful for the changelog, which this is. (Nit that our changelog is Markdown, so escaping with ` is useful)

For the description, copying in the PR description is great too. And totally fine to include author info. Also you can copy the [ci skip-rust] part when it's safe to.

@kaos kaos changed the title New experimental_shell_command New experimental_shell_command Sep 22, 2021
Signed-off-by: Andreas Stenius <[email protected]>

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@kaos kaos merged commit 2da07b6 into pantsbuild:main Sep 23, 2021
@kaos kaos deleted the issue/3734_gen_rule branch September 23, 2021 09:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

genrule support via isolated processes
4 participants