WIP: Implementation for publish goal #12805

Closed · wants to merge 4 commits

Conversation

@xlevus (Contributor) commented Sep 9, 2021:

Implementation for #8935

Full(er) implementation here: https://github.com/xlevus/pants2-plugins/tree/publish

)

# ???
PublishedPackageSet(packages)

Contributor Author (@xlevus):

Not quite sure what you're meant to do with the resulting values from the concrete implementations?

Member:

Only for logging purposes in the publish goal, I'd say.

Member:

Yea, this is likely a good place to render something to the Console that lists the things that were published.

Contributor:

Better is logging imo:

for pkg in packages:
    for artifact in pkg.artifacts:
        msg = []
        if artifact.relpath:
            msg.append(f"Wrote {dist_dir.relpath / artifact.relpath}")
        msg.extend(str(line) for line in artifact.extra_log_lines)
        if msg:
            logger.info("\n".join(msg))

We've been using logging when it describes a side-effect like this.

Member:

By this point the publish will already have happened in the background, so that's the relevant side-effect logging (see other comment). So that's why this felt like a good place for a summary, similar to:

if all_results:
    console.print_stderr("")
for results in all_results:
    if results.skipped:
        sigil = console.sigil_skipped()
        status = "skipped"
    elif results.exit_code == 0:
        sigil = console.sigil_succeeded()
        status = "succeeded"
    else:
        sigil = console.sigil_failed()
        status = "failed"
        exit_code = results.exit_code
    console.print_stderr(f"{sigil} {results.linter_name} {status}.")

return Lint(exit_code)

@kaos (Member) left a comment:

Most excellent draft, so far, in my opinion.

I have left a few comments to think about; however, don't take them for more than my opinion. I am sure that you'll get more feedback from the core maintainers.

default = "${TWINE_PASSWORD}"


class PypiPublishRequest(PublishRequest):

Member:

I think that, rather than re-inventing things for the publish request scenario, we should instead make use of the existing field set mechanisms. Looking at the other core goals, they all manage to employ this across packaging, testing, linting, etc.

That would use the TargetRootsToFieldSetsRequest and friends.
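
For illustration only, here is a rough sketch of how the publish goal could lean on that machinery. Everything below is an assumption modeled on how the other core goals collect their field sets; the exact keyword arguments of TargetRootsToFieldSetsRequest in particular may differ:

```python
from pants.engine.goal import Goal, GoalSubsystem
from pants.engine.rules import Get, collect_rules, goal_rule
from pants.engine.target import (
    FieldSet,
    NoApplicableTargetsBehavior,
    TargetRootsToFieldSets,
    TargetRootsToFieldSetsRequest,
)
from pants.engine.unions import union


@union
class PublishFieldSet(FieldSet):
    """Hypothetical base class; each concrete publisher registers a subclass."""


class PublishSubsystem(GoalSubsystem):
    name = "publish"
    help = "Publish packages to their configured repositories."


class Publish(Goal):
    subsystem_cls = PublishSubsystem


@goal_rule
async def publish() -> Publish:
    # Map the requested target roots onto every applicable PublishFieldSet.
    targets_to_field_sets = await Get(
        TargetRootsToFieldSets,
        TargetRootsToFieldSetsRequest(
            PublishFieldSet,
            goal_description="the `publish` goal",
            no_applicable_targets_behavior=NoApplicableTargetsBehavior.error,
        ),
    )
    for field_set in targets_to_field_sets.field_sets:
        ...  # hand each field set to the matching publisher via a union Get
    return Publish(exit_code=0)


def rules():
    return collect_rules()
```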

makes sense?

Contributor Author (@xlevus), Sep 9, 2021:

The approach makes sense, but the feasibility of the implementation not so much...

I went with this approach to support publishing to multiple different repositories for any given type.

My understanding is a FieldSet is just a subset of the fields of a target. But we're not just operating on a subset of those fields.

e.g. this example: https://github.com/xlevus/pants2-plugins/blob/publish/examples/pypi_repository/BUILD
The example_package has two publish_targets: one is a pypi repository, the other an S3 bucket.

So we end up with two calls to Get(PublishedPackage,...) roughly like:

Get(PublishedPackage, PublishRequest, PyPiPublishRequest(publish_target=":pypi-repo", fieldset=PypiFieldSet()))
Get(PublishedPackage, PublishRequest, S3PublishRequest(publish_target=":s3-bucket", fieldset=S3Bucket()))

So (as far as my understanding goes), using TargetRootsToFieldSetsRequest, there's still going to need to be a union'd class to carry the [FieldSet, PublishTarget] pair (sketched after the steps below).

It gets a bit weird, as I predicted that a publish-target (repository) might need to know more than just the artifacts of the build (e.g. a docker push needs to know the image and version), so there's a bit of ping-pong:

  1. We find all the targets that have the minimum publishable fields
  2. We find the different publish-targets for the publishee
  3. We then determine what fields each of those targets need
  4. We check that the publishee has the fields that the publish-target wants.
  5. we do the thing.
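
For illustration, a minimal sketch of the kind of union'd "pair" class mentioned above, loosely based on the names already used in this draft (PublishRequest, PyPiPublishRequest); this is a sketch, not the actual implementation:

```python
from dataclasses import dataclass

from pants.engine.target import FieldSet, Target
from pants.engine.unions import union


@union
@dataclass(frozen=True)
class PublishRequest:
    """Pairs the publishee's field set with the publish-target it should go to."""

    field_set: FieldSet
    publish_target: Target


class PyPiPublishRequest(PublishRequest):
    """Concrete request type that the pypi/twine rules would handle."""
```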

Member:

Pretty close, but I think a few of those steps are handled by the FieldSet logic for us.
I've been playing with a POC locally to try out my idea, and what I think I've landed on is that it will be easier if each repository/registry has its own field type, registered on each target type that it supports publishing for.

Say that the pypi publisher registers a pypi_repositories field for python_distribution targets, and the s3 publisher registers an s3_buckets field for python_distribution, archive and files, for instance. This has the added benefit that it will be very clear which kinds of repositories are supported for which kinds of targets.

Then it'll be straightforward to get the field sets for all targets to publish, based on targets that have any of those publish fields in the BUILD file. Some of this reasoning is inspired by the lint goal, and how it constructs the set of LintRequests.
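
As a rough illustration of that idea (the field name and target type below are only examples; register_plugin_field is the existing hook for attaching a custom field to an existing target type):

```python
from pants.backend.python.target_types import PythonDistribution
from pants.engine.target import StringSequenceField


class PyPiRepositoriesField(StringSequenceField):
    """Addresses of the pypi repositories this distribution should be published to."""

    alias = "pypi_repositories"


def rules():
    # Makes `pypi_repositories=[...]` available on python_distribution targets in BUILD files.
    return [PythonDistribution.register_plugin_field(PyPiRepositoriesField)]
```

The s3 publisher would register its own field in the same way, against whichever target types it supports.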

Contributor Author (@xlevus):

My reasoning for implementing it this way was to make things in a sense 'just work'.

If developer A writes a plugin that allows them to publish a file to some form of file store, and developer B writes a plugin that bundles some files together, both should work together for the end user.

Having a specific field type for each type of publishable target would mean that either A would need to know about B to add support, or the end user would have to write their own plugin to shim A and B together.

With this implementation, A just needs to ensure they do some sanity-checking on input, and B just needs to ensure their target has a publish_targets field.

Yes, there are cases where things are not compatible, e.g. uploading a docker image (a 'fileless' artifact) to a file store, or a tarball to a pypi repository. But to me, it does not seem too egregious to just error with an "I can't do that".
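
For contrast, a very rough sketch of the shared-field approach described here; all names are illustrative, not the draft's actual code:

```python
from typing import Optional

from pants.engine.target import InvalidFieldException, SpecialCasedDependencies


class PublishTargetsField(SpecialCasedDependencies):
    """Addresses of the repositories/registries to publish this target to."""

    alias = "publish_targets"


def check_can_publish(artifact_relpath: Optional[str]) -> None:
    # Each publisher does its own sanity-checking and errors clearly when the
    # combination doesn't make sense (e.g. a fileless artifact sent to a file store).
    if artifact_relpath is None:
        raise InvalidFieldException("This publisher can only upload file artifacts.")
```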

Member:

@Eric-Arellano: Any thoughts here?

@@ -1,10 +1,14 @@
 # Copyright 2021 Pants project contributors (see CONTRIBUTORS.md).
 # Licensed under the Apache License, Version 2.0 (see LICENSE).

-from pants.backend.experimental.python import user_lockfiles
+from pants.backend.experimental.python import publish_pypi, user_lockfiles

Member:

I think that it would be better to separate the rules for publishing to pypi from the tool used to do it (twine). That way, there can potentially be multiple tools to choose from that all publish to a pypi registry, and it will be up to the user to decide which one to enable.

Contributor Author (@xlevus), Sep 9, 2021:

Would the pypi_repository target stay as a 'core' type, with the twine subsystem being a user-configurable backend, or does pypi_repository become twine_repository and sit in its own backend?

My counter argument:

> There should be one-- and preferably only one --obvious way to do it.

Supporting two tools to publish to pypi just sounds like twice as much code to maintain to achieve the same goal.

Member (@kaos), Sep 9, 2021:

Yes, not saying that there'd necessarily be more than one supported by core pants. Not entirely sure how best to structure this, though.

Member:

> I think that it would be better to separate the rules for publishing to pypi from the tool used to do it (twine). That way, there can potentially be multiple tools to choose from that all publish to a pypi registry, and it will be up to the user to decide which one to enable.

I'm less sure about this... since the destination (PyPI) has a particular API (auth requirements, etc.), various tools to publish to it are likely to be fairly similar in terms of their arguments. Seems like maybe premature generalization.

Member:

Just consider: if I'd rather use, say, poetry to publish my dist instead of twine, how would that work (for reasons like using fewer tools/less configuration, etc.)?
I may have to provide my own plugin to support using poetry for this, but I'd want to be able to do so without the default twine implementation tripping me up.

Contributor Author (@xlevus):

You'd have to define a poetry_pypi_repository target.

@@ -16,6 +16,7 @@

name_value_re = re.compile(r"([A-Za-z_]\w*)=(.*)")
shorthand_re = re.compile(r"([A-Za-z_]\w*)")
interpolation_re = re.compile(r"^\$\{([A-Za-z_]{1,}[A-Za-z0-9]{0,})\}$")

Member:

I find it confusing that ${VAR} is expanded but $VAR is not. Also, I would expect "embedded $VAR here" to expand the VAR in the middle of the string.
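
For illustration, a quick check of how the interpolation_re pattern above behaves; only values that are exactly ${VAR} match, which is the behaviour being described:

```python
import re

interpolation_re = re.compile(r"^\$\{([A-Za-z_]{1,}[A-Za-z0-9]{0,})\}$")

print(interpolation_re.match("${TWINE_PASSWORD}"))     # matches; group(1) == "TWINE_PASSWORD"
print(interpolation_re.match("$TWINE_PASSWORD"))       # None: bare $VAR is not expanded
print(interpolation_re.match("embedded ${VAR} here"))  # None: the pattern is anchored to the full string
```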

With that said, I'm not too sure that this is what we want, or need, here. The Pants way to do it is to add the required values as options on the tool used.

Say you have a twine tool. Along with that are options for specifying which version to use (if not the default) and possibly overriding the download url, if needed.
That will also be the right place to add twine-specific configuration, such as username, password, and config files for twine, to set up repository configurations. Those options may pick values from ENV vars as needed.

Member:

Ps.
I realize this comes from my own suggestion earlier over in your own repo. Didn't think of the tooling options then, apologies :)
Ds.

Contributor Author (@xlevus):

> I find it confusing that ${VAR} is expanded but $VAR is not. Also, I would expect "embedded $VAR here" to expand the VAR in the middle of the string.

This is mostly just a result of my idioms, as I'm used to predominantly using the ${foo} syntax in Dockerfile and docker-compose.yml files.

It also has the convenient side-effect of supporting non-variable values that start with a $ symbol. It still doesn't cover ${Not_A_Variable,This_Is_a_password}, but I think the odds of that are significantly smaller.

> With that said, I'm not too sure that this is what we want, or need, here. The Pants way to do it is to add the required values as options on the tool used.
> Say you have a twine tool. Along with that are options for specifying which version to use (if not the default) and possibly overriding the download url, if needed.
> That will also be the right place to add twine-specific configuration, such as username, password, and config files for twine, to set up repository configurations. Those options may pick values from ENV vars as needed.

The username and password aren't configuration values for Twine, but rather configuration values for the repository. In my use-case, I have two destinations for python packages: one for internal tooling, and the other a "public" repository for clients. Both have different credentials.

Ideally, I wouldn't implement it specifically here, but rather have it handled by something that resolves #12797. But my familiarity with Pants only goes so far 😉

Member (@kaos), Sep 9, 2021:

Usually, there's an escape for $ itself for disambiguation. That makes it clear what is, and what is not, a variable for expansion.

However, regarding twine, you can provide a configuration file with repositories, and those in turn support environment variables: https://twine.readthedocs.io/en/latest/#environment-variables

The question then just becomes which environment variables to pass on when running twine, but I think that will be a cleaner approach.
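
For illustration, a sketch of forwarding a defined set of twine's documented environment variables to the process; the variable names come from the twine docs linked above, while the rule shape and EnvironmentRequest usage are my assumption:

```python
from pants.engine.environment import Environment, EnvironmentRequest
from pants.engine.rules import Get, rule

# Variables documented by twine; requested explicitly so they reach the sandboxed process.
TWINE_ENV_VARS = ("TWINE_USERNAME", "TWINE_PASSWORD", "TWINE_REPOSITORY_URL")


class TwineEnvironment(Environment):
    """Hypothetical marker type for the env vars forwarded to the twine process."""


@rule
async def twine_environment() -> TwineEnvironment:
    env = await Get(Environment, EnvironmentRequest(TWINE_ENV_VARS))
    # Pass this as `extra_env` when constructing the twine process.
    return TwineEnvironment(env)
```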

Edit: I read that documentation a bit too quickly; the environment variables are used instead of the config file. Sigh. Will reconsider my reply.

Member (@kaos), Sep 9, 2021:

Perhaps it is an OK limitation to not expect publishing to more than a single pypi repo at a time (when relying on env vars)?
In which case the env variables to use are a defined set at all times.

If using a config file, and that file can also hold the required credentials, then there's no such limitation.

Comment on lines 119 to 131
process = await Get(
    Process,
    VenvPexProcess,
    VenvPexProcess(
        twine_pex,
        argv=["upload", *call_args.args, *paths],
        input_digest=request.built_package.digest,
        extra_env=call_args.env,
        description=f"Publishing {', '.join(paths)} to {request.publish_target.address}.",
    ),
)

await Get(ProcessResult, Process, process)

Member (@stuhood), Sep 9, 2021:

This is side-effecting, which makes it a vaguely sketchy thing to do in a @rule rather than at the top level in the @goal_rule if there is any chance that the Process is not idempotent. For example: this process invoke will be cached (unless you pass a non-default CacheScope for the *Process), and will not re-run after succeeding. By default, failed processes are not cached, so that part should be fine. But it's also the case that while executing a cache lookup the N+1th time this process runs, we might retry the process and then cancel it.

So. An important question for this Goal's API is: do you think that it is a reasonable requirement that all publishing commands/destinations MUST be idempotent? If so, then I think that the current API works. If not so, then rather than actually executing the publish processes in @rules, you would want to return them as a list of publish InteractiveProcesses to have the @goal_rule execute synchronously in the foreground.

In my experience, idempotent publish commands are generally a good idea, but it's something to be intentional about here.
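
For reference, a rough sketch of the second option (returning the publish commands for the @goal_rule to run in the foreground); InteractiveProcess/InteractiveRunner are the engine types being referred to, but the surrounding shape here is assumed:

```python
from typing import List

from pants.engine.process import InteractiveProcess, InteractiveRunner


def run_publishes(
    interactive_runner: InteractiveRunner, publish_processes: List[InteractiveProcess]
) -> int:
    # Called from the @goal_rule: each concrete publisher has *returned* an
    # InteractiveProcess describing its upload command rather than executing it,
    # so the side-effecting run is never cached or speculatively retried.
    exit_code = 0
    for proc in publish_processes:
        result = interactive_runner.run(proc)
        if result.exit_code != 0:
            exit_code = result.exit_code
    return exit_code
```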

Contributor Author (@xlevus):

Little unsure on how to proceed with this.

Twine isn't idempotent; re-running the upload with the same inputs will result in an error.

So, I'm guessing I need to move the publish call from the @rule to the @goal_rule. Presumably by just returning a (boxed?) InteractiveProcess for the goal to execute.

But what about systems that do not necessarily use a process? E.g. the docker subsystem could in theory also use the docker library.

Contributor Author (@xlevus):

I've moved the execution of the process to the @goal_rule.

Member:

> Twine isn't idempotent; re-running the upload with the same inputs will result in an error.

True by default, although it can be configured to not fail for existing destinations. @benjyw, @jsirois: thoughts on this one?

> But what about systems that do not necessarily use a process? E.g. the docker subsystem could in theory also use the docker library.

I think that even if some of the implementations (like docker) are idempotent, it's still ok to execute them in the foreground. So this makes sense I think.

Contributor:

Twine fails on re-upload for a reason; I'm not sure we want to elide that away unconditionally.

@stuhood (Member) commented Sep 9, 2021:

Thanks a lot, this looks great! Kicked off CI.

@stuhood (Member) commented Sep 10, 2021:

@xlevus : When this is ready for review, please take it out of Draft: thanks!

@kaos (Member) commented Nov 12, 2021:

Thanks for kicking this off the ground, @xlevus
Closing this as we have merged #13057.

@kaos closed this Nov 12, 2021