Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow customizing generation of tags #187

Closed
pradyunsg opened this issue Aug 25, 2019 · 25 comments · Fixed by #231
Closed

Allow customizing generation of tags #187

pradyunsg opened this issue Aug 25, 2019 · 25 comments · Fixed by #231

Comments

@pradyunsg
Copy link
Member

Coming from pypa/pip#6908

pip's current get_supported function for generating PEP 425 tags, has the following signature:

def get_supported(
    versions=None,  # type: Optional[List[str]]
    noarch=False,  # type: bool
    platform=None,  # type: Optional[str]
    impl=None,  # type: Optional[str]
    abi=None  # type: Optional[str]
):

It takes details/strings about the specific platform, implementation, ABI and interpreter versions. It also has a flag to avoid generating architecture specific tags.

packaging.tags would need to have feature parity with this interface, to be used within pip.

@brettcannon
Copy link
Member

This is mostly as notes for myself, but:

  • impl + versions is interpreter
  • abi is abi 😄
  • I would assume platform is platform
  • But what is noarch for? And why isn't it a way to specify the CPU architecture instead of a boolean?

@brettcannon brettcannon removed their assignment Sep 11, 2019
@di
Copy link
Member

di commented Sep 27, 2019

@brettcannon I started to attempt this to include in #213 but I don't have enough familiarity with either codebase. Did you have a rough plan here? I see you unassigned yourself, are you not able to implement this? Should I just dig in deeper? @pradyunsg maybe this would make more sense to you?

@brettcannon
Copy link
Member

I just unassigned in case someone beat me to it is all. It's on my todo list so I will definitely attempt to tackle it soon-ish (I'm trying to get through my 'packaging' stuff ATM and so this should bubble up shortly).

@brettcannon
Copy link
Member

It looks like noarch is used to simply skip over any ABI or platforms and only go with the generic none and any, respectively.

@brettcannon
Copy link
Member

I'll also mention that this has to land before our next release as warn has been added as an argument to sys_tags() and as long as we support Python 2.7 we can't control API expansion via keyword-only arguments. So we need to stabilize our API for sys_tags() before the public starts to rely on positional arguments.

@brettcannon
Copy link
Member

Searching through the pip source code via GitHub only turns up the following call site for a place where arguments are specified:

https://github.com/pypa/pip/blob/e6f69fadd8c84526d933b2650141193595059e72/src/pip/_internal/models/target_python.py#L98-L103

That would suggest that noarch isn't necessary.

@brettcannon
Copy link
Member

@pradyunsg @di if sys_tags() grew interpreters, abis, and platforms that all took Iterable[String] to override the system discovery would that be good enough?

The problem I have with get_supported() as-is in pep425tags is it is treating its arguments like hints and then tries to back-fill extra values. That feels brittle to me as it starts to add reliance on the format of what people provide to the code. Plus not everyone may want that back-fill for some reason.

Basically my question is how much must packaging.tags take on in terms of this logic versus having pip manage it for the one place it uses it? If this must be here then I would argue to have separate APIs for attempting to back-fill and keep sys_tags() simple.

@di
Copy link
Member

di commented Oct 4, 2019

@brettcannon I think that's reasonable, and that's also the conclusion I came to when starting to look at this. The only real hitch there is that pep425tags has impl + versions where we have interpreter here. So if @pypa/pip-committers are willing to accept this as an internal API change (along with warn) then I think that's what we should do.

@pradyunsg
Copy link
Member Author

separate APIs for attempting to back-fill and keep sys_tags() simple.

Sounds fair. I feel that this would be a lot easier to reason about on both sides (packaging.tags and pip).

@pradyunsg
Copy link
Member Author

this has to land before our next release

Hmm... With current master, yes.

I'm OK to add a docstring that says warn should be specified via keyword only and set the expectation that more arguments may be added, especially if this is a blocking concern IMO. Or even to take **kwargs and handle it for these optional arguments (we're not directly using the call-signature via autodoc for Sphinx, so we can use the keyword only notation there).

That said, it seems like we might be going down the path of creating a separate API for pip's "flexible" way of tag generation which makes this suggestion kinda moot. 🙃

@brettcannon
Copy link
Member

I think it depends on what we want sys_tags() to be and what overall API we want to provide. Maybe it would be better to expose more of the private functions that generate things like the interpreter, ABI, and platform tags and then allow folks to construct the ordered list themselves? That would leave sys_tags() as a convenience function for those who aren't calculating wheels for a different installation than the one they are on.

And as for the comment from @di , I was thinking about this last night and came to a similar conclusion that python version would probably be necessary as well.

And doing the old **kwargs trick so we don't have to hold up the release or to start with keyword-only for warn is fine by me.

@brettcannon
Copy link
Member

More notes...

According to python -m pip help install there are:

  • --platform
  • --python-version
  • --implementation
  • --abi
    for controlling the 3 parts of the tags (with --python-version and --implementation working together to form the interpreter part of the tag set). Now one key observation is all of these take a single argument. Another is that --implementation is version-independent, e.g. it takes cp and not cp38.

I think if we expose the *_tags() functions and make those take the appropriate optional arguments then that should give pip the flexibility to specify everything it wants manually while letting us not have to over-expose internal details too much. It also changes sys_tags() into basically a dispatcher to the other *_tags() functions much like it is now. There will still need to be some glue code in pip, but it should be rather minimal and straight-forward (e.g. constructing the interpreter from Python version and implementation).

@brettcannon
Copy link
Member

FYI I started the work to expose the *_tags() functions and I think it will work out for pip. Still have to code up the tests and write the documentation.

@chrahunt
Copy link
Member

chrahunt commented Nov 8, 2019

@brettcannon if you want to push what you have I can try it out on the pip-side.

@brettcannon
Copy link
Member

@chrahunt sure, that would be great! My branch is at https://github.com/brettcannon/packaging/tree/expose-tags and I have exposed:

  1. cpython_tags()
  2. pypy_tags()
  3. generic_tags()
  4. compatible_tags()

If you look at sys_tags() (both as it is in master and in the branch) it should hopefully be clear how to call the functions and what they do. I've started writing tests but they are not complete yet so no promises there aren't bugs. 😁

@chrahunt
Copy link
Member

chrahunt commented Nov 9, 2019

There are several places where there are differences so far:

  1. the cpython_tags result includes abi3 for 2.7 to 2.2 if python_version=(2, 7) is passed
  2. in pep425tags (here) we're calculating abi3 dynamically. It looks like this is just trying to be future-proof to an abi4 or cater to implementations that could offer their own "stable" ABI, since this can only ever match abi3 in CPython. Internally we just need to decide if switching to only supporting "abi3" as a stable ABI is OK (and whether this behavior was documented enough to require deprecation), but any thoughts that could help direct that discussion would help.
  3. in get_supported (here), we backfill supported macosx versions based on the user input. We may be able to use _mac_platforms (if exposed), but would still have to parse the incoming version in get_supported itself.
  4. When I compare the output of a bare call to sys_tags() against get_supported() (on Linux x86-64, Python 3.8), I see some differences
Output
>>> from pip._internal.pep425tags import get_supported
>>> from packaging.tags import sys_tags
>>> import difflib
>>> print('\n'.join(difflib.unified_diff(['-'.join(e) for e in get_supported()], [str(e) for e in sys_tags()])))
---

+++

@@ -34,12 +34,47 @@

 cp32-abi3-manylinux2010_x86_64
 cp32-abi3-manylinux1_x86_64
 cp32-abi3-linux_x86_64
+py38-none-manylinux2014_x86_64
+py38-none-manylinux2010_x86_64
+py38-none-manylinux1_x86_64
+py38-none-linux_x86_64
 py3-none-manylinux2014_x86_64
 py3-none-manylinux2010_x86_64
 py3-none-manylinux1_x86_64
 py3-none-linux_x86_64
-cp38-none-any
-cp3-none-any
+py37-none-manylinux2014_x86_64
+py37-none-manylinux2010_x86_64
+py37-none-manylinux1_x86_64
+py37-none-linux_x86_64
+py36-none-manylinux2014_x86_64
+py36-none-manylinux2010_x86_64
+py36-none-manylinux1_x86_64
+py36-none-linux_x86_64
+py35-none-manylinux2014_x86_64
+py35-none-manylinux2010_x86_64
+py35-none-manylinux1_x86_64
+py35-none-linux_x86_64
+py34-none-manylinux2014_x86_64
+py34-none-manylinux2010_x86_64
+py34-none-manylinux1_x86_64
+py34-none-linux_x86_64
+py33-none-manylinux2014_x86_64
+py33-none-manylinux2010_x86_64
+py33-none-manylinux1_x86_64
+py33-none-linux_x86_64
+py32-none-manylinux2014_x86_64
+py32-none-manylinux2010_x86_64
+py32-none-manylinux1_x86_64
+py32-none-linux_x86_64
+py31-none-manylinux2014_x86_64
+py31-none-manylinux2010_x86_64
+py31-none-manylinux1_x86_64
+py31-none-linux_x86_64
+py30-none-manylinux2014_x86_64
+py30-none-manylinux2010_x86_64
+py30-none-manylinux1_x86_64
+py30-none-linux_x86_64
 py38-none-any
 py3-none-any
 py37-none-any

The extra entries make sense to me and are probably OK, but we should add the missing cpXY-none-any, cpX-none-any.

@brettcannon
Copy link
Member

(1) is a bug, (2) is a maybe, (3) I really don't want to get into the business of trying to parse individual tags. And the output difference is very much on purpose and was heavily researched and discussed when we started this work (distutils-sig will have the archives).

The key question for me is whether the API is where pip needs it to be?

@chrahunt
Copy link
Member

chrahunt commented Nov 9, 2019

Some parts are unresolved, but this should be the full set of conditions required to answer that question:

  • if pip maintains support for the platform declaring multiple stable ABIs (from here), then no - since the current cpython_tags interface is not sufficient for generating well-ordered tags for CPython if we expect multiple stable ABIs
  • if pip maintains support for non-CPython implementations declaring stable ABIs (from here), then no - since the ordering of stable-ABI-using tags is only accounted for in cpython_tags, which hard-codes "cp"
  • if we do not want to maintain macOS compatibility information in two code bases (here and in packaging.tags), then no - since _mac_platforms is not part of the public API
  • if pip maintains the current behavior of not including tags for lower versions when provided an explicit version, then no - since cpython_tags and compatible_tags always internally constructs tags with a descending version range based off of the python_version argument
  • otherwise yes

Regarding the lack of cpXY-none-any and cpX-none-any, the latest explicit proposed example I could find in the archives is here which includes them. Were these excluded in later discussion?

@brettcannon
Copy link
Member

@chrahunt

  1. Adding support for dynamic ABIs is fine; please file a separate bug as that doesn't impact the API of the module
  2. Is exposing _mac_platforms() as-is enough? I basically do not want to have to parse anything as I view tags as an opaque string that tools are doing straight comparison against, but if you are specifically looking for generating those tags just on macOS then we should be able to do that without feeling too bad about it 😉
  3. Do you want to keep the behaviour of not backfilling the interpreter part? It seems to somewhat contradict backfilling macOS, filling in various ABIs, etc. If you do decide to keep the current semantics could you explain why? (We can probably add a flag like skip_python_backfill or something to turn that off if it's only interpreter details, but I do want to understand the logic behind it at least)
  4. I'm okay reconsidering e.g. cp38-none-any if that's a blocker, but cp3-none-any just makes no sense to me (and I explicitly mention this in my blog post on tags). There is no compatibility story for CPython 3 with no specific ABI IMO that I can think of which isn't hypothetical. But do note that when I made this decision I checked PyPI's BigQuery data when Python 3.6 was the latest release and cp36-none-any had a total of 7 wheels and cp3-none-any had 2.

@brettcannon
Copy link
Member

brettcannon commented Nov 10, 2019

@chrahunt I also just double-checked the code and:

https://github.com/brettcannon/packaging/blob/8f54af00fbb97406af8a7f9b3e6378b39bf1fa99/packaging/tags.py#L330-L331

should emit cp38-none-any if you call compatible_tags() with an interpreter parameter (it's actually the entire reason that function even has an interpreter parameter and it makes me feel icky every time I see it). In your code did you pass that argument in to get that tag triple that you're asking for?

@chrahunt
Copy link
Member

chrahunt commented Nov 10, 2019

Adding support for dynamic ABIs is fine; please file a separate bug as that doesn't impact the API of the module

Filed pypa/pip#7327 to discuss whether it's necessary at all. By "doesn't impact the API of the module" you mean we can always add the keyword argument later if required?

Is exposing _mac_platforms() as-is enough?

Yes

Do you want to keep the behaviour of not backfilling the interpreter part?

Not personally, no. 😁

A closer reading of the documentation only suggests that we use this for compatibility checks, but doesn't say exactly how that happens. If the best approach we know for finding compatible wheels involves including ones with a lower-versioned interpreter part, then it seems reasonable to change the behavior without a deprecation notice IMO. That would also align with our existing behavior of back-filling the applicable macOS versions. @pradyunsg or @pfmoore may also have an opinion.

4. I'm okay reconsidering e.g. cp38-none-any if that's a blocker, but cp3-none-any just makes no sense to me (and I explicitly mention this in my blog post on tags).

I think providing cp38 explicitly to compatible_tags on our side is fine and then it can be removed later if it doesn't actually end up providing more matches than needed.

@brettcannon
Copy link
Member

@chrahunt By "doesn't impact the API of the module" means we can just add code that does the discovery and have it implicitly added as part of the semantics without any API change. So I have no problem doing it, I just don't want it lost as part of this PR which I'm trying to keep focused on the public API of packaging.tags.

I will look at exporting _mac_platforms() as a special case of platform support as part of this PR then!

As for not back-filling the interpreter, I realized yesterday that you could probably do it very simply on the pip side:

def interpreter_filter(interpreters, tag):
    return filter(lambda tag: tag.interpreter in interpreters, tags)

That way you can keep your semantics and we don't have to add specific support for it (unless I'm misunderstanding something).

I think providing cp38 explicitly to compatible_tags on our side is fine and then it can be removed later if it doesn't actually end up providing more matches than needed.

So the trick here with that tag is its placement in the order of tag preferences. Being smack-dab in the middle of the "py" tags is what makes it so darn awkward and the reason compatible_tags() needs an interpreter argument to begin with.

The other way to handle this is to expose _py_interpreter_range(), bring back its end parameter, and then have people construct what compatible_tags() does on their own while dropping that function. But I honestly don't know if that's better as there's then a much bigger chance of people constructing tags differently between tools which would partially negate the key point of moving all of this into 'packaging' in the first place.

So for me, I think if you're okay with keeping cp38-none-any but dropping cp3-none-any we can just keep compatible_tags() as it exists in this PR and just be pragmatic about the fact that tag orders are weird. 😉 (Otherwise adding cp3 is tricky as you then suddenly have to provide the interpreter version which is not the same as the Python version for e.g. PyPy and it adds an entirely new type of argument for just a single tag that isn't really used.)

@chrahunt
Copy link
Member

By "doesn't impact the API of the module" means we can just add code that does the discovery and have it implicitly added as part of the semantics without any API change.

👍

I will look at exporting _mac_platforms() as a special case of platform support as part of this PR then!

👍

I realized yesterday that you could probably do it very simply on the pip side

Ah, yes that's obvious now thank you. 😅

So for me, I think if you're okay with keeping cp38-none-any but dropping cp3-none-any we can just keep compatible_tags() as it exists in this PR and just be pragmatic about the fact that tag orders are weird.

I am, I think your blog post makes a good case for not needing that tag.

So to summarize: the only thinking pending for us is exposing _mac_platforms(), otherwise it looks fine and it should be straightforward to switch when available.

@brettcannon
Copy link
Member

@chrahunt I've opened #231 which includes everything I had before plus exposing mac_platforms() and has all the appropriate documentation and tests.

And I believe we have some issues to open:

  • Don't include abi3 when using a version of Python below 3.2 (e.g. 2.7)
  • Dynamically detect abi3

Am I missing anything?

@chrahunt
Copy link
Member

Nope, I think that's it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants