Regression in namespace package metadata on poetry-core 1.1.0rc1 #6198

Closed
3 tasks done
ariebovenberg opened this issue Aug 19, 2022 · 7 comments
Labels
kind/bug Something isn't working as expected

Comments

@ariebovenberg
Contributor

ariebovenberg commented Aug 19, 2022

  • I am on the latest Poetry version. n/a
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option). n/a
  • OS version and name: MacOS 12.5.1
  • Poetry version: poetry-core 1.1.0rc1
  • Link of a Gist with the contents of your pyproject.toml file: see below

Issue

A change in poetry-core between 1.1.0a7 and 1.1.0b1 introduced a bug in the metadata name of namespace packages:

>>> from importlib import metadata
>>> metadata.version('mynamespace.foomodule')
importlib.metadata.PackageNotFoundError: mynamespace.foomodule

The package is instead registered as mynamespace-foomodule. This is a particular problem since importlib.metadata is the canonical way to single-source __version__.

This bug occurs on Python 3.8 and 3.9 (but, interestingly enough, not on 3.10). The change that introduced the bug: python-poetry/poetry-core#328. Ever since this change, . is normalized to - in package names, which is incorrect for namespace packages.
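
For readers unfamiliar with the normalization mentioned above, here is a minimal sketch of it, assuming it follows the PEP 503 rule (runs of `-`, `_` and `.` collapse to a single `-`, and the result is lowercased). The function name `canonicalize` is illustrative and is not poetry-core's actual code.

```python
import re


def canonicalize(name: str) -> str:
    """Normalize a project name following the PEP 503 rule (assumed here)."""
    # Any run of '-', '_' or '.' becomes one '-', then the name is lowercased.
    return re.sub(r"[-_.]+", "-", name).lower()


# The name declared in pyproject.toml versus the name the thread reports
# ends up in the installed metadata.
print(canonicalize("mynamespace.foomodule"))
```

Under this rule the dotted and dashed spellings are the same project, which is why a lookup that compares names literally can miss the installed distribution.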

To reproduce

pyproject.toml

[tool.poetry]
name = "mynamespace.foomodule"
version = "0.1.0"
description = ""
authors = ["Your Name <[email protected]>"]
readme = "README.md"
packages = [
    { include = "mynamespace"},
]

[tool.poetry.dependencies]
python = "^3.8"

[build-system]
requires = ["poetry-core==1.1.0rc1"]
build-backend = "poetry.core.masonry.api"

files

mynamespace/
├── __init__.py  # contains: __path__ = __import__('pkgutil').extend_path(__path__, __name__)
└── foomodule
    └── __init__.py  # empty

Then, run:

# make sure you're on python 3.8 or 3.9
pip install .
# this raises an error, but shouldn't
python -c "from importlib import metadata; metadata.version('mynamespace.foomodule')"

What to do?

I don't mind submitting a fix in poetry-core, but I'm of course missing a lot of context on why the change was made in the first place.

ariebovenberg added the kind/bug and status/triage labels Aug 19, 2022
ariebovenberg changed the title from "Regression in namespace package metadata poetry-core 1.1.0rc1" to "Regression in namespace package metadata on poetry-core 1.1.0rc1" Aug 19, 2022
@dimbleby
Contributor

Would be interested in a link to the python 3.10 MR / discussion around the changed behaviour, if you can dig it out.

My insta-take is that poetry is not wrong here: would be interesting to see whether that upstream discussion is framed as "some callers do it wrong but we can help" or as "oops, we should work with canonicalized package names". I'm hoping for the latter.

Meanwhile, can you use canonicalized package names all along? The examples at https://packaging.python.org/en/latest/guides/packaging-namespace-packages/ use mynamespace-subpackage-a (not mynamespace.subpackage.a), suggesting that this was always the preferred approach.

@ariebovenberg
Contributor Author

ariebovenberg commented Aug 20, 2022

@dimbleby thanks for the info. Digging into the background, I now understand the reasoning behind canonicalization, and I can understand heading in that direction.

Good find in the namespace packaging docs! Using name = "mynamespace-foomodule" in pyproject.toml bypasses the entire issue.

@PeterJCLaw

@dimbleby given the frequency with which this issue has appeared and the fact that it ends up being breaking (or very nearly so) for a number of use cases, would you consider adding a warning from poetry (ideally with a link to an explanation and how to fix it) when it does this canonicalisation? Alternatively, I would encourage poetry to reject these names outright rather than silently change them in ways that aren't always compatible with previous versions of the same package.

It feels like either of those would ensure that the issue is visible when the offending package is built, rather than only once it has already been published and users are trying to use it.

A major version bump for this arguably breaking change might have been worth considering too (though I realise there's not much can be done about that now).


Aside: In my case we actually had issues uploading it to PyPI which seemed to stem from the source distribution keeping the dot, while the wheel didn't (though we weren't able to reproduce that failure mode); then confusion when PyPI had seemed to change the name of the project unexpectedly. In the end we pinned an older version of poetry and yanked the version of the package with the adjusted name in order to avoid breakage for now. I realise that's not a permanent solution of course. poetry also seemed to be using underscores rather than dashes in the artefact names, which seems to disagree with canonicalisation to dashes and further confused us.
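
The underscores in artefact names are likely a separate convention from the dashes in canonical names. The sketch below assumes the escaping described in the binary distribution (wheel) format specification, where the normalized name has `-` replaced with `_` so that `-` can separate the filename fields. The helper names are illustrative and this is not Poetry's implementation.

```python
import re


def canonical_name(name: str) -> str:
    """PEP 503 style normalization, used for metadata and index lookups."""
    return re.sub(r"[-_.]+", "-", name).lower()


def wheel_name_component(name: str) -> str:
    """Name as it would appear in a wheel filename under the assumed escaping."""
    # '-' is reserved as the field separator in wheel filenames,
    # so the canonical name's dashes are swapped for underscores.
    return canonical_name(name).replace("-", "_")


filename = f"{wheel_name_component('mynamespace.foomodule')}-0.1.0-py3-none-any.whl"
print(filename)
```

Under that assumption the same project shows up with a dash in its metadata and an underscore in its wheel filename.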

@dimbleby
Contributor

Doubtful. If you want to offer an MR then I expect it would get proper consideration, but I've no interest in doing this myself.

If you want to be able to upload packages to pypi with dots in their names, please go encourage pypi/warehouse#10030. If and when that is resolved at pypi, it will be straightforward for poetry to follow.

@PeterJCLaw

> If you want to be able to upload packages to pypi with dots in their names, please go encourage pypi/warehouse#10030. If and when that is resolved at pypi, it will be straightforward for poetry to follow.

For clarity: part of what causes confusion for me on this is that PyPI demonstrably does accept distribution files with dots in them (e.g. sr.robot3) when the package name also contains dots.

I (now) realise there are PEPs which suggest that package names shouldn't contain dots (though the discussion on the practicalities of that and where those PEPs apply seems far from resolved), however that doesn't change the reality that dots are being used in the wild and this change was breaking for a number of projects.

The fact that PyPI doesn't accept dotted package names with underscored file names seems like a separate issue to me.

I'll have a look at putting together some changes; does poetry have a warnings framework I should be looking to use? From a quick look I can see there's use of the standard library logging module which seems to line up with the printed output (plus some colourisation I've not found), is that the way to go here?

@neersighted
Member

The issue is that PyPI accepts dotted names, but not in a consistent way. We can't programmatically normalize a name to be valid for PyPI and allow dots/capitalization, and we don't want to leak PyPI's idiosyncratic validation into Poetry code. See #1202 for more of this.

Certainly, you're welcome to add a warning when names do get normalized and may surprise the user when they go to reference them -- that being said, I don't think Poetry should force people to use normalized names in their pyproject.toml -- it's just that the artifacts we produce need to be normalized as this causes fewer problems than not normalizing as we did before (there were a lot of ugly bugs around this).
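
A warning of the kind discussed here could look roughly like the sketch below. It uses the standard library `warnings` module only for illustration and does not reflect how Poetry reports messages. `warn_if_normalized` is a hypothetical helper, and the normalization is assumed to be the PEP 503 rule.

```python
import re
import warnings


def canonicalize(name: str) -> str:
    """PEP 503 style normalization (assumed to match what the build applies)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def warn_if_normalized(declared_name: str) -> str:
    """Return the canonical name, warning when it differs from the declared one."""
    canonical = canonicalize(declared_name)
    if canonical != declared_name:
        warnings.warn(
            f"Project name {declared_name!r} will be written to the built "
            f"artifacts as {canonical!r}; lookups by the declared name may "
            "fail on Python 3.8 and 3.9.",
            UserWarning,
            stacklevel=2,
        )
    return canonical


# The pyproject.toml name is left as the user wrote it; only the artifact
# name changes, and the user is told about the difference at build time.
artifact_name = warn_if_normalized("mynamespace.foomodule")
```

This keeps the declared name untouched in pyproject.toml, as suggested above, while surfacing the difference when the package is built rather than after it is published.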

PeterJCLaw added a commit to PeterJCLaw/poetry that referenced this issue Mar 5, 2023
The handling of package names within the Python packaging ecosystem
is unfortunately not in a great state at the moment. There exist
a large number of packages which contain characters which (some of)
the specs indicate are invalid, yet there is not a clear migration
path nor destination at the moment.

Poetry's approach is to normalise project names towards their
"canonical" form, meaning that existing projects on PyPI which
use the more relaxed name forms will be silently renamed if they
move to (recent versions of) Poetry. As well as changing the name
of these packages (which maintainers are unlikely to expect nor
desire) this unfortunately ends up breaking introspection via
`importlib.metadata` in Python 3.8 and 3.9 which are not aware of
the canonicalisation rules which Poetry is using.

Adding this compatibility note informs maintainers so that they can
decide how (and if) they want this name canonicalisation to happen.

See also python-poetry#6198

github-actions bot commented Mar 1, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2024