Discuss how to update the "Tool recommendations" page #1468

Closed
jeanas opened this issue Dec 23, 2023 · 23 comments · Fixed by #1474
Labels
component: recommendations type: discussion Discussion of general ideas, design, etc.

Comments

@jeanas
Contributor

jeanas commented Dec 23, 2023

Quoting from https://gregoryszorc.com/blog/2023/10/30/my-user-experience-porting-off-setup.py/:

Outdated Tool Recommendations from the PyPA

Finally I find Tool recommendations in the PyPA User Guide. Under Packaging tool recommendations it says:

  • Use setuptools to define projects.
  • Use build to create Source Distributions and wheels.
  • If you have binary extensions and want to distribute wheels for multiple platforms, use cibuildwheel as part of your CI setup to build distributable wheels.
  • Use twine for uploading distributions to PyPI.

Finally, some canonical documentation from the PyPA that comes out and suggests what to use!

But my relief immediately turns to questioning whether this tooling recommendations documentation is up to date:

  1. If setuptools is recommended, why does the Packaging Python Projects tutorial use Hatch?
  2. How exactly should I be using setuptools to define projects? Is this referring to setuptools as a [build-system] backend? The existence of define seemingly implies using setup.py or setup.cfg to define metadata. But I thought these distutils/setuptools specific mechanisms were deprecated in favor of the more generic pyproject.toml?
  3. Why aren't other tools like Hatch, pip, poetry, flit, and pdm mentioned on this page? Where's the guidance on when to use these alternative tools?
  4. There are footnotes referencing distutils as if it is still a modern practice. No mention that it was removed from the standard library in Python 3.12.
  5. But the build tool is referenced and that tool is relatively new. So the docs have to be somewhat up-to-date, right?

These are all valid concerns.

I'm willing to do the work of updating that page, but I think this is a potentially very delicate/controversial topic, so I figured I'd open an issue to agree on a plan first, so as not to mix this discussion with PR review details.

Some thoughts by tool category:

  • I think we can agree that pip should be recommended. But this could/should also mention pipx for the specific case of CLI tools. I don't think pipx has much competition right now (though it may have in the future).
  • The recommendation of virtualenv/venv is fine. But we could mention that there are many (a great many) tools out there to manage environments more automatically (virtualenvwrapper, Poetry/PDM/Hatch, etc.).
  • The recommendation of pip-tools, Pipenv or poetry for lock files could be expanded to include PDM.
  • I think we should drop buildout and Hashdist, both of these are dead judging from Git commit history, especially Hashdist.
  • I do not think the PyPA should officially recommend setuptools as a build backend. After all, the explicit purpose of PEP 517 was to allow other backends. Here, I think we should simply list the different alternatives without making anything that would look like an official choice, since there is no consensus on such a choice. The backends could be listed in the same order as the tabs in the "Packaging projects" tutorial.
  • The cibuildwheel recommendation is fine. I don't think there are any serious alternatives.
  • I'm not sure about recommending build and twine. Certainly, they're better than python setup.py bdist_wheel and python setup.py upload, but the equivalent Hatch/Poetry/PDM commands are fine too.

By and large, I think we should make fewer recommendations on that page, be more objective, and especially avoid recommending a given tool when there is no consensus among PyPA members that this should be the blessed tool, since the officialness of the page makes users believe that a consensus exists. For better or worse, there is more good advice to be given on what not to use: for example, this could be a good page to remind readers of things like "don't use python setup.py upload", "don't use distutils", and "whichever backend you choose, declare it in the [build-system] table".

One other thing I'm not sure about is the status of Pipenv (how well-maintained is it and are many people using it after all the controversy that there was around it?).

@webknjaz
Member

  • I'm not sure about recommending build and twine. Certainly, they're better than python setup.py bdist_wheel and python setup.py upload, but the equivalent Hatch/Poetry/PDM commands are fine too.

These are purpose-specific tools, related to specific tasks in packaging and so we recommend them. The latter are workflow tools, not necessarily packaging tools, only one of them being under PyPA, some having a track record of serious problems. PyPUG should recommend tools for tasks. Workflow tools can be in a separate category, like managing virtualenvs (next to things like tox/nox).
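
As a rough sketch of the task-specific workflow described here, the two steps can be written as plain command lines, one per tool. The helper names and the distribution file name are hypothetical, and nothing is executed, so this only shows how the commands are composed.

```python
import sys


def build_command(project_dir="."):
    """Command asking `build` to create an sdist and a wheel for project_dir."""
    return [sys.executable, "-m", "build", project_dir]


def upload_command(dist_files):
    """Command asking `twine` to upload the given distribution files."""
    return [sys.executable, "-m", "twine", "upload", *dist_files]


# Nothing runs here. To execute a command, pass the list to
# subprocess.run(command, check=True) in an environment where the tool is installed.
print(build_command())
print(upload_command(["dist/example_pkg-0.1.0.tar.gz"]))  # hypothetical file name
```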

@webknjaz
Member

  • I think we should drop buildout and Hashdist, both of these are dead judging from Git commit history, especially Hashdist.

Yes.

@webknjaz
Member

  • I do not think the PyPA should officially recommend setuptools as a build backend. After all, the explicit purpose of PEP 517 was to allow other backends. Here, I think we should simply list the different alternatives without making anything that would look like an official choice, since there is no consensus on such a choice. The backends could be listed in the same order as the tabs in the "Packaging projects" tutorial.

It's a PyPA project that works. And in some cases, there's no (easily) usable replacements. So I'm -1 on this.

@webknjaz
Member

the officialness of the page makes users believe that a consensus exists.

I think we might need a document "What PyPA is not." explaining that there can't be consensus in some areas, by design.

@webknjaz
Member

One other thing I'm not sure about is the status of Pipenv (how well-maintained is it and are many people using it after all the controversy that there was around it?).

I'm hearing that the new maintainers are better. Same with Poetry — people said that the current maintainers aren't as problematic. But I'm still not a fan of either because of their history and bad experiences.

@jeanas
Contributor Author

jeanas commented Dec 23, 2023

  • I do not think the PyPA should officially recommend setuptools as a build backend. After all, the explicit purpose of PEP 517 was to allow other backends. Here, I think we should simply list the different alternatives without making anything that would look like an official choice, since there is no consensus on such a choice. The backends could be listed in the same order as the tabs in the "Packaging projects" tutorial.

It's a PyPA project that works. And in some cases, there's no (easily) usable replacements. So I'm -1 on this.

TBH, I was expecting this to be the controversial item 😄

I completely admit that I am biased, since I personally find setuptools hard to understand and use.

With that being said, I don't really understand your argument with setuptools being a PyPA project, since Flit and Hatch are also PyPA projects. Since there are three build backends in the PyPA, how do we justify the existence of the other two if PyPA "officially recommends" setuptools?

I also think it's worth remembering that when this recommendation was added long ago, it was a recommendation to use setuptools as opposed to distutils, not as opposed to other build backends which only became possible later, and at that time there was consensus on using setuptools rather than distutils. Today, there is clearly no consensus — you like setuptools, I like Hatch, the Flit maintainers like Flit, ... So, by the evolution of the situation, the page has now become objectively misleading, since it presents setuptools as the official recommendation as if this were the PyPA consensus, whereas there is no PyPA consensus. Quoting @pfmoore on https://discuss.python.org/t/removing-setup-cfg-and-setup-py-from-the-packaging-tutorial/16096/19?u=jeanas: “We’ve worked hard to open up the ecosystem to allow users to have a choice of backends. Promoting and encouraging that choice is part of that work.”

Finally, it's kind of contradictory if the official recommendation is setuptools but the "Packaging projects" tutorial uses Hatchling by default.

To be extra clear, I'm absolutely not advocating that we change this page to recommend a build backend other than setuptools, but that we say something like

“PyPA does not recommend a single build backend for all cases. Popular build backends include Hatchling, setuptools, Flit-core, PDM-backend and Poetry-core. For compiled extension modules, common build backends are meson-python, scikit-build and maturin.”

@sinoroc
Contributor

sinoroc commented Dec 24, 2023

the officialness of the page makes users believe that a consensus exists.

I think we might need a document "What PyPA is not." explaining that there can't be consensus in some areas, by design.

I like this idea.

jeanas added a commit to jeanas/packaging.python.org that referenced this issue Dec 30, 2023
@jeanas
Contributor Author

jeanas commented Dec 30, 2023

I've opened #1474, which implements basically what I said regarding build backends, along with other updates of the same page.

@webknjaz
Member

“PyPA does not recommend a single build backend for all cases. Popular build backends include Hatchling, setuptools, Flit-core, PDM-backend and Poetry-core. For compiled extension modules, common build backends are meson-python, scikit-build and maturin.”

Yeah, I see your points and having read what you wrote and linked earlier got me thinking that it would be useful to have some sort of an admonition/banner at the top of all (many?) pages with this (similar?) explanation.

I remember the PyPA got a lot of blame for using one tool or the other in different guides with people reading it as favoritism. And having more explicit disclaimers would hopefully improve the perception. The individual documents will still use a single tool per document in most cases due to the nature of those text types, of course.

@webknjaz
Member

Also, I have a feeling that there's confusion in some places when we document using tools as build front-ends. This is because many of those are workflow tools that compare more with tox/nox, not pip: they often manage virtualenvs and something that resembles lockfile ideas. They are also arbitrary command runners. And strictly speaking, I don't think something called a packaging guide should focus on those. In this context, I'd mention hatchling over Hatch, for example, and be explicit about their components that implement cross-compatible standards rather than unique inventions that cause vendor lock-in.

@eli-schwartz

If setuptools is recommended, why does the Packaging Python Projects tutorial use Hatch?

The tutorial used Hatch because the tutorial was improved to start using pyproject.toml metadata as defined by PEP 621, and at the time the PR was opened by @henryiii, setuptools didn't support that, but 6 months later when the tutorial was merged, setuptools had supported it for quite some time. I got the impression at the time that it was unofficially chosen because "hatch good", but I could be wrong.

It was the wrong move either way, since flit provides a much better tutorial for getting people started quick-'n-easy which is kind of what a tutorial is for.

To be extra clear, I'm absolutely not advocating that we change this page to recommend a build backend other than setuptools, but that we say something like

“PyPA does not recommend a single build backend for all cases. Popular build backends include Hatchling, setuptools, Flit-core, PDM-backend and Poetry-core. For compiled extension modules, common build backends are meson-python, scikit-build and maturin.”

Setuptools remains an adequate build backend for cases where you have no external dependencies, don't use Fortran, and single-threaded compilation isn't a bottleneck for you. This is a huge percentage of packages.

Maturin is a special-case backend, which is very useful if you use Rust (very nearly the only game in town) and very useless if you don't. To an extent it feels a bit to me like "if you use Rust, you know maturin exists and you also probably wanted to use it anyway". I'm not sure whether it would be more confusing to mention it for something as generic as "compiled extension modules" -- in comparison, meson-python and scikit-build are powered by actual omni-language build systems that can handle pretty much anything you might try throwing at them (meson can even build Rust code, just FYI ;)) and thus make a lot of sense to advertise in that slot.

@eli-schwartz

eli-schwartz commented Jan 3, 2024

Since once again, wording seems to be getting designed with the unstated goal of listing Hatch first on a list and saying it's unbiased, I'd just like to point out that actual unbias typically involves something neutral like "we used alphabetical listing and explicitly state the list is alphabetized, in order to forcibly expunge bias, whether subconscious or otherwise".

@jeanas
Contributor Author

jeanas commented Jan 3, 2024

Since once again, wording seems to be getting designed with the unstated goal of listing Hatch first on a list and saying it's unbiased,

Sorry, but I have to say that I feel hurt by this downright personal accusation.

The whole point of opening this issue and discussing the matter before opening a PR with a proposal was to give notice to anyone who wanted to comment, and avoid any impression that the change had been done under the cover by some random PR. And now you're accusing me of acting on a hidden agenda, which is exactly the kind of accusation that I wanted to prevent.

Yes, I like Hatch, if you want to know. The order in the PR isn't my preference order though (or PDM-backend would be second and setuptools last), it's just the same order as the tutorial because I didn't want to open the question of the order as a sub-debate and that order is just consistent with the tutorial (so any decision should affect the tutorial at the same time).

@eli-schwartz

eli-schwartz commented Jan 3, 2024

Sorry, but I have to say that I feel hurt by this downright personal accusation.

The whole point of opening this issue and discussing the matter before opening a PR with a proposal was to give notice to anyone who wanted to comment, and avoid any impression that the change had been done under the cover by some random PR. And now you're accusing me of acting on a hidden agenda, which is exactly the kind of accusation that I wanted to prevent.

I apologize for giving you that impression. This is partially hasty wording on my part.

it's just the same order as the tutorial because I didn't want to open the question of the order as a sub-debate and that order is just consistent with the tutorial (so any decision should affect the tutorial at the same time).

The problem as I see it is that it's not actually obvious why keeping the order was chosen. Mechanically:

Here, I think we should simply list the different alternatives without making anything that would look like an official choice, since there is no consensus on such a choice. The backends could be listed in the same order as the tabs in the "Packaging projects" tutorial.

By and large, I think we should make fewer recommendations on that page, be more objective, and especially avoid recommending a given tool when there is no consensus among PyPA members that this should be the blessed tool, since the officialness of the page makes users believe that a consensus exists.

As you say, an explicit goal of this discussion is to solve the potential for suspected bias on an official page which looks like an official choice, given that no actual blessing exists.

But then why keep the same order as the tabs in the tutorial? Is it:

  • because the order feels correct to the author?
  • to avoid getting into debate?
  • simple lack of that "aha" moment where one realizes this too is part of the issue?

(For the record, my money was on option 3, not option 1. Either way I was wrong. I'm sorry.)

Transcluding someone else's ordering (explicitly per option 2 or implicitly per option 3) still ultimately has the effect of presenting that ordering as a preference. Either way one is still falling short of the stated objective of fixing an issue whereby tools are perceived as blessed because of the way an official page seems to portray them.

I cannot agree with the decision to avoid opening the question of the order as a sub-debate. I completely 100% understand that it's very tempting to avoid that topic as it feels like the only truly controversial topic. But I think it undermines the entire foundational direction of the proposal, to fail to address the main issue.

The alphabetical ordering approach is very common, and it is people's natural instinct when publicizing information where the order doesn't matter but a decision must be made, ideally one that stays consistent when additional elements are inserted later. It has the benefit of being mostly unbiased, with the caveat that we may theoretically get a wave of new backends all named "aardvark" in order to game the system and get to the top of the list. RFC 2777 describes an algorithm for guaranteeing unimpeachably unbiased random selection by relying on government-run lotteries; this would solve that problem but is also tremendously overkill and I would simply not bother.

I'm genuinely surprised it wasn't originally chosen for the tutorial as well, but such is life. It's a fixable problem.

@pfmoore
Member

pfmoore commented Jan 3, 2024

It was the wrong move either way, since flit provides a much better tutorial for getting people started quick-'n-easy which is kind of what a tutorial is for.

I’m pretty sure that the author of flit explicitly requested that flit not be used as the recommended or default backend.

The alphabetical ordering approach is very common

Given that flit is excluded by request, hatchling comes first in alphabetical order. So that’s ok then, isn’t it? 🙂

More seriously, I don't think any backend should be unilaterally recommended, as they all have issues:

  • Setuptools has too many legacy options, and too much out of date information on the web, to make it a good choice for a beginner.
  • Flit is excluded by request.
  • Poetry, PDM and hatch are too closely tied to their respective workflow tools, making a backend recommendation feel like a workflow tool recommendation (which we definitely shouldn’t be making)
  • Scikit-build, meson-python and maturin are too specialised (most readers will be writing pure python packages).
  • Poetry still uses non-standard syntax in some places (specifiers, I believe).

There's also a question of whether we should consider recommending non-PyPA backends. Personally, I don’t think being a PyPA project is necessary, but some people feel that the whole “PyPA own the tools” idea is important, so picking a non-PyPA tool would involve a whole extra debate (of course not picking a non-PyPA tool could just as easily trigger that 🙁)

If we have to pick (and we do for deciding on the default tab for the examples, at least…) then the best options IMO are setuptools (with a bunch of “look out for out of date information” warnings) or one of the workflow tool backends (with big disclaimers that there’s no implied recommendation of the tool itself). Or see if @takluyver has changed his position on flit as a recommended backend.

@jeanas
Contributor Author

jeanas commented Jan 3, 2024

I apologize for giving you that impression. This is partially hasty wording on my part.

Thank you, apology accepted.

But then why keep the same order as the tabs in the tutorial? Is it:

  • because the order feels correct to the author?
  • to avoid getting into debate?
  • simple lack of that "aha" moment where one realizes this too is part of the issue?

It's a combination of (2) and the fact that I thought the issue had already been discussed and resolved while making the tutorial (i.e., that someone had already gone through the process of getting consensus on this issue, and there was no need to do that again). It looks like I underestimated the controversy on the order of tabs in the tutorial, though :(

Oh well. I'm going to change the PR to list backends alphabetically on the "Tool recommendations" page. The order will be different on the tutorial, but I'm not going to change the tutorial now (and any change there is going to be hard because, as @pfmoore said, having some tab as the default is unavoidable).
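
A minimal sketch of the alphabetical listing, using the backend names from the proposed wording quoted earlier in this thread. The case-insensitive comparison is an assumption of this sketch, chosen so that the lowercase "setuptools" is ordered by its letters rather than by its capitalisation.

```python
# Backend names in the order of the proposed wording quoted earlier in the thread.
BACKENDS = ["Hatchling", "setuptools", "Flit-core", "PDM-backend", "Poetry-core"]


def alphabetical(names):
    """Sort names case-insensitively so capitalisation does not affect the order."""
    return sorted(names, key=str.casefold)


print(alphabetical(BACKENDS))
# ['Flit-core', 'Hatchling', 'PDM-backend', 'Poetry-core', 'setuptools']
```

A backend added later slots into place without reopening the question of order.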

@jeanas
Contributor Author

jeanas commented Jan 3, 2024

(@pfmoore

Poetry still uses non-standard syntax in some places (specifiers, I believe).

Poetry (still) doesn't even use the [project] table, a.k.a. PEP 621.)

@henryiii
Contributor

henryiii commented Jan 3, 2024

The order was a point of long discussion. I even had the idea of randomizing the initial open tab, but that wasn't popular. :) The current order came from:

  1. Speed. Setuptools is way slower than the backends that have less than 40,000 lines of code. A nice snappy experience is attractive to new users. I benchmarked all the backends, you can see the results in the discussions.
  2. Error messages: Along the same line, Setuptools exposes its multilayer massive codebase whenever an error occurs. The other backends have much shorter and cleaner errors, with Hatchling working the hardest in providing clean and readable errors.
  3. Ease-of-use. Flit-core is awful here when it comes to including files, as it is extremely likely to not include files in your SDist (unless you use the flit command line tool, which literally picks up different files than when using standard-based tools, adding to the confusion). Anything past exactly what's in the tutorial and you'll probably have missing files unless you use tool-specific configuration. Because we have to support Flit-core in bootstrapping packages, I wrote the check-sdist tool, literally to deal with Flit-core's file inclusion. Hatchling and PDM-backend are much, much better here, as they have nice defaults for modern software development. Setuptools is in-between, with a more extensive default list.
  4. Scalability. If you start to want more features (VCS version control or other such plugins), the others offer quite a bit more than Flit-core, meaning users are less likely to have to switch backends as they progress. Setuptools is by far the leader here, but everything besides Flit-core supports plugins.
  5. Part of the PyPA: Hatchling, Flit-core, and setuptools are all PyPA projects. I intentionally wanted to include PDM-backend because it's not a PyPA project, but putting a PyPA project first (since we often handle the complaints if something fails) made the most sense. IMO this is the main reason to select Hatchling over PDM-backend, which is also very good.

That's why the order is what it is. I'd say either Hatchling or PDM-backend is a great starting point for modern Python development, with Setuptools & Flit relegated to specialty cases.

I don't think the order in the tutorial should affect any other lists, though. It's very much focused on providing a good first user experience, and not on trying to declare a specific tool "recommended". You just have to have a first tab, and the one that provides the best starting experience is first. Lists are also not limited to these four backends, etc. Alphabetical sounds fine for any lists.
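
The file-inclusion concern in item 3 can be checked by hand. The following is a rough sketch of that kind of check, not the check-sdist implementation: it lists the files inside an sdist archive and reports which expected files are missing. The helper names, the package name, and the in-memory archive standing in for a real sdist are all made up for this example.

```python
import io
import tarfile


def sdist_members(sdist_bytes):
    """Return the file paths inside an sdist, without the top-level directory."""
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:gz") as tar:
        names = [member.name for member in tar.getmembers() if member.isfile()]
    return {name.split("/", 1)[1] for name in names if "/" in name}


def missing_from_sdist(sdist_bytes, expected):
    """Return the expected files that the sdist does not contain, sorted."""
    return sorted(set(expected) - sdist_members(sdist_bytes))


def make_fake_sdist(files):
    """Build a tiny in-memory .tar.gz standing in for a real sdist (demo only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=f"example_pkg-0.1.0/{name}")
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


fake = make_fake_sdist({
    "pyproject.toml": b"",
    "PKG-INFO": b"",
    "src/example_pkg/__init__.py": b"",
})
print(missing_from_sdist(fake, ["pyproject.toml", "tests/test_basic.py", "LICENSE"]))
# ['LICENSE', 'tests/test_basic.py']
```

For a real project, the bytes would come from the `.tar.gz` file that the build produced, and the expected list from whatever the project considers part of its source.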

@abravalheri
Contributor

abravalheri commented Jan 3, 2024

  1. Speed. Setuptools is way slower than the backends that have less than 40,000 lines of code. A nice snappy experience is attractive to new users. I benchmarked all the backends, you can see the results in the discussions.

Hi @henryiii, just a minor comment: the performance of the setuptools.build_meta implementation had not previously been studied properly. That has changed since the last discussion in this repository, and performance has improved quite a bit (though there is still a lot of room for further improvement).

I was always a big fan of the random opened tab proposal. I think it is the fairer model for an umbrella association that has multiple projects with intersections 😝.

@takluyver
Member

Or see if @takluyver has changed his position on flit as a recommended backend.

If there was otherwise a consensus that it's a good default and my objection was the only thing standing in the way, I would probably remove that objection. I doubt there is such a consensus, though, for the reasons @henryiii describes.

I'd like to avoid what happened to pipenv: PyPA appeared to recommend it as the tool for application development, lots of people tried to use it, and got upset with the project when it didn't do what they wanted. I think packaging is quite susceptible to this kind of thing, because we often don't want to think about it, so we look for whatever seems to be the default. And Flit is never going to meet everyone's needs, by design.

Flit-core is awful here when it comes to including files, as it is extremely likely to not include files in your SDist (unless you use the flit command line tool, which literally picks up different files than when using standard-based tools, adding to the confusion

This isn't the place to get into a lengthy discussion, but briefly: yeah, this didn't end up in a great state, due to some unfortunate decisions I made back before PEP 517. I'm trying to get rid of the difference between what flit build does vs. the flit_core API; I should get back to that and move it forwards.

I'm inclined to think that it's better for ease of use to have a simple rule for sdists by default - i.e. only include the files we need to build a wheel - and let people override it, than to guess what files to include based on some heuristics which will inevitably be incomplete.

@pfmoore
Member

pfmoore commented Jan 4, 2024

I'd like to avoid what happened to pipenv: PyPA appeared to recommend it as the tool for application development, lots of people tried to use it, and got upset with the project when it didn't do what they wanted.

Agreed. IMO, this is very much a problem with any "PyPA recommended" tool - if we pick any one tool as the recommended one, then simply by being the recommended tool, it has to be suitable for everyone (otherwise we get complaints that "the recommended tool is no good" and we repeat the pipenv experience, to no-one's benefit). But we cannot dump such a requirement on any tool, as tool authors are independent of the PyPA and have the right to make their own choices about what workflows and user requirements they support.

Unfortunately, there's a strong undercurrent of support in the community for a "one tool to cover everything" solution, and it's hard to fight against that. I feel that the support is based heavily on a mistaken assumption that the "one tool" will always support the individual's preferred workflow, though, which I'd consider optimistic at best.

I'd like to see any "tool recommendations" page be very explicit that there is no one "best" tool and users need to pick the right tool for their particular workflow. While we might choose to show particular tools as they work well for common use cases, users who have their own established workflows will need to look beyond what the guide suggests, if they want to find a solution that doesn't require them to make changes to how they work.

@chrysle chrysle added component: recommendations type: discussion Discussion of general ideas, design, etc. labels Jan 12, 2024
@webknjaz
Member

I'm inclined to think that it's better for ease of use to have a simple rule for sdists by default - i.e. only include the files we need to build a wheel - and let people override it, than to guess what files to include based on some heuristics which will inevitably be incomplete.

That's a fresh idea. I could definitely get behind having a unified shared lib implementing just producing sdists in a standardized manner, reused by the regular PEP 517 backends.

And I also agree with what Paul said, that such a document must have a huge disclaimer about it not having a tool rating.

@webknjaz
Member

Apparently, there's a new wave of "I want you to make an all-in-one workflow tool for everyone but catering to my workflow" on LWN: https://twitter.com/hynek/status/1750519782573834568

webknjaz pushed a commit to jeanas/packaging.python.org that referenced this issue Feb 22, 2024
9 participants