PIP dependencies for Python plugins #202

Open
olivierdalang opened this issue Nov 25, 2020 · 42 comments
@olivierdalang

olivierdalang commented Nov 25, 2020

QGIS Enhancement: PIP dependencies for Python plugins


⚠️ Only part 1. of this QEP is concerned by 2021 grants ⚠️ (see below)


Date 2020/11/25

Author Olivier Dalang (@olivierdalang)

Contact [email protected]

maintainer @olivierdalang

Version QGIS 3.18

Summary

Add an optional pip_dependencies config in metadata.txt for plugins to be able to define pip dependencies, which are installed upon plugin installation into a virtual environment located in the user's profile directory.

Proposed Solution

A virtual environment (see below) will always be created when creating a user profile (including default). It will be activated when starting QGIS.

A new pip_dependencies config in metadata.txt will be added. Syntax will match the syntax of the plugin_dependencies config (PIP-like comma separated list of libraries to install).

Before installation, requirements will be checked using pkg_resources.parse_requirements and pkg_resources.require. In case of missing dependencies, user will be informed that additional downloads will occur. In case of conflicting dependencies, user will be warned but will still be able to proceed.
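As a rough illustration of the pre-installation check described above, the sketch below sorts requirements into satisfied, missing and conflicting. It is not the QEP's implementation: it swaps `pkg_resources` for the standard library's `importlib.metadata`, only understands bare names and exact `==` pins, and the names `classify` and `lookup` are hypothetical.

```python
from importlib import metadata


def _installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def classify(requirements, lookup=_installed_version):
    """Sort requirements into satisfied / missing / conflicting.

    Assumption: only bare names ("pytz") and exact pins ("requests==2.24.0")
    are handled; range specifiers would need a real requirement parser.
    """
    report = {"satisfied": [], "missing": [], "conflicting": []}
    for requirement in requirements:
        name, _, pinned = requirement.partition("==")
        found = lookup(name.strip())
        if found is None:
            report["missing"].append(requirement)
        elif pinned.strip() and found != pinned.strip():
            report["conflicting"].append(requirement)
        else:
            report["satisfied"].append(requirement)
    return report


# Demo with a fake "installed packages" table so the result is deterministic.
fake_installed = {"requests": "2.25.0", "pytz": "2021.1"}
demo_report = classify(["requests==2.24.0", "pytz", "owslib"], fake_installed.get)
print(demo_report)
```

Missing entries would trigger the "additional downloads" notice, and conflicting entries the non-blocking warning.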

During plugin installation, the dependencies will be installed using pip into that virtual environment.

A new tab in the plugin manager would show currently installed dependencies and results of "pip check" (which supposedly shows conflicting requirements if any). (added 26/02/2021)

Example(s)

#metadata.txt
name=HelloWorld
[email protected]
author=Just Me
...

pip_dependencies=requests==2.24.0, pytz
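Reading that entry could look roughly like the sketch below. It assumes the file carries a `[general]` section header (omitted from the shortened example above) and uses the hypothetical helper name `read_pip_dependencies`. Note that a plain comma split cannot represent specifiers that themselves contain commas, such as `foo>=1,<2`.

```python
import configparser


def read_pip_dependencies(metadata_text):
    """Return the pip_dependencies entry of a metadata.txt as a list of strings."""
    # interpolation=None so that a literal "%" in a value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(metadata_text)
    raw = parser.get("general", "pip_dependencies", fallback="")
    return [item.strip() for item in raw.split(",") if item.strip()]


EXAMPLE_METADATA = """\
[general]
name=HelloWorld
author=Just Me
pip_dependencies=requests==2.24.0, pytz
"""

demo_dependencies = read_pip_dependencies(EXAMPLE_METADATA)
print(demo_dependencies)
```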

Affected Files

Plugin manager
Plugin repository

Performance Implications

May make plugin installation a bit slower

Further Considerations/Improvements

pip availability

This requires pip to be available. I think this is now the case on all platforms (to be confirmed) :

  • Windows standalone installer installs it (tested with 3.18)
  • OSGeo4W advanced installer has the python3-pip package, but doesn't install it by default -> we would have to set it as a dependency of QGIS (added 26/02/2021)
  • QGIS MacOS Package Improvements #177 added it for mac os X
  • As per python docs, Starting with Python 3.4, it is included by default with the Python binary, meaning it should be available on linux distros as well

virtual environment

Having a per-profile virtual environment has several advantages :

  • each user profile has its own plugins, so two user profiles could end up with incompatible dependencies
  • probably solves permission issues, as we know we can write to the user profile (without this, we can't install plugins anyways)
  • users already know that location

[edit 08/04/2021] It's not clear yet whether we will use an actual virtual environment (venv or virtualenv - and with what exact arguments) or if we merely alter the PATH/PYTHONPATH. The latter approach may work better as users may use different versions of Python (if they work with multiple versions of QGIS).
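The PATH/PYTHONPATH alternative could look roughly like the sketch below, which adds a per-profile folder to the import path at startup. The folder layout (one folder per Python minor version, to cope with several QGIS versions sharing a profile) and the function name `activate_profile_packages` are assumptions, not part of the QEP.

```python
import site
import sys
import tempfile
from pathlib import Path


def activate_profile_packages(profile_dir):
    """Make a per-profile, per-Python-version package folder importable."""
    # Hypothetical layout: <profile>/python/site-packages-<major>.<minor>
    version_tag = f"{sys.version_info.major}.{sys.version_info.minor}"
    target = Path(profile_dir) / "python" / f"site-packages-{version_tag}"
    target.mkdir(parents=True, exist_ok=True)
    # addsitedir appends the folder to sys.path and processes .pth files in it.
    site.addsitedir(str(target))
    return target


# Demo with a throwaway folder standing in for the user profile.
demo_target = activate_profile_packages(tempfile.mkdtemp())
print(demo_target)
```

Because the folder is appended to `sys.path`, libraries shipped with QGIS would keep precedence over anything installed there. Whether that ordering is the desired one is an open design question.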

current state of the code

The plugin manager code is not in the best shape (halfway split between C++ and Python), which may complicate implementation.

requirements.txt (edit 25/11/2020 13:00)

Alternatively to metadata.txt, we could list the dependencies in the regular requirements.txt.
Pros :

  • would work with existing dev tools/workflows, such as pip install -r requirements.txt or GitHub's Dependabot

Cons :

  • we don't have access to requirements before downloading the plugin
  • we already have plugin_dependencies in metadata.txt, makes sense to have all dependencies listed at the same place

proxy configuration (added 26/02/2021)

We will not be able to use QGIS's network manager, as the retrieval of packages would be done by pip in a subprocess. This will be an issue in corporate environments that have specific proxy requirements. I'm not too familiar with these issues, but we may be able to work around this with pip's --proxy argument (https://pip.pypa.io/en/stable/user_guide/#using-a-proxy-server).
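A minimal sketch of how the `--proxy` argument could be passed when pip runs as a subprocess. The helper name `build_pip_install_command` and the proxy URL are hypothetical, and how the proxy setting would be read from QGIS is left open. The interpreter path is a parameter because `sys.executable` may not point to a Python interpreter when Python is embedded in an application, which would need checking on each platform.

```python
import sys


def build_pip_install_command(packages, proxy=None, python=sys.executable):
    """Build the argument list for running `pip install` in a subprocess."""
    command = [python, "-m", "pip", "install"]
    if proxy:
        # Same form as on the command line: --proxy [user:passwd@]host:port
        command += ["--proxy", proxy]
    return command + list(packages)


demo_command = build_pip_install_command(
    ["requests==2.24.0", "pytz"],
    proxy="http://proxy.example:3128",  # placeholder value
    python="python3",
)
print(demo_command)
# The list can then be handed to subprocess.run(demo_command, check=True).
```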

Libraries conflicts (added 18/03/2021)

Some python packages may install incompatible libraries (gdal, proj...) which would result in crashes.
This is not easily solvable as part of this QEP : simply blacklisting some packages would not work for indirect dependencies, and relying on a binary-aware package manager (Conda) requires a much bigger change (such as replacing OSGeo4W by Conda altogether ?).

This QEP doesn't deal with these cases, which remain in the status quo (such packages still need to be added through OSGeo4W or another platform-specific step).

It will be clearly documented for developers to only require packages that work with the versions of libraries provided by QGIS. There's a high chance that the developer will get crashes early, so the probability of having this type of problematic dependency in the plugin repository should be quite low.

Conda (added 18/03/2021)

Using a Conda environment instead of a simple virtualenv would allow requiring some dependencies that are not simply pip-installable (because they require binaries or advanced compilation steps). It would however not solve the issue of depending on incompatible versions of libraries (such as proj or gdal), and it would complicate the installation process because of the need to fall back to pip (inside the conda environment) to install non-conda packages.

Thus, the proposal currently does not integrate Conda. Leveraging Conda if available is however a plausible further improvement, as Conda also understands requirements.txt.

virtualenv backend (venv vs virtualenv vs pipenv) (added 18/03/2021)

There are several virtual environment managers available for python (venv, virtualenv and pipenv). pipenv does more than just managing the virtualenv, as it also manages dependencies in a more controlled way than just pip (by creating a lockfile, which allows reproducible deployments). In our context, where each plugin independently defines its own dependencies, a lockfile cannot be used, as we need the exact opposite (a set of dependencies as loose as possible).

Suggestion is to use venv as it's almost equivalent to virtualenv but part of the standard library (the additional features of virtualenv are not relevant to the needs of this QEP). It uses the simpler requirements.txt.
In any case, as activating a virtual env from within python is not supported, it's likely that this will have to be done from QGIS's bootstrapping code.
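Creating the environment itself is the easy part and can be sketched with the standard library alone. The `python/venv` sub-folder and the name `ensure_profile_venv` are hypothetical, and `system_site_packages=True` is an assumption about what would be wanted so that the libraries shipped with QGIS stay visible. The sketch does not address activation.

```python
import tempfile
import venv
from pathlib import Path


def ensure_profile_venv(profile_dir):
    """Create <profile>/python/venv if it does not exist yet and return its path."""
    env_dir = Path(profile_dir) / "python" / "venv"
    if not (env_dir / "pyvenv.cfg").exists():
        # with_pip=False keeps this sketch fast and offline; a real
        # implementation would need pip available inside the environment.
        builder = venv.EnvBuilder(system_site_packages=True, with_pip=False)
        builder.create(env_dir)
    return env_dir


# Demo with a throwaway folder standing in for the user profile.
demo_env = ensure_profile_venv(tempfile.mkdtemp())
print(demo_env)
```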

see also

Backwards Compatibility

The new pip_dependencies metadata will be ignored in previous versions of QGIS, where this will result in status quo.

Issue Tracking ID(s)

None yet

Votes

(required)

Part covered by grant 2021 (edit : 2021/04/08)

Due to relatively high uncertainty about how easy/feasible this idea is, this QEP is split in two parts :

  1. Feasibility study (and if feasible proof-of-concept - delivered as a branch/draft PR)
    • on QGIS startup, create/load virtualenv in the user profile
    • ensure pip install works from a subcommand from within QGIS and doesn't cause conflicts with commonly required pip libraries on all platforms
  2. Implementation
    • install requirements on plugin installation
    • GUI (confirm installation, review current dependencies)
    • adapt qgis-django
    • documentation
    • tests

The proposal for 2021 grants only concerns part 1. If the feature is doable, part 2 will be submitted during the next call for proposals (unless funded earlier).

@elpaso

elpaso commented Nov 25, 2020

Nice, I'm having a couple of "déjà-vu" moments here.

Some points to be cleared: how would you create a venv?

Also, the last time I checked for this approach (probably many years ago though), the blocker was that PIP did not provide a library (they said it was on purpose), so you would need to call pip as a process with all the problems and architecture issues that come with this approach. Is that still the case or we can use PIP as a library now and call it from our code directly?

@olivierdalang
Author

olivierdalang commented Nov 25, 2020

Some points to be cleared: how would you create a venv?

Not 100% clear yet. If we create it on startup (running something like venv {profile_name}/python/venv from a subprocess if it doesn't exist already), I guess there must be some way to activate it before python is initialized ? Do you see some potential blockers with such an approach ? (I'm not familiar with python bootstrapping in QGIS).

Also, the last time I checked for this approach (probably many years ago though), the blocker was that PIP did not provide a library (they said it was on purpose), so you would need to call pip as a process with all the problems and architecture issues that come with this approach. Is that still the case or we can use PIP as a library now and call it from our code directly?

Yes AFAIK this is still the case... I think installing from a process should not be an issue in itself if we have an activated venv in the profile directory though, as there we should have no permission issues and no ambiguity with pip versions ? It's more about how to live load/reload libraries, which I'm not so sure is doable reliably. Worst case, we'd have to ask the user to relaunch QGIS after dependencies installation. Best case, something like https://stackoverflow.com/a/45405667/13690651 works.
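For the simple case of a package that was not imported before, a running interpreter can pick it up without a restart, as sketched below. This is a general standard-library technique and not necessarily what the linked answer does. The helper name `import_freshly_installed` is hypothetical. Upgrading a module that is already imported, especially one with compiled extensions, is the unreliable part and is not covered here.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_freshly_installed(module_name, install_dir):
    """Import a module that appeared on disk after the interpreter started."""
    install_dir = str(install_dir)
    if install_dir not in sys.path:
        sys.path.append(install_dir)
    # Import finders cache directory listings; this makes them look again.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Demo: a file written at runtime stands in for a package pip just installed.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "qep202_demo_dependency.py").write_text("VALUE = 42\n")
demo_module = import_freshly_installed("qep202_demo_dependency", demo_dir)
print(demo_module.VALUE)
```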

@kikislater

kikislater commented Nov 26, 2020

Is pip with a prefix not sufficient ?

pip install --install-option="--prefix=$PREFIX_PATH" package_name

You can set the following environment variable:
PIP_PREFIX=$PREFIX_PATH
or
PIP_TARGET=$PREFIX_PATH

It's recommended in NixOS :

https://nixos.wiki/wiki/Python#Emulating_virtualenv_with_nix-shell

PIP Path options ( source : https://stackoverflow.com/a/53870246 ) :

pip install --target /myfolder [packages]

Installs ALL packages including dependencies under /myfolder. Does not take into account that dependent packages are already installed elsewhere in Python. You will find packages from /myfolder/[package_name]. In case you have multiple Python versions, this doesn't take that into account (no Python version in package folder name).

pip install --prefix /myfolder [packages]

Checks whether dependencies are already installed. Will install packages into /myfolder/lib/python3.5/site-packages/[packages]

pip install --root /myfolder [packages]

Checks dependencies like --prefix but install location will be /myfolder/usr/local/lib/python3.5/site-packages/[package_name].

pip install --user [packages]

Will install packages into $HOME: /home/[USER]/.local/lib/python3.5/site-packages Python searches automatically from this .local path so you don't need to put it to your PYTHONPATH.
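The environment-variable variant mentioned above could be wired into a subprocess call roughly as follows. This is only a sketch: the helper name `pip_environment` is hypothetical, and `/myfolder` is the placeholder path used in the quoted answer.

```python
import os


def pip_environment(target_dir, base_env=None):
    """Return a copy of the environment with PIP_TARGET pointing at target_dir."""
    env = dict(os.environ if base_env is None else base_env)
    # pip reads PIP_<OPTION> variables, so this is equivalent to --target.
    env["PIP_TARGET"] = str(target_dir)
    return env


demo_env_vars = pip_environment("/myfolder", base_env={"PATH": "/usr/bin"})
print(demo_env_vars)
# Would be used as: subprocess.run([...pip command...], env=demo_env_vars)
```

Copying the environment instead of mutating `os.environ` keeps the setting from leaking into other subprocesses started by the application.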

@elpaso

elpaso commented Nov 26, 2020

@olivierdalang other things to consider:

  1. pip modules dependencies: we must be careful of pip modules that might install binary blobs of incompatible libraries (such as geos, gdal, proj)
  2. IMO we need to ask for the user consent before installing anything from the network, this is for security reasons

@pcav
Member

pcav commented Dec 31, 2020

Any development on this?

@olivierdalang
Author

Just edited the QEP:

  • added a mention of a new tab in the plugin manager showing the state of dependencies
  • edited the comment about pip availability (which may require setting python3-pip as a dependency of QGIS in OSGeo4W)
  • added a section about proxy for corporate environments

Otherwise, will post on the ML to ask for comments on this, as we'd like to propose this QEP for the next QGIS grants round.

@jfbourdon

To add to the potential issue in a corporate environment: here I didn't use the --proxy argument, but rather --trusted-host pypi.python.org --trusted-host files.pythonhosted.org --trusted-host pypi.org so that I could pass through our firewall (at least that's my understanding of what is going on). Without this, I just get SSL certificate related errors.

@lucernae

What happens if a python module needs to be built and it depends on native build dependencies (like compilers, or distro packages)?

Binaries referred by different plugins can also have different version and may not be installable. We probably need to limit the scope, so we can only install python modules that are "easy" to handle.

For me, it may be possible to let users manage native dependencies by themselves, using the Nix package manager for example, and keep native dependencies out of the scope of what QGIS handles. The user and the plugin developer could provide their own setup if QGIS provided a system hook for each plugin, called before the plugin is initialized.

Also I agree that it is important for QGIS to let us know what the plugin will do to our system. Is it possible to do it the way mobile apps do? The provider takes the list of permissions the plugin declares, then shows it to the user. The QGIS API or hook that needs these permissions can't execute if the user doesn't allow it?

Sorry, I'm just throwing ideas at the moment.

@nyalldawson
Contributor

Binaries referred by different plugins can also have different version and may not be installable. We probably need to limit the scope, so we can only install python modules that are "easy" to handle.

This is a very important point. For instance, if someone tries to install pyproj through pip then they'll get a new proj library build as a result, leading to qgis crashes due to the clashing linked proj versions between the version qgis was built with and the one from pyproj...

@nyalldawson
Contributor

Is this for a qgis grant request? If so, there's some things that concern me:

I think this is now the case on all platforms (to be confirmed)

I'm not sure yet how creating and activating the virtual environment will work, nor if this will come with some complex environment dependent issues

I'm not familiar with python bootstrapping in QGIS

This research should all be conducted before proposing this for a grant request, so that the qgis org can be confident that the results will be delivered. If not, you need to carefully word the grant request as a "research project", which may not result in any changes if the implementation proves unfeasible...

@anitagraser
Member

I'm wondering if PIP is the best way forward for all our users. Particularly on Windows, many spatial libraries cannot readily be installed with PIP, see e.g. https://geopandas.org/getting_started/install.html#installing-with-pip. Instead, conda seems to be the default recommendation.

@reinvantveer

@olivierdalang other things to consider:

1. pip modules dependencies: we must be careful of pip modules that might install binary blobs of incompatible libraries (such as geos, gdal, proj)

I've been kinda following along this discussion with a lot of interest, hope you don't mind me chipping in. It appears to me a dependency verification system is needed here. For 'stand-alone' Python development, we've been using Pipenv successfully for a few years now - it manages (sub)dependency installs and possible collisions, virtual environments and security vulnerability checks. It's the Python Packaging Authority's recommended way of doing dependency resolution, and it comes with a runnable binary pipenv that could (in theory) be called as a subprocess to create a specified virtualenv before bootstrapping a QGIS-embedded Python interpreter. It's highly configurable. If anyone needs convincing on the activity on this package, just head over to their repo; there have been nearly 300 releases.

For managing (sub)dependencies, it uses a lockfile that guarantees deterministic builds, which, I hazard, is what you are referring to here. No other version of QGIS-required pip packages are allowed as subdependencies of installable plugins. I think Pipenv could be the ticket here. I'm not certain, but I don't think the conda installer is up to the task of this kind of dependency resolution at the moment.

Hope this helps.

@m-kuhn
Member

m-kuhn commented Mar 18, 2021

I'm wondering if PIP is the best way forward for all our users. Particularly on Windows, many spatial libraries cannot readily be installed with PIP, see e.g. https://geopandas.org/getting_started/install.html#installing-with-pip. Instead, conda seems to be the default recommendation.

Unfortunately this is not as easy as switching the python package manager.
We currently install libraries like proj, gdal and geos from osgeo4w, so if we install pyproj (directly or as a dependency of geopandas) from conda we will end up with incompatible versions and qgis will crash. The problem is exactly the same as mentioned by @elpaso for pip:

  1. pip modules dependencies: we must be careful of pip modules that might install binary blobs of incompatible libraries (such as geos, gdal, proj)

In short, if we want conda to be able to install those libs we need to install the complete qgis from conda. And not osgeo4w or directly from a linux distribution repository.

However, since a requirements.txt file is also understood by conda, in the case where qgis is actually installed through conda, this should be easily extensible to make use of the dependency information, so we also win for the conda case.

Pipenv ...

Pipenv is a nice package management system. Especially for standalone development. I am not sure its lockfiles will fix the issue here with the osgeo4w installed packages though. And just like conda I think it could be added as a new backend any time.

@olivierdalang
Author

Thanks for the inputs !

I've updated the QEP accordingly.

@nyalldawson @lucernae I've added a paragraph about incompatible libraries. In short, I don't think it's possible to manage these cases cleanly with the current infrastructure (system/osgeo4w libs), so the proposition is to document this limitation and to

@anitagraser Yes supporting Conda would be nice. It however does not solve the issue above (incompatible versions) and would complicate the process. Since Conda also understands requirements.txt, it could be done in a next step.

@reinvantveer As said by @m-kuhn , the lockfile isn't too useful in this case, as we actually want the opposite (dependencies as loose as possible to accommodate the requirements of multiple independent plugins). I think it's more meant for cases where you need reproducible deployments for one well defined application. But pipenv is also able to parse requirements.txt, so indeed it would be possible in the future to switch to pipenv instead of venv if it proves useful.

@Guts

Guts commented Mar 19, 2021

Hello there,

Glad to see this topic being discussed as it's a recurring question in forums, trainings, etc. The current solutions are often bad practices (embedding all the source code of the dependencies in the plugin) or too complex to be realistic (paver...).

Maybe I'm wrong but I have the impression that as most QGIS developers are working in C++, so package managers such as pip are seen with a bad eye or misunderstood because they are not used. A bit like the recurring problems with Windows or Oracle, part of which comes from the fact that no QGIS developer works with them. Maybe a first requirement would be to clearly identify a group of developers to maintain this part?

Speaking about the QEP, I'm not sure that having a virtual environment per profile is enough to guarantee plugin isolation. If you want to use virtual environments, why not create one per plugin? If not, there will be blood in dependency conflicts (plugin A needs this package >= 2.15.1 but plugin B is incompatible with it).

Maybe a good option would be to maintain a list of accepted packages? Or blocking the installation of binaries (or from git)?

Problems and topics to anticipate

  • corrupted virtual env
  • new Python version in QGIS (or system), etc.
  • options to manage the venv: clear, etc.
  • link to Python system packages?
  • cache management?

Additional tools

Speaking about "overlay" tools like conda, pipfile or poetry, I really think they should be avoided because they would add a dependency on projects that the QGIS community has no control over. And I don't think the benefits they would bring outweigh the drawbacks.

Related suggested minor improvements

Before making any big change, I think there are already several possible "light" improvements:

  • documenting third-party Python packages embedded in QGIS (requests, etc.)
  • ensure that we keep the same version of Python between Linux, Mac and Windows platforms since this often influences the version of the dependencies

Related question

Speaking about the dependency issues, I have to say that I never understood why QGIS plugins were not Python packages, to avoid reinventing the wheel (install, update, uninstall, dependencies, packaging, etc.), with a framework keyword = QGIS (as it works for Sphinx, PyTest, etc.). This way, it would be much easier to use the official Python Package Index or deploy a QGIS Python Index (i.e. QGIS Plugins) using a supported tool chain (warehouse or another one...) instead of creating/maintaining a specific one.
Thus, plugin developers would also gain consistency with the Python ecosystem and the good practices that go with it (CI tools to package and deploy, docstrings, packages and modules principles, semantic versioning, etc.).

@m-kuhn
Member

m-kuhn commented Mar 19, 2021

@Guts thanks for the comments.

Plugins in QGIS are unfortunately not "isolatable" because they all run in one single process/python interpreter and therefore will share one set of dependencies. This is conceptually not perfect (aka. dependency hell), but not avoidable (at least not without a major effort on infrastructure, api and plugin side).
Packaging plugins into python packages sounds like an idea to discuss. This will affect the deployment of each plugin and as such is also a major infrastructure task with many stakeholders.

To keep the discussion here focused, I would like to keep these discussions separate. If you want to go further with these ideas, please open a separate new QEP.

@Guts

Guts commented Mar 19, 2021

Packaging plugins into python packages sounds like an idea to discuss. This will affect the deployment of each plugin and as such is also a major infrastructure task with many stakeholders.

To keep the discussion here focused, I would like to keep these discussions separate. If you want to go further with these ideas, please open a separate new QEP.

Okay. I'll try to find time next week to create it with full details and a proposal.

@3nids 3nids added the Grant-2021 QEP for 2021 Grant Program label Mar 22, 2021
@reinvantveer

reinvantveer commented Mar 22, 2021

Regarding the format of dependencies: I would strongly recommend (as a Python developer) choosing a commonly used spec method (i.e. requirements.txt). It will be of great benefit for plugin development. Something like plugin_dependencies or pip_dependencies in metadata.txt is understood only by QGIS and will hamper IDE integration - you'd have to write your own IDE plugin.

As much as I dislike requirements.txt (no separation of dev-dependencies from production dependencies, no deterministic sub-dependency locking mechanism, no Python version specification, no alternate pypi package source location) it's probably the best understood and interoperable solution.

@Guts

Guts commented Mar 22, 2021

As much as I dislike requirements.txt (no separation of dev-dependencies from production dependencies, no deterministic sub-dependency locking mechanism, no Python version specification, no alternate pypi package source location) it's probably the best understood and interoperable solution.

We agree on your conclusion, but just to say that it's possible (doc) to have multiple requirements files (and relations between them) and handle different pypi package per each requirements file or subsection.

It's also possible to avoid installing binaries.

@s-m-e

s-m-e commented Mar 30, 2021

From a pure business perspective, I can only support this QEP. It is a great idea because it will keep generating support work for all of us, both QGIS developers and plugin authors, for ages. Besides, as this QEP is implying, it does not change the status quo - in almost every conceivable way. Only a small group of people and entities with sufficient background knowledge, the same as before, will continue to be able to distribute and support advanced software solutions nicely integrating with QGIS. A lot less potential for conflict, I'd say, keeping current business structures intact. We all need to make a living, after all. Keep going.

Broader QEP of which this could arguably be a first step : #179

Step by step ... this QEP (#202) is a step in a fundamentally different direction than what QEP #179 is about. At its core, this QEP (#202) is proposing to extend a technically outdated software package format, QGIS' metadata.txt, by effectively introducing a hard dependency between the said package format (and therefore QGIS itself) and a completely different ecosystem, pip & PyPI. Although pip becomes a hard dependency of the package format itself, no guarantee about it actually working as expected is made to its users as well as to plugin authors. Although claimed, no part of this QEP (#202) is really extensible or compatible, e.g. with respect to alternative systems such as conda. In contrast, QEP #179 proposes a clean and extensible separation of QGIS and pip without actually changing the metadata.txt package format itself. Besides, if any of the suggested ideas around virtual environments are actually implemented inside QGIS, it will further strengthen the tradition of having to patch and re-compile QGIS for many use-cases, which is anything but trivial. QEP #179 does intentionally not touch virtual environments at all, leaving their management in the spirit of maximum compatibility to well-established tools at the users' discretion.

@nyalldawson
Contributor

Having given this more thought, and seen further reports of crashes due to users installing pyproj via pip, I'm a strong -1 to the proposal as it stands.

We'd need safeguards in place to block any installation of binary dependencies like proj and geos. Otherwise it would be very easy for a plugin to list "pyproj" as a dependency because it works on their system, and not realise that installing their plugin on different environments would trigger a clashing installation of the library, resulting (at best) in QGIS crashes and (at worst) in an extremely difficult to fix local environment.

And I just can't see how we could manage a blocklist like this. We could hardcode blocks for some known problematic libraries like pyproj and shapely, but we'll never be able to catch them all and accordingly this will always be a risk for users.

So unless we can handle this issue in a manageable way, I'm -1 to the proposal.

@m-kuhn
Member

m-kuhn commented Mar 31, 2021

Full disclosure, I have been actively taking part in writing this proposal and will be closely accompanying the development.

While writing the proposal we were of the opinion that the responsibility for only specifying compatible plugins lies with the plugin developer and the potential impact is small by installing it into the user profile and not into the system. This is aligned with the general way of handling Python code in QGIS (author uploads a plugin, manual checking and publishing is done, if plugin still causes a crash we identify this by running it in a new profile and disable the affected plugin version and push the responsibility to fix this to the plugin author).

We will discuss what we can do to deal with the apparent increased safety requirements for dependency management.

@PeterPetrik

I strongly agree with @nyalldawson about packages with binaries and the all possible and impossible crashes/bugs. Voting -1 for the proposal as it stands.

@m-kuhn
Member

m-kuhn commented Mar 31, 2021

@nyalldawson, @PeterPetrik, @elpaso apart from the binary package dependency risks, do you agree on the underlying idea?

  • having a virtual environment in the user profile
  • specifying plugin dependencies through metadata.txt or requirements.txt
  • using pip as a default implementation for package management as it's compatible with the current package distribution systems on all major platforms (provided we find a way to deal with binary package collisions)

@PeterPetrik

apart from the binary packages, I do not see major/critical risks for macOS packaging. Specifying deps through metadata or requirements.txt looks good. For the other 2 items, I would need to actually spend more time on evaluating different options to have qualified vote on the topic, so I would rather not comment on that ATM.

@elpaso

elpaso commented Mar 31, 2021

@nyalldawson, @PeterPetrik, @elpaso apart from the binary package dependency risks, do you agree on the underlying idea?

* having a virtual environment in the user profile

+1 (you probably do not need a full venv and a PYTHONPATH would be sufficient, but this is an implementation detail)

* specifying plugin dependencies through metadata.txt or requirements.txt

+1 , Just a question: how would you handle conflicts? I mean plugin A requests a version 1.0 of a dependency while plugin B requests version 2.0, what would you do?

Also: the GUI must ask the user for authorization to download and install the dependencies.

* using pip as a default implementation for package management as it's compatible with the current package distribution systems on all major platforms (provided we find a way to deal with binary package collisions)

pip has no stable/supported API and you cannot use it as a library, if called as a process I see no issue here (besides the already discussed binary problem).

@olivierdalang
Author

@nyalldawson @PeterPetrik
About binary conflicts, IMO the safest is to clearly document one should not pip-depend on a library that may install incompatible binaries, and to use the current package manager instead (OSGeo4W on Windows and system packages on other systems), both towards plugin developers and reviewers. This is a status quo for these libraries.

With the --no-binary argument (which also applies to sub-dependencies, e.g. pip install --no-binary pyproj OWSLib), we can still provide a list of packages known to have this issue, to reduce the risk in case such a dependency slips through plugin review. This indeed requires a little bit of maintenance, but I think there is a relatively limited number of python libraries that try to install conflicting versions.
This seems to work quite elegantly, at least for pyproj. For instance, currently, on Windows, in OSGeo4W shell, pip install --no-binary pyproj pyproj tries to install pyproj-3.0.1, warns that ERROR: Minimum supported PROJ version is 7.2.0, installed version is 6.3.2., but proceeds to try older pyproj versions until it finds one that works with PROJ 6.3.2, and ends up installing pyproj-2.6.1.post1.
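Turning such a list into pip arguments is straightforward, as sketched below. The helper name `no_binary_arguments` is hypothetical, and the two list entries are only the packages named in this thread, not a vetted blocklist.

```python
def no_binary_arguments(blocklist):
    """Return the pip arguments that forbid wheels for the listed packages."""
    names = sorted({name.strip().lower() for name in blocklist if name.strip()})
    # --no-binary accepts several package names separated by commas.
    return ["--no-binary", ",".join(names)] if names else []


# Illustrative entries only, taken from the discussion above.
KNOWN_PROBLEMATIC = ["pyproj", "shapely"]

demo_args = ["pip", "install", *no_binary_arguments(KNOWN_PROBLEMATIC), "OWSLib"]
print(demo_args)
```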

Worst case scenario, if these two counter-measures fail, the incompatible packages would be installed in a virtualenv in the user profile (starting with an empty profile is well known as one of the first things to try in case of crashes). This is easier both to fix and to detect (thanks to requirements.txt) than the same scenario without this QEP (where plugin code tries to automatically install such a library system-wide, or a plugin author tells the user to pip-install it manually).

@elpaso
About pip dependencies conflicts, they would be displayed before installation, but non-blocking (meaning you can still install the plugin, and still have the choice to install/upgrade dependencies or not which you would have anyways). It would clearly state that this has a high chance of resulting in instabilities and that using a different user profile is recommended.
The recommendations about metadata.txt would include the principle of defining dependencies as loosely as possible to reduce fake conflicts.

@wonder-sk
Member

The idea of using virtual environments is unclear to me - could you please explain why they would be needed and how they would be managed? At this point I can only see downsides if they would be introduced, but I may be missing some bits of the puzzle.

My understanding is that you would use venv to create isolated environment where you can then run/install packages separately and run python scripts within that environment.

With venv, you get a directory that contains symlinks to python executable. If venv would be located in user's profile, suddenly user profiles are tied to a particular installation of python - I can't even foresee how things would work if e.g. user starts a profile using QGIS 3.10 installed in "c:\program files\qgis 3.10" and later installs QGIS 3.16 to a different directory (with possibly different python version and a different set of python libraries) - the virtual env would still link to the "old" installation - or how would that be managed?

Also, in case of QGIS, python is embedded and using python libraries from the installation, rather than having a choice of the version like with venv's activation scripts - I am trying to wrap my head around how things would affect the qgispython integration library, but I don't really know what to expect.

Few more questions:

  • How would things work when plugin A needs module X version Y and plugin B needs module X version Z? Would there be GUI for user to pick one or the other?
  • How would things work when plugins are uninstalled, would their dependencies get uninstalled as well? If yes, how would that be managed? If not, wouldn't old unused libraries block installation of plugins?

If we do significant changes to QGIS plugin system, it would be good to slowly take it closer to the standard way of installation of python packages. For example, we should look at using requirements.txt and think of how we can deprecate metadata.txt (which is only used in QGIS world - but at the time of introduction of it (2008?) the world was a different place 😄 and pip did not exist yet, or I was not aware of it). I hope that one day we could get to the point where QGIS plugins would be ordinary pypi packages and the whole QGIS plugin repository would be "just" a kind of whitelist of python packages that are considered as plugins. In this particular case of whether to have deps listed in metadata.txt or requirements.txt, my suggestion would be to use requirements.txt primarily, and for compatibility we could list them also in metadata.txt (where QGIS plugin repo would have an extra check during upload verifying that these two match).

All this being a relatively complex thing, it would be very useful to have a proof of concept before proceeding with particular implementation choices. I am in favor of having an automated way to install plugin dependencies, but we should be really careful not to make our life even more complicated...

@Guts

Guts commented Mar 31, 2021

With venv, you get a directory that contains symlinks to python executable. If venv would be located in user's profile, suddenly user profiles are tied to a particular installation of python - I can't even foresee how things would work if e.g. user starts a profile using QGIS 3.10 installed in "c:\program files\qgis 3.10" and later installs QGIS 3.16 to a different directory (with possibly different python version and a different set of python libraries) - the virtual env would still link to the "old" installation - or how would that be managed?

I assume that the virtual environment would be created using the --system-site-packages option (from the cli help: give the virtual environment access to the system site-packages dir ), pointing to the Python embedded libraries.

And, when a new Python version is deployed at the system level (i.e. QGIS), the virtual environment could be upgraded using the --upgrade option (from the cli help: Upgrade the environment directory to use this version of Python, assuming Python has been upgraded in-place.).
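
For illustration, the two options above map onto the standard library venv module roughly like this. It is a minimal sketch, not tested inside the QGIS interpreter; make_profile_env_builder and create_or_upgrade_env are hypothetical names, and with_pip=False is an assumption (pip being provided by the QGIS Python itself).

```python
import venv


def make_profile_env_builder(upgrade=False):
    """Builder for a venv that can still see the packages shipped with QGIS."""
    return venv.EnvBuilder(
        system_site_packages=True,  # same effect as `python -m venv --system-site-packages`
        upgrade=upgrade,            # same effect as `python -m venv --upgrade`
        with_pip=False,             # assumption: pip comes from the QGIS Python itself
    )


def create_or_upgrade_env(env_dir, python_was_upgraded=False):
    """Create the environment in env_dir, or upgrade it in place after a Python update."""
    make_profile_env_builder(upgrade=python_was_upgraded).create(env_dir)
```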

I hope that one day we could get to the point where QGIS plugins would be ordinary pypi packages and the whole QGIS plugin repository would be "just" a kind of whitelist of python packages that are considered as plugins. In this particular case of whether to have deps listed in metadata.txt or requirements.txt, my suggestion would be to use requirements.txt primarily,

If the goal is to make QGIS plugins a list of Python packages, my suggestion would be to use a specific section inside the setup.cfg or pyproject.toml where the actual metadata would be stored: https://packaging.python.org/tutorials/packaging-projects/#creating-pyproject-toml


FYI: the Python package manager in ArcGIS Pro: https://pro.arcgis.com/fr/pro-app/latest/arcpy/get-started/what-is-conda.htm

@wonder-sk
Member

And, when a new Python version is deployed at the system level (i.e. QGIS), the virtual environment could be upgraded using the --upgrade option (from the cli help: Upgrade the environment directory to use this version of Python, assuming Python has been upgraded in-place.).

But how about users running multiple versions of QGIS at the same time (and therefore possibly different versions of Python) ? This is not some theoretical scenario - often people prefer to stick to older versions for production work, yet they install also new version(s) to try out new features...

@elpaso

elpaso commented Mar 31, 2021

And, when a new Python version is deployed at the system level (i.e. QGIS), the virtual environment could be upgraded using the --upgrade option (from the cli help: Upgrade the environment directory to use this version of Python, assuming Python has been upgraded in-place.).

But how about users running multiple versions of QGIS at the same time (and therefore possibly different versions of Python) ? This is not some theoretical scenario - often people prefer to stick to older versions for production work, yet they install also new version(s) to try out new features...

Yeah, that's why I was thinking that venv would be overkill, I believe that messing with PYTHONPATH would probably be enough for the purpose of loading dependencies from a known insulated location.

@nyalldawson
Contributor

@m-kuhn

do you agree on the underlying idea?

having a virtual environment in the user profile

Yes-ish, but @wonder-sk raises very valid concerns about this approach which need to be considered.

specifying plugin dependencies through metadata.txt or requirements.txt

Yes

using pip as a default implementation for package management as it's compatible with the current package distribution systems on all major platforms (provided we find a way to deal with binary package collisions)

Yes

@nyalldawson
Contributor

@olivierdalang

It may be a language thing, but honestly the amount of "i think" and "it should" used throughout this discussion concerns me. Given that this is the most expensive grant proposal submitted this round by far, I don't like to see this ambiguity used in the proposal. Rather I'd expect to see evidence that the approach is already deemed feasible, and a concrete list of deliverables included.

OR (my suggestion):

Rework this proposal as a research project, designed to experiment and determine whether your approach is feasible, and identify the real-world shortcomings which need to be addressed before a full-scale implementation can go ahead. Lower the cost of the proposal to cover just this initial research component (eg 1-2 days), and get a rock-solid implementation plan written up as a concrete deliverable for the work.

Then in the next round of grant proposals you'll be very well placed to put in a follow-up request for the funds to do the actual implementation 👍

@olivierdalang
Author

@nyalldawson It's not only language ;-) Ok with splitting this QEP, it's indeed more clear that way.

I've added a section Part covered by grant 2021 in the QEP, is it OK like that ?

@elpaso

elpaso commented Apr 8, 2021

@olivierdalang I cannot find an answer to some of the points:

  • is a virtualenv overkill (we cannot have different versions of Python: the version of Python is determined by the version QGIS was linked with)? Can we just use PYTHONPATH to insulate libraries?
  • what happens if different plugins have dependencies on different versions of the same library?

@olivierdalang
Author

@wonder-sk @elpaso @Guts About virtualenv
Thanks for the comments. Indeed an actual virtualenv may not be the way to go to allow multiple versions of QGIS/python. Just altering the PATH on startup may be better suited. What I liked about an actual virtualenv is the ability to use it outside of the context of QGIS, but there may not be many actual use cases for that.
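
A minimal sketch of that path-based alternative, assuming one dependency folder per Python version inside the profile so that several QGIS installs do not share compiled packages. The folder layout and the name activate_profile_site are hypothetical, not something QGIS provides.

```python
import sys
from pathlib import Path


def activate_profile_site(profile_dir):
    """Put a per-Python-version dependency folder of the profile at the front of sys.path."""
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    # Hypothetical layout: <profile>/python/dependencies/python3.X
    deps_dir = Path(profile_dir) / "python" / "dependencies" / version
    deps_dir.mkdir(parents=True, exist_ok=True)
    if str(deps_dir) not in sys.path:
        # Front of the list: packages in the profile shadow the system ones
        sys.path.insert(0, str(deps_dir))
    return deps_dir
```

Putting the folder first is a design choice: it lets a profile override a system package, at the cost of possibly shadowing the version QGIS itself expects. Appending it instead would give the opposite trade-off.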

@wonder-sk @elpaso About conflicting requirements
Check would be done on installation only, and still allow to install/upgrade conflicting dependencies (warning the user and recommending to use separate profiles). This means the plugin that was installed latest "wins" (which by the way allows to fix deps for one particular plugin by reinstalling it).

In the GUI (plugin manager), plugins with missing or incompatible libraries would be shown with a warning.

This means we don't actually deal with conflicts, and the user can end up with incompatible libraries, but would be warned when doing so.

The only other approach I can think of is preventing the installation of incompatible plugins altogether, but I think this would be an annoyance in most cases, as sometimes the incompatibility may only be in the declared dependencies, and not an actual incompatibility. Obviously, with incompatible plugins, it's quite likely that users will get Python exceptions, but then they have been warned.

This also removes the need to remove unused libraries on plugin uninstalls. I still think we could add "uninstall" and "uninstall all" buttons in the GUI.
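
To make the warn-but-do-not-block idea concrete, here is a rough sketch that reports libraries for which plugins declare different specifiers. It is deliberately naive: it compares specifier text only, so it can flag "fake" conflicts (e.g. >=1 versus >=2) and says nothing about what is actually installed. find_declared_conflicts and the plugin names are hypothetical.

```python
import re
from collections import defaultdict

# Deliberately naive: splits "name<specifier>" and ignores extras and markers.
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*(.*)$")


def find_declared_conflicts(plugin_requirements):
    """Map each library to the differing specifiers that plugins declare for it."""
    declared = defaultdict(dict)
    for plugin, requirements in plugin_requirements.items():
        for requirement in requirements:
            match = _REQUIREMENT.match(requirement)
            if not match:
                continue  # skip empty or unparseable entries
            name, specifier = match.groups()
            declared[name.lower()][plugin] = specifier.replace(" ", "")
    return {
        library: by_plugin
        for library, by_plugin in declared.items()
        if len(set(by_plugin.values())) > 1
    }


conflicts = find_declared_conflicts({
    "plugin_a": ["requests>=2.0", "numpy==1.19"],
    "plugin_b": ["requests>=2.0", "numpy==1.21"],
})
for library, by_plugin in conflicts.items():
    # Warn only: installation is still allowed, and the latest install "wins".
    print(f"Warning: plugins disagree on {library}: {by_plugin}")
```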

@vmora

vmora commented Apr 8, 2021

Although I'd love to have an easy way to get python dependencies for a plugin, I don't believe virtualenv is the way to go.

I believe that most of the useful modules in Python are wrappers for useful (binary) libraries. Packaging a system is hard enough work as it stands without the combinatorial complexities of (partial) virtual environments.

what happens if different plugins have dependencies on different versions of the same library?

Hell breaks loose, as far as my experience goes.

I've struggled with that problem for years, both on Windows and Linux, and the best answer I found is: trust system packagers (OSGeo4W is the closest thing to a "system" on Windows), help them if you can, but don't try to go around them with unrelated package managers (like pip).

The idea of installing dependencies in the profile directory seems sound since it allows relation to a particular system install of QGIS, but if an "Install dependencies" button exists, I'd like it to:

  • be able to "run as admin" and install system-wide packages (i.e. packages available in the distribution) and install the plugin system-wide
  • for a user install in the profile directory, allow only system-available dependencies, or recompile the package for the system if possible.

I'm aware this is a conservative point of view.

@elpaso

elpaso commented Apr 8, 2021

@vmora @olivierdalang actually, the way we solved the problem of conflicting versions of libraries with our plugins when I was working for Boundless was to alter the path through sites and ship all the dependencies within the plugin itself.

I can't remember the details; @volaya probably knows more. Here is an example of a plugin that was doing that: https://github.com/planetfederal/qgis-geoserver-plugin/blob/master/geoserverexplorer/extlibs/site.py

So, each plugin imported its dependencies from a module plugin_namespace.extlibs actually insulating them from other plugins. IIRC it was working well but I can't remember if it led to issues with binaries.
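
A simplified sketch of the bundled-dependencies idea, using the standard library site module. It is not the code from the linked plugin; add_extlibs is a hypothetical helper. Note that it puts the folder on the shared sys.path, so on its own it does not isolate two plugins that bundle different versions of the same library.

```python
import site
from pathlib import Path


def add_extlibs(plugin_dir):
    """Make the libraries bundled in <plugin_dir>/extlibs importable."""
    extlibs = Path(plugin_dir) / "extlibs"
    if extlibs.is_dir():
        # addsitedir also processes any .pth files found in the folder
        site.addsitedir(str(extlibs))
    return extlibs


# Possible call from a plugin's __init__.py (hypothetical):
# add_extlibs(Path(__file__).parent)
```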

@koyo-jakanees

Not sure if this is relevant to this discussion. Besides the approach mentioned by @elpaso above, another one is that of the EnMap plugin developers, who implemented a package installer for dependencies through a GUI, where the user can choose to install the dependencies: https://enmap-box.readthedocs.io/en/latest/usr_section/usr_installation.html#package-installer. Underneath, it still uses the system's Python pip to install them. One concern with pip is that it tries to install the latest version of a module (if not explicitly indicated), which often conflicts with system packages, for instance on Ubuntu Focal (apt pyqt5 == 5.14.1, pip pyqt5==5.15.4), and the same cuts across many others.

@m-kuhn
Member

m-kuhn commented Jan 29, 2024

There is a QGIS plugin at https://github.com/opengisch/qpip that facilitates the management of dependencies and implements a similar approach to what is proposed in here. Please note that at the current time, the plugin is experimental and therefore "experimental plugins" need to be allowed in QGIS. Depending on feedback from testers, this can be changed.

@Jannik-Schilling

2. IMO we need to ask for the user consent before installing **anything** from the network, this is for security reasons

Therefore I'd propose a requirement here (https://plugins.qgis.org/publish/) that Python packages must not be installed automatically. Maybe this should also be part of the review process when plugins are uploaded.
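
As an illustration of what "not automatically" could look like in plugin code, here is a small sketch where nothing is downloaded unless a confirmation callback returns True. install_with_consent is a hypothetical helper; in QGIS the callback could wrap a QMessageBox.question dialog, which is an assumption and is not shown here.

```python
import subprocess


def install_with_consent(packages, confirm, python="python", run=subprocess.run):
    """Install packages with pip only if confirm(message) returns True."""
    message = "This plugin wants to download and install: " + ", ".join(packages)
    if not confirm(message):
        return False  # nothing is downloaded without explicit approval
    run([python, "-m", "pip", "install", "--user", *packages], check=True)
    return True
```

The run parameter exists only so that the behavior can be checked without touching the network.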

@T4mmi

T4mmi commented Mar 14, 2024

Hi everyone, just to share some of our experiments:

  • venv seems broken using QGIS interpreter, couldn't manage to correctly use it in a QGIS plugin.
  • pip_dependencies makes a lot of sense since QGIS plugins do not rely on the standard Python layout (such as MANIFEST.in, requirements.txt, pyproject.toml ...) but on a single metadata.txt, and this would mimic plugin_dependencies
  • but relying on requirements.txt makes a ton of things available and easy, namely:
    • pip syntax (using conditional requirements ... useful to handle different QGIS Python versions)
    • --extra-index-url to fetch packages from outside PyPI (required for some internal projects)
    • --find-links to embed some dependencies
    • different environments for development and release (multiple requirements.txt)
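
For readers less familiar with these features, here is what such a file could contain, wrapped in a tiny Python helper that separates pip options from requirement lines. Everything in it is made up for illustration: the package names, the index URL and the wheels folder are hypothetical, and split_requirements is not part of pip.

```python
# Contents of a hypothetical requirements.txt; package names, URL and folder are made up.
REQUIREMENTS_TXT = """\
--extra-index-url https://pypi.example.org/simple
--find-links ./wheels
some-lib<2 ; python_version < "3.9"
some-lib>=2 ; python_version >= "3.9"
our-internal-lib>=1.0
"""


def split_requirements(text):
    """Separate pip options (lines starting with '-') from requirement lines."""
    options, requirements = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        (options if line.startswith("-") else requirements).append(line)
    return options, requirements


options, requirements = split_requirements(REQUIREMENTS_TXT)
print(options)
print(requirements)
```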

A little summary of what we tried:

  1. We used a macro in our plugins that reads a pip_dependencies attribute in the metadata.txt and installs everything using:
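    __init__.py
def pip_dependencies():
    """Install the packages listed in the pip_dependencies key of metadata.txt."""
    import subprocess
    from configparser import ConfigParser
    from pathlib import Path
    from qgis.PyQt.QtCore import QStandardPaths

    metadata_txt = Path(__file__).parent / "metadata.txt"
    config = ConfigParser(allow_no_value=True)
    config.read(metadata_txt)
    metadata = dict(config["general"])
    # deps is expected to be a whitespace-separated list of requirements
    deps = metadata.get("pip_dependencies", "")
    if not deps:
        return
    python = QStandardPaths.findExecutable("python")
    # Pass an argument list instead of a shell string, so that paths containing
    # spaces and specifiers such as "pkg>=1.0" are not mangled by the shell.
    subprocess.run(
        [python, "-m", "pip", "install", "--user", *deps.split()],
        check=True,
    )

def classFactory(iface):
    try:
        from .plugin import PyPlugin
    except ImportError:
        # A dependency is probably missing: install it, then retry the import
        pip_dependencies()
        from .plugin import PyPlugin

    return PyPlugin(iface)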

That worked ... just note that we MUST use the --user site-packages so that it does not require elevated privileges (with a default-settings install of QGIS on Windows, the default behavior requires running the command as admin).

  2. We developed a setuptools extension that creates a metadata.txt for any PEP 621 Python package at build time. It was designed so we could produce QGIS plugins just like any regular Python package: the command python -m setuptools-qgis bdis_qgis . creates the metadata.txt using PEP 621 arguments plus some [tool.setuptools_qgis] extra arguments, leveraging setuptools capabilities (namely setuptools_scm...). This also generates the __init__.py with the aforementioned boilerplate code.
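
To give an idea of the kind of mapping involved, here is a minimal sketch that renders a metadata.txt from an already parsed PEP 621 [project] table. It is not the extension described above: metadata_txt_from_project is a hypothetical function, and mapping the PEP 621 dependencies onto a pip_dependencies key is an assumption based on the proposal in this thread.

```python
import io
from configparser import ConfigParser


def metadata_txt_from_project(project, qgis_options):
    """Render a metadata.txt from a PEP 621 [project] table and QGIS-specific extras."""
    config = ConfigParser()
    config.optionxform = str  # keep the case of keys such as qgisMinimumVersion
    first_author = (project.get("authors") or [{}])[0]
    config["general"] = {
        "name": project["name"],
        "version": project["version"],
        "description": project.get("description", ""),
        "author": first_author.get("name", ""),
        "email": first_author.get("email", ""),
        # Hypothetical mapping: PEP 621 dependencies become the proposed pip_dependencies
        "pip_dependencies": ",".join(project.get("dependencies", [])),
        **qgis_options,  # e.g. qgisMinimumVersion
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Joining with commas follows the syntax proposed in the QEP; a specifier that itself contains a comma (such as a version range) would need a different separator.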

  3. We enjoyed some mess using the --user site for our installations ... so we tweaked the code a bit to use QGIS python plugins directory as home for the dependencies:

def pip_dependencies(**kwargs):
    import os
    import subprocess
    import sys
    from configparser import ConfigParser
    from pathlib import Path
    from qgis.PyQt.QtCore import QStandardPaths
    from qgis.core import QgsApplication

    metadata_txt = Path(__file__).parent / "metadata.txt"
    config = ConfigParser(allow_no_value=True)
    config.read(metadata_txt)
    metadata = dict(config["general"])
    deps = metadata.get("pip_dependencies", "")
    if not deps:
        return

    # set ${PYTHONUSERBASE} to "%appdata%/QGIS/QGIS3/profiles/{default}/python"
    python = QStandardPaths.findExecutable("python")
    env = os.environ.copy()
    user_base = env.setdefault(
        "PYTHONUSERBASE", str(Path(QgsApplication.qgisSettingsDirPath()) / "python")
    )
    # get exact location of the usersite (for the current QGIS interpreter version)
    site_packages = (
        subprocess.run(
            [python, "-m", "site", "--user-site"],
            env=env,
            capture_output=True,
        )
        .stdout.decode()
        .strip()  # strip() rather than split(): the path may contain spaces
    )
    site_packages = Path(site_packages)
    # relative_to() raises ValueError if the user site is not under PYTHONUSERBASE
    site_packages.relative_to(user_base)
    # set the priority to this custom usersite
    site_packages.mkdir(parents=True, exist_ok=True)
    sys.path.insert(1, str(site_packages))

    # add a logfile for each plugin
    log_file = site_packages.parent / f"{metadata['name']}.log"
    with log_file.open("w") as stdout:
        # argument list (no shell), so specifiers like "pkg>=1.0" are passed intact
        subprocess.run(
            [python, "-m", "pip", "install", "--user", *deps.split()],
            env=env,
            stdout=stdout,
            stderr=subprocess.STDOUT,
            check=True,
        )

again, this works ... and offers a customization for site installation

  4. We faced the restrictions of single-line pip install command mentioned in introduction ... (mostly we needed to access our private python repository) so we switched to using -r requirements.txt ... if there was one:
def pip_dependencies(**kwargs):
 [...]
  # optional suffix, e.g. req_suffix="dev" selects requirements-dev.txt
  suffix = kwargs.get("req_suffix")
  requirements_txt = Path(__file__).parent / f"requirements{'-' + suffix if suffix else ''}.txt"
  if not requirements_txt.is_file():
    return
  metadata_txt = Path(__file__).parent / "metadata.txt"
  config = ConfigParser(allow_no_value=True)
  config.read(metadata_txt)
  metadata = dict(config["general"])
  [...]
    subprocess.run(
      f"{python} -m pip install --user -r {requirements_txt}",
      env=env,
  [...]

which seems the most comfortable solution at the moment ... but the boilerplate grows and we are investigating a better way to externalise this pip_dependencies() boilerplate ... QPip seems a promising tool to do that
