From 218e1af2abbacb88377912e95bcce7ac6841dd5e Mon Sep 17 00:00:00 2001 From: Anderson Bravalheri Date: Mon, 20 Nov 2023 10:39:38 +0000 Subject: [PATCH 1/3] Update guides on datafiles Attempts to solve common user doubts/problems and make it clear how to use the configurations. ``pyproject.toml`` tabs were moved as the 1st tab, following other pages like the Quickstart guide. --- docs/userguide/datafiles.rst | 194 ++++++++++++++++++++--------------- 1 file changed, 112 insertions(+), 82 deletions(-) diff --git a/docs/userguide/datafiles.rst b/docs/userguide/datafiles.rst index 9bd2efd863..f641605778 100644 --- a/docs/userguide/datafiles.rst +++ b/docs/userguide/datafiles.rst @@ -30,6 +30,19 @@ For example, if the package tree looks like this:: and you supply this configuration: +.. tab:: pyproject.toml + + .. code-block:: toml + + [tool.setuptools] + # ... + # By default, include-package-data is true in pyproject.toml, so you do + # NOT have to specify this line. + include-package-data = true + + [tool.setuptools.packages.find] + where = ["src"] + .. tab:: setup.cfg .. code-block:: ini @@ -56,19 +69,6 @@ and you supply this configuration: include_package_data=True ) -.. tab:: pyproject.toml - - .. code-block:: toml - - [tool.setuptools] - # ... - # By default, include-package-data is true in pyproject.toml, so you do - # NOT have to specify this line. - include-package-data = true - - [tool.setuptools.packages.find] - where = ["src"] - then all the ``.txt`` and ``.rst`` files will be automatically installed with your package, provided: @@ -84,6 +84,14 @@ your package, provided: (See the section below on :ref:`Adding Support for Revision Control Systems` for information on how to write such plugins.) +.. note:: + .. versionadded:: v61.0.0 + The default value for ``tool.setuptools.include-package-data`` is ``True`` + when projects are configured via ``pyproject.toml``. 
+ This behaviour differs from ``setup.cfg`` and ``setup.py`` + (where ``include_package_data=False`` by default), which was not changed + to ensure backwards compatibility with existing projects. + package_data ============ @@ -108,6 +116,16 @@ For example, if the package tree looks like this:: then you can use the following configuration to capture the ``.txt`` and ``.rst`` files as data files: +.. tab:: pyproject.toml + + .. code-block:: toml + + [tool.setuptools.packages.find] + where = ["src"] + + [tool.setuptools.package-data] + mypkg = ["*.txt", "*.rst"] + .. tab:: setup.cfg .. code-block:: ini @@ -138,16 +156,6 @@ data files: package_data={"mypkg": ["*.txt", "*.rst"]} ) -.. tab:: pyproject.toml - - .. code-block:: toml - - [tool.setuptools.packages.find] - where = ["src"] - - [tool.setuptools.package-data] - mypkg = ["*.txt", "*.rst"] - The ``package_data`` argument is a dictionary that maps from package names to lists of glob patterns. Note that the data files specified using the ``package_data`` option neither require to be included within a :ref:`MANIFEST.in ` @@ -158,9 +166,9 @@ file, nor require to be added by a revision control system plugin. the path separator, even if you are on Windows. Setuptools automatically converts slashes to appropriate platform-specific separators at build time. -.. note:: - Glob patterns do not automatically match dotfiles (directory or file names - starting with a dot (``.``)). To include such files, you must explicitly start +.. important:: + Glob patterns do not automatically match dotfiles, i.e., directory or file names + starting with a dot (``.``). To include such files, you must explicitly start the pattern with a dot, e.g. ``.*`` to match ``.gitignore``. If you have multiple top-level packages and a common pattern of data files for all these @@ -181,6 +189,17 @@ Here, both packages ``mypkg1`` and ``mypkg2`` share a common pattern of having ` data files. However, only ``mypkg1`` has ``.rst`` data files. 
In such a case, if you want to use the ``package_data`` option, the following configuration will work: +.. tab:: pyproject.toml + + .. code-block:: toml + + [tool.setuptools.packages.find] + where = ["src"] + + [tool.setuptools.package-data] + "*" = ["*.txt"] + mypkg1 = ["data1.rst"] + .. tab:: setup.cfg .. code-block:: ini @@ -211,28 +230,35 @@ use the ``package_data`` option, the following configuration will work: package_data={"": ["*.txt"], "mypkg1": ["data1.rst"]}, ) -.. tab:: pyproject.toml - - .. code-block:: toml - - [tool.setuptools.packages.find] - where = ["src"] - - [tool.setuptools.package-data] - "*" = ["*.txt"] - mypkg1 = ["data1.rst"] - Notice that if you list patterns in ``package_data`` under the empty string ``""`` in ``setup.py``, and the asterisk ``*`` in ``setup.cfg`` and ``pyproject.toml``, these patterns are used to find files in every package. For example, we use ``""`` or ``*`` to indicate that the ``.txt`` files from all packages should be captured as data files. +These placeholders are treated as a special case: ``setuptools`` **does not** +support glob patterns on package names for this configuration +(patterns are only supported on the file paths). Also note how we can continue to specify patterns for individual packages, i.e. we specify that ``data1.rst`` from ``mypkg1`` alone should be captured as well. .. note:: - When building an ``sdist``, the datafiles are also drawn from the - ``package_name.egg-info/SOURCES.txt`` file, so make sure that this is removed if - the ``setup.py`` ``package_data`` list is updated before calling ``setup.py``. + When building an ``sdist``, the data files are also drawn from the + ``package_name.egg-info/SOURCES.txt`` file, which works as a form of cache. + So make sure that this file is removed if ``package_data`` is updated, + before re-building the package. + +.. 
attention:: + In Python, any directory is considered a package + (even if it does not contain ``__init__.py``, + see *native namespace packages* in :doc:`PyPUG:guides/packaging-namespace-packages`). + Therefore, if you are not relying on :doc:`automatic discovery `, + you *SHOULD* ensure that **all** packages (including the ones that don't + contain any Python files) are included in the ``packages`` configuration + (see :doc:`/userguide/package_discovery` for more information). + + Moreover, it is advisable to use full package names with the dot + notation instead of a nested path, to avoid error-prone configurations. + Please check :ref:`section subdirectories ` below. + exclude_package_data ==================== @@ -250,6 +276,16 @@ Supposing you want to prevent these files from being included in the installation (they are not relevant to Python or the package), then you could use the ``exclude_package_data`` option: +.. tab:: pyproject.toml + + .. code-block:: toml + + [tool.setuptools.packages.find] + where = ["src"] + + [tool.setuptools.exclude-package-data] + mypkg = [".gitattributes"] + .. tab:: setup.cfg .. code-block:: ini @@ -281,16 +317,6 @@ use the ``exclude_package_data`` option: exclude_package_data={"mypkg": [".gitattributes"]}, ) -.. tab:: pyproject.toml - - .. code-block:: toml - - [tool.setuptools.packages.find] - where = ["src"] - - [tool.setuptools.exclude-package-data] - mypkg = [".gitattributes"] - The ``exclude_package_data`` option is a dictionary mapping package names to lists of wildcard patterns, just like the ``package_data`` option. And, just as with that option, you can use the empty string key ``""`` in ``setup.py`` and the @@ -300,6 +326,9 @@ Any files that match these patterns will be *excluded* from installation, even if they were listed in ``package_data`` or were included as a result of using ``include_package_data``. + +.. 
_subdir-data-files: + Subdirectory for Data Files =========================== @@ -324,6 +353,21 @@ In this case, the recommended approach is to treat ``data`` as a namespace packa (refer :pep:`420`). With ``package_data``, the configuration might look like this: +.. tab:: pyproject.toml + + .. code-block:: toml + + # Scanning for namespace packages in the ``src`` directory is true by + # default in pyproject.toml, so you do NOT need to include the + # `tool.setuptools.packages.find` if it looks like the following: + # [tool.setuptools.packages.find] + # namespaces = true + # where = ["src"] + + [tool.setuptools.package-data] + mypkg = ["*.txt"] + "mypkg.data" = ["*.rst"] + .. tab:: setup.cfg .. code-block:: ini @@ -358,28 +402,30 @@ the configuration might look like this: } ) +In other words, we allow Setuptools to scan for namespace packages in the ``src`` directory, +which enables the ``data`` directory to be identified, and then, we separately specify data +files for the root package ``mypkg``, and the namespace package ``data`` under the package +``mypkg``. + +With ``include_package_data`` the configuration is simpler: you simply need to enable +scanning of namespace packages in the ``src`` directory and the rest is handled by Setuptools. + .. tab:: pyproject.toml .. code-block:: toml + [tool.setuptools] + # ... + # By default, include-package-data is true in pyproject.toml, so you do + # NOT have to specify this line. + include-package-data = true + [tool.setuptools.packages.find] # scanning for namespace packages is true by default in pyproject.toml, so - # you do NOT need to include the following line. + # you need NOT include the following line. 
namespaces = true where = ["src"] - [tool.setuptools.package-data] - mypkg = ["*.txt"] - "mypkg.data" = ["*.rst"] - -In other words, we allow Setuptools to scan for namespace packages in the ``src`` directory, -which enables the ``data`` directory to be identified, and then, we separately specify data -files for the root package ``mypkg``, and the namespace package ``data`` under the package -``mypkg``. - -With ``include_package_data`` the configuration is simpler: you simply need to enable -scanning of namespace packages in the ``src`` directory and the rest is handled by Setuptools. - .. tab:: setup.cfg .. code-block:: ini @@ -405,22 +451,6 @@ scanning of namespace packages in the ``src`` directory and the rest is handled include_package_data=True, ) -.. tab:: pyproject.toml - - .. code-block:: toml - - [tool.setuptools] - # ... - # By default, include-package-data is true in pyproject.toml, so you do - # NOT have to specify this line. - include-package-data = true - - [tool.setuptools.packages.find] - # scanning for namespace packages is true by default in pyproject.toml, so - # you need NOT include the following line. - namespaces = true - where = ["src"] - Summary ======= @@ -444,11 +474,11 @@ In summary, the three options allow you to: .. note:: Due to the way the build process works, a data file that you include in your project and then stop including may be "orphaned" in your - project's build directories, requiring you to run ``setup.py clean --all`` to - fully remove them. This may also be important for your users and contributors + project's build directories, requiring you to manually delete them. + This may also be important for your users and contributors if they track intermediate revisions of your project using Subversion; be sure to let them know when you make changes that remove files from inclusion so they - can run ``setup.py clean --all``. + can also manually delete them. .. 
_Accessing Data Files at Runtime: From ad4c4a3cdeb453b72d431b58bcb2df9a118879ff Mon Sep 17 00:00:00 2001 From: Anderson Bravalheri Date: Mon, 20 Nov 2023 10:58:43 +0000 Subject: [PATCH 2/3] Add note about using namespace packages for data files --- docs/userguide/datafiles.rst | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/docs/userguide/datafiles.rst b/docs/userguide/datafiles.rst index f641605778..2e37289d5f 100644 --- a/docs/userguide/datafiles.rst +++ b/docs/userguide/datafiles.rst @@ -548,6 +548,20 @@ See :doc:`importlib-resources:using` for detailed instructions. pre-existing file is found. +Data Files from Plugins and Extensions +====================================== + +You can resort to a :doc:`native/implicit namespace package +` (as a container for files) +if you want plugins and extensions to your package to contribute package data files. +This way, all files will be listed at runtime +when :doc:`using importlib.resources `. +Note that, although not strictly guaranteed, mainstream Python package managers, +like :pypi:`pip` and derived tools, will install files belonging to multiple distributions +that share the same namespace into the same directory in the file system. +This means that the overhead for :mod:`importlib.resources` will be minimal. 
+ + Non-Package Data Files ====================== From 7676a4365a49d6f7ab094666ed076f9237764eb8 Mon Sep 17 00:00:00 2001 From: Anderson Bravalheri Date: Mon, 20 Nov 2023 11:14:11 +0000 Subject: [PATCH 3/3] Add note about dynamic configs via attr and imports --- docs/userguide/declarative_config.rst | 22 +++++++++++++++++----- docs/userguide/pyproject_config.rst | 14 ++++++++++++++ 2 files changed, 31 insertions(+), 5 deletions(-) diff --git a/docs/userguide/declarative_config.rst b/docs/userguide/declarative_config.rst index fa104b10e3..047e08c6ef 100644 --- a/docs/userguide/declarative_config.rst +++ b/docs/userguide/declarative_config.rst @@ -155,13 +155,25 @@ Type names used below: Special directives: -* ``attr:`` - Value is read from a module attribute. ``attr:`` supports - callables and iterables; unsupported types are cast using ``str()``. +* ``attr:`` - Value is read from a module attribute. + + It is advisable to use literal values together with ``attr:`` (e.g. ``str``, + ``tuple[str]``, see :func:`ast.literal_eval`). This is recommended + in order to support the common case of a literal value assigned to a variable + in a module containing (directly or indirectly) third-party imports. - In order to support the common case of a literal value assigned to a variable - in a module containing (directly or indirectly) third-party imports, ``attr:`` first tries to read the value from the module by examining the - module's AST. If that fails, ``attr:`` falls back to importing the module. + module's AST. If that fails, ``attr:`` falls back to importing the module, + using the :func:`importlib.util.spec_from_file_location` recommended recipe + (see :ref:`example on Python docs ` + about "Importing a source file directly"). + Note, however, that importing the module is error-prone since your package is + not installed yet. You may also need to manually add the project directory to + ``sys.path`` (via ``setup.py``) in order to be able to do that. 
+ + When the module is imported, ``attr:`` supports + callables and iterables; unsupported types are cast using ``str()``. + * ``file:`` - Value is read from a list of files and then concatenated diff --git a/docs/userguide/pyproject_config.rst b/docs/userguide/pyproject_config.rst index 8f9d5f3745..2529bf1ba8 100644 --- a/docs/userguide/pyproject_config.rst +++ b/docs/userguide/pyproject_config.rst @@ -242,6 +242,20 @@ however please keep in mind that all non-comment lines must conform with :pep:`5 .. versionchanged:: 66.1.0 Newer versions of ``setuptools`` will automatically add these files to the ``sdist``. +It is advisable to use literal values together with ``attr`` (e.g. ``str``, +``tuple[str]``, see :func:`ast.literal_eval`). This is recommended +in order to support the common case of a literal value assigned to a variable +in a module containing (directly or indirectly) third-party imports. + +``attr`` first tries to read the value from the module by examining the +module's AST. If that fails, ``attr`` falls back to importing the module, +using the :func:`importlib.util.spec_from_file_location` recommended recipe +(see :ref:`example on Python docs ` +about "Importing a source file directly"). +Note, however, that importing the module is error-prone since your package is +not installed yet. You may also need to manually add the project directory to +``sys.path`` (via ``setup.py``) in order to be able to do that. + ---- .. rubric:: Notes
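The lookup order that the third patch documents for ``attr`` (inspect the module's AST first, import the file only if that fails) can be sketched in Python. This is only an illustration under stated assumptions, not the code that ``setuptools`` actually runs: ``read_attr`` is a hypothetical helper, the module name ``_attr_sketch`` is arbitrary, and only plain ``name = value`` assignments at module level are considered.

```python
import ast
import importlib.util
import tempfile
from pathlib import Path


def read_attr(module_path, attr_name):
    """Hypothetical helper: try a static (AST) read, then fall back to import."""
    source = Path(module_path).read_text(encoding="utf-8")
    # Step 1: look for a top-level ``attr_name = <literal>`` without running code.
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if attr_name in targets:
            try:
                return ast.literal_eval(node.value)
            except (ValueError, TypeError):
                break  # the value is not a literal, so importing is required
    # Step 2: import the file directly (the "importing a source file" recipe).
    spec = importlib.util.spec_from_file_location("_attr_sketch", str(module_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, attr_name)


with tempfile.TemporaryDirectory() as tmp:
    # A literal value next to an import that cannot be resolved at build time.
    literal_mod = Path(tmp, "literal_mod.py")
    literal_mod.write_text(
        'import some_missing_dependency_xyz\n__version__ = "1.2.3"\n',
        encoding="utf-8",
    )
    # A computed value: the static read fails and the file must be executed.
    computed_mod = Path(tmp, "computed_mod.py")
    computed_mod.write_text(
        '__version__ = ".".join(["1", "2"])\n', encoding="utf-8"
    )
    literal_version = read_attr(literal_mod, "__version__")
    computed_version = read_attr(computed_mod, "__version__")
```

In this sketch the literal value is read even though the module imports a package that is not installed, which is the case the patch recommends designing for. The computed value forces the import fallback, and that fallback would fail if the module also depended on something unavailable at build time.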