Patch repodata to relax HTSlib dependency upper bounds #42895

Merged: 2 commits, Sep 12, 2023

Member

@jmarshall jmarshall commented Sep 7, 2023

Somewhat similarly to PR #40675 for libdeflate, this PR recommends patching htslib dependencies to extend version ranges instead of pinning htslib in bioconda. Pinning htslib causes deployment difficulties each time there is an upstream htslib/samtools/bcftools joint release; e.g. #42167 and #42168 have still not been merged six weeks after bcftools and samtools 1.18 were released, because the htslib pinning still has not been updated (cf bioconda/bioconda-utils#915).

The reason for pinning htslib is that it is a common dependency and if several tools using it are all going to be installed in the same environment they all need to be using (and hence built against) the same htslib version.

If we patch repodata to loosen these htslib dependencies, this need will go away. Then we can simplify updates by removing the pinning of htslib (which is now bioconda/bioconda-utils#917).

Upstream HTSlib is careful to maintain binary compatibility between soversion bumps. Note though that those bumps are not correlated with e.g. major version number bumps, as the HTSlib maintainers would use that to reflect a break in source compatibility not binary compatibility.

Hence building other bioconda packages that use htslib produces a tight bound on a single x.x htslib version, but because of this forward compatibility we can use repodata patching to widen the bounds on previously built packages. Doing so will mean bioconda no longer needs to pin HTSlib, as it will be less vital to ensure all dependent packages are built against the exact same version of htslib. In turn, this will simplify packaging htslib/samtools/bcftools updates.


Please read the guidelines for Bioconda recipes before opening a pull request (PR).

General instructions

  • If this PR adds or updates a recipe, use "Add" or "Update" appropriately as the first word in its title.
  • New recipes not directly relevant to the biological sciences need to be submitted to the conda-forge channel instead of Bioconda.
  • PRs require reviews prior to being merged. Once your PR is passing tests and ready to be merged, please issue the @BiocondaBot please add label command.
  • Please post questions on Gitter or ping @bioconda/core in a comment.

Instructions for avoiding API, ABI, and CLI breakage issues

Conda is able to record and lock (a.k.a. pin) dependency versions used at build time of other recipes.
This prevents later changes in the recipe from violating a downstream recipe's expectations regarding API, ABI, or CLI.
If not already present in the meta.yaml, make sure to specify run_exports (see here for the rationale and comprehensive explanation).
Add a run_exports section like this:

build:
  run_exports:
    - ...

with ... being one of:

Case | run_exports statement
semantic versioning | {{ pin_subpackage("myrecipe", max_pin="x") }}
semantic versioning (0.x.x) | {{ pin_subpackage("myrecipe", max_pin="x.x") }}
known breakage in minor versions | {{ pin_subpackage("myrecipe", max_pin="x.x") }} (in such a case, please add a note that briefly mentions your evidence for that)
known breakage in patch versions | {{ pin_subpackage("myrecipe", max_pin="x.x.x") }} (in such a case, please add a note that briefly mentions your evidence for that)
calendar versioning | {{ pin_subpackage("myrecipe", max_pin=None) }}

while replacing "myrecipe" with either name if a name|lower variable is defined in your recipe or with the lowercase name of the package in quotes.

Bot commands for PR management

Please use the following BiocondaBot commands:

Everyone has access to the following BiocondaBot commands, which can be given in a comment:

@BiocondaBot please update: Merge the master branch into a PR.
@BiocondaBot please add label: Add the please review & merge label.
@BiocondaBot please fetch artifacts: Post links to CI-built packages/containers. You can use this to test packages locally.

Note that the @BiocondaBot please merge command is now deprecated. Please just squash and merge instead.

Also, the bot watches for comments from non-members that include @bioconda/<team> and will automatically re-post them to notify the addressed <team>.

jmarshall added a commit to jmarshall/bioconda-utils that referenced this pull request Sep 7, 2023
By patching repodata htslib dependencies, we no longer need
to pin htslib. See bioconda/bioconda-recipes#42895.
@dpryan79
Contributor

@jmarshall This looks great and I completely agree that this should make things easier in the future. Thanks a ton!

@dpryan79 dpryan79 merged commit 6161bc4 into bioconda:master Sep 12, 2023
5 checks passed
dpryan79 pushed a commit to bioconda/bioconda-utils that referenced this pull request Sep 12, 2023
By patching repodata htslib dependencies, we no longer need to pin
htslib. See bioconda/bioconda-recipes#42895.
@jmarshall jmarshall deleted the repopatch-htslib branch September 13, 2023 22:31
@jdblischak
Member

> Hence building other bioconda packages that use htslib produces a tight bound on a single x.x htslib version, but we can use repodata patching to widen the bounds on previously built packages, due to this forward compatibility. Doing so will mean bioconda no longer needs to pin HTSlib, as it will be less vital to ensure all dependent packages are built against the exact same version of htslib. In turn, this will simplify packaging htslib/samtools/bcftools updates.

What about the run_exports in the htslib recipe itself? If the repodata patch is automatically loosening the htslib runtime requirement for all recipes, should we loosen (or simply remove) the current run_exports?

run_exports:
- {{ pin_subpackage('htslib', max_pin='x.x') }}
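
For illustration, here is a rough Python approximation of the dependency bound that a max_pin='x.x' export like the one above produces for packages built against htslib. The function name pin_bounds is hypothetical and this is a simplified sketch, not conda-build's actual implementation; it assumes purely numeric version components.

```python
def pin_bounds(version, max_pin):
    """Approximate the '>=lower,<upper' spec derived from a max_pin expression.

    Hypothetical helper for illustration only; assumes numeric components.
    """
    keep = len(max_pin.split("."))            # 'x.x' keeps two components
    upper = [int(p) for p in version.split(".")[:keep]]
    upper += [0] * (keep - len(upper))        # pad short versions with zeros
    upper[-1] += 1                            # bump the last kept component
    return f">={version},<{'.'.join(map(str, upper))}.0a0"


print(pin_bounds("1.17", "x.x"))  # >=1.17,<1.18.0a0
print(pin_bounds("1.17", "x"))    # >=1.17,<2.0a0
```

The 'x.x' result is the tight single-release bound that the repodata patch later widens.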

@jmarshall
Member Author

jmarshall commented Jan 4, 2024

Loosening the run_exports would require predicting the future. HTSlib has previously bumped its shared library soversion in versions 1.4 and 1.10, and there is no predicting in advance when it will do so again. As conda does not model soversion dependencies, it needs to build packages using overly conservative version dependencies for libraries instead. See the discussion (in a somewhat different context) in #24199 (comment) onwards (in particular, #24199 (comment)).

(If you think semver would miraculously solve this problem, reflect on the fourth paragraph in the initial comment above and the fact that semver does not distinguish between source code compatibility and ABI compatibility.)

@jdblischak
Member

> Loosening the run_exports would require predicting the future. HTSlib has previously bumped its shared library soversion in versions 1.4 and 1.10, and there is no predicting in advance when it will do so again.

@jmarshall thanks for the explanation! That makes sense

> As conda does not model soversion dependencies, it needs to build packages using overly conservative version dependencies for libraries instead

Yes, I recall that discussion on Twitter. Reading your other posts helped solidify my understanding


I think part of my confusion yesterday was not understanding the arguments to _pin_looser(). They aren't documented, but from reading the function definition, I think I understand now:

# future HTSlib versions are binary compatible until they bump their soversion
if has_dep(record, 'htslib'):
    # skip deps prior to 1.10, which was the first with soversion 3
    # TODO adjust replacement (exclusive) upper bound with each new compatible HTSlib
    _pin_looser(fn, record, 'htslib', min_lower_bound='1.10', upper_bound='1.20')

translates to:

  • if the htslib pin is <1.10, do nothing
  • if the htslib pin is >=1.10, then edit the upper bound to be <1.20

Thus a recipe built against htslib 1.10 would be edited to have a pin of htslib >=1.10,<1.20, a recipe built against htslib 1.15 would be edited to have a pin of htslib >=1.15,<1.20, etc. As far as I can tell, there is no way to reduce the lower pin. But I assume that is because the same SO version indicates forward compatibility and not backwards compatibility.
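
The translation above can be sketched as a small standalone function. This is a hypothetical illustration, not the real _pin_looser() from gen_patch_json.py: the names loosen_upper_bound, parse_version and version_lt are invented here, the dependency string format ("name >=lower,<upper") is assumed, and the rule that an upper bound is only ever raised, never lowered, is an assumption of this sketch.

```python
import re

PIN = re.compile(r"^>=(?P<lower>[^,]+),<(?P<upper>.+)$")


def parse_version(text):
    """Turn '1.16.0a0' into (1, 16, 0); any pre-release suffix is ignored."""
    parts = []
    for piece in text.split("."):
        m = re.match(r"\d+", piece)
        if m is None:
            break
        parts.append(int(m.group()))
    return tuple(parts)


def version_lt(a, b):
    """Compare two versions after zero-padding them to the same length."""
    va, vb = parse_version(a), parse_version(b)
    width = max(len(va), len(vb))
    va += (0,) * (width - len(va))
    vb += (0,) * (width - len(vb))
    return va < vb


def loosen_upper_bound(dep, name, min_lower_bound, upper_bound):
    """Return dep with its upper bound raised, or unchanged if not eligible."""
    parts = dep.split(" ")
    if len(parts) < 2 or parts[0] != name:
        return dep
    m = PIN.match(parts[1])
    if m is None:
        return dep
    lower, upper = m.group("lower"), m.group("upper")
    if version_lt(lower, min_lower_bound):
        return dep  # built against an older soversion: leave alone
    if not version_lt(upper, upper_bound):
        return dep  # assumption: never tighten an existing bound
    parts[1] = f">={lower},<{upper_bound}"
    return " ".join(parts)


print(loosen_upper_bound("htslib >=1.15,<1.16.0a0", "htslib", "1.10", "1.20"))
# htslib >=1.15,<1.20
print(loosen_upper_bound("htslib >=1.9,<1.10.0a0", "htslib", "1.10", "1.20"))
# htslib >=1.9,<1.10.0a0
```

Under the never-tighten assumption, a second call for a later soversion range would leave these records alone, and the first call would leave records built against the later range alone.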

Now let's say that hypothetically in the future htslib 1.20 introduces a new SO version, and then releases 1.21, 1.22, and 1.23 are compatible. My guess would be that we would add an additional _pin_looser() call to edit binaries built against htslib 1.20+, e.g.

# future HTSlib versions are binary compatible until they bump their soversion
if has_dep(record, 'htslib'):
    # skip deps prior to 1.10, which was the first with soversion 3
    # TODO adjust replacement (exclusive) upper bound with each new compatible HTSlib
    _pin_looser(fn, record, 'htslib', min_lower_bound='1.10', upper_bound='1.20')
    # htslib 1.20 bumped the soversion
    _pin_looser(fn, record, 'htslib', min_lower_bound='1.20', upper_bound='1.24')

Do I understand the plan correctly?

@jmarshall
Member Author

Bioconda-repodata-patches's scripts are fairly niche and undocumented. You have translated the existing code correctly.

A program built against htslib-1.15 will be compatible until htslib bumps its soversion (unless they make a mistake!), but as it may well use functions newly introduced in htslib-1.15 it won't in general be compatible with earlier versions such as 1.14. Thus it's mostly not useful to reduce the lower pin, but if it were useful for something, such functionality could be added to gen_patch_json.py.

Adding another _pin_looser(…) call like that is indeed the plan for after HTSlib bumps its soversion again (along with moving the TODO comment down, as only the latest _pin_looser ever needs adjusting). I could have added _pin_looser(…, '1.0', '1.4'); _pin_looser(…, '1.4', '1.10') initially to do this for those older versions too, but those were already ancient history at the time so best left alone IMHO.
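
The compatibility rule described in this comment (forward compatible within one soversion, not backward compatible) can be sketched as follows. The table and function names are hypothetical. The soversion bumps at 1.4 and 1.10 and the value 3 for 1.10 come from this thread; the values 1 and 2 for earlier ranges are inferred from that, and the table knows nothing about future bumps.

```python
import bisect

# First HTSlib version of each soversion range (values 1 and 2 are inferred).
SOVERSION_STARTS = [((1, 0), 1), ((1, 4), 2), ((1, 10), 3)]


def parse(version):
    """Turn '1.15' into (1, 15); assumes purely numeric components."""
    return tuple(int(p) for p in version.split("."))


def soversion(version):
    """Look up the soversion range that a given HTSlib version falls into."""
    starts = [start for start, _ in SOVERSION_STARTS]
    i = bisect.bisect_right(starts, parse(version)) - 1
    if i < 0:
        raise ValueError(f"version {version} predates the table")
    return SOVERSION_STARTS[i][1]


def is_compatible(built_against, installed):
    """True if a program built against one version can run with another."""
    same_abi = soversion(built_against) == soversion(installed)
    not_older = parse(installed) >= parse(built_against)
    return same_abi and not_older


print(is_compatible("1.15", "1.18"))  # True: newer, same soversion
print(is_compatible("1.15", "1.14"))  # False: may lack functions added in 1.15
print(is_compatible("1.9", "1.10"))   # False: soversion bumped at 1.10
```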
