Outroduction of rdflib-jsonld #1405

Closed
ajnelson-nist opened this issue Sep 8, 2021 · 19 comments

Comments

@ajnelson-nist
Contributor

Recently, code bases that import the rdflib-jsonld package started to break because of a setuptools change related to 2to3 (the use_2to3 option). Unfortunately, this has had a negative effect on Continuous Integration workflows for packages that (still) include rdflib-jsonld as a dependency. One issue was filed with good background on the matter, here (h/t to @Panaetius).

To those who were not interacting with the GitHub repository, it was not clear that the rdflib-jsonld package had been archived. There was no deprecation notice. Unfortunately, we now have a situation where all CI involving rdflib-jsonld in a package or the package's dependencies fails.

I would like to request the rdflib-jsonld package get one more release posted to PyPI, reducing the package to having two effects:
(1) Having a dependency on rdflib>=6.0.0, so any user of JSON-LD features will still have access to JSON-LD features.
(2) Causing a DeprecationWarning to be raised on import---or better yet, on pip installing. I don't know how to do the latter, but I know the former can be done with the warnings module. E.g., the __init__.py file currently contains:

"""
"""
__version__ = "0.6.0-dev"

It could instead contain:

"""
"""
__version__ = "0.6.0"
import warnings
warnings.warn("The rdflib-jsonld package has been integrated into rdflib as of rdflib 6.0.0.  Please remove rdflib-jsonld from your project's dependencies.", DeprecationWarning)

or other appropriate text.
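One caveat with the import-time approach: by default, Python ignores a DeprecationWarning unless it is attributed to code running in `__main__`, so a warning raised from a package's `__init__.py` may go unseen unless the consumer enables it (for example with `python -W default`). The sketch below shows how the warning can be made visible and inspected. `emit_tombstone_warning` and `collect_warnings` are hypothetical names used only for illustration; the function stands in for the module-level code of `__init__.py` and is not part of rdflib-jsonld.

```python
import warnings

MESSAGE = (
    "The rdflib-jsonld package has been integrated into rdflib as of "
    "rdflib 6.0.0.  Please remove rdflib-jsonld from your project's "
    "dependencies."
)


def emit_tombstone_warning():
    # Hypothetical stand-in for the module-level code of __init__.py.
    # stacklevel=2 attributes the warning to the caller rather than
    # to this function.
    warnings.warn(MESSAGE, DeprecationWarning, stacklevel=2)


def collect_warnings():
    # Record every warning instead of printing it, regardless of the
    # interpreter's default filters.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        emit_tombstone_warning()
    return caught


caught = collect_warnings()
print(caught[0].category.__name__, "-", caught[0].message)
```

A CI job that wants to fail on the deprecated dependency could instead run Python with `-W error::DeprecationWarning`, which turns the warning into an exception.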

This would allow for a more graceful transition for the package maintainers and their downstream consumers that just needed to support JSON-LD reading.

I could see some discussion around whether it would also be appropriate to delete all other functionality from the rdflib-jsonld package. I'd argue, yes it would be appropriate, because anyone using deeper module features should be importing those features from where they're being maintained. For better or worse, those deeper module features are unavailable now because of the use_2to3 issue. Last, the use_2to3 issue is accidentally a good argument for reducing this package to a stub: risk of future deprecated features would be removed by removing other functionality and reducing setup.py.

To recap, I request rdflib-jsonld be unarchived just long enough for one further release on PyPI. That release should include a dependency on rdflib>=6.0.0. Its main __init__.py should raise a DeprecationWarning on import. If possible, a DeprecationWarning should be raised during pip install. All other functionality (other .py files, and complexity in setup.py) should be removed.
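A minimal sketch of what the reduced packaging metadata could look like, assuming a setuptools-based `setup.py`. The dictionary name, the placeholder version, and the description text are assumptions for illustration and are not taken from the rdflib-jsonld repository.

```python
# Hypothetical keyword arguments for a stub ("tombstone") setup.py.
TOMBSTONE_SETUP_KWARGS = {
    "name": "rdflib-jsonld",
    # Placeholder only; the real release number is the maintainers' call.
    "version": "0.0.0",
    "description": "Deprecated: JSON-LD support is part of rdflib >= 6.0.0",
    # Only the package whose __init__.py emits the DeprecationWarning.
    "packages": ["rdflib_jsonld"],
    # Installing the stub pulls in an rdflib with JSON-LD built in.
    "install_requires": ["rdflib>=6.0.0"],
    # Note the absence of use_2to3, which current setuptools rejects.
}

# In an actual setup.py this would be followed by:
#     from setuptools import setup
#     setup(**TOMBSTONE_SETUP_KWARGS)
print(sorted(TOMBSTONE_SETUP_KWARGS))
```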

@ajnelson-nist
Contributor Author

Tagging @nicholascar , as the one who posted the archiving notice. Would this be feasible?

@Panaetius
Contributor

Thank you for opening this issue 🙏

> All other functionality (other .py files, and complexity in setup.py) should be removed.

I think most code currently using rdflib_jsonld looks something like

    from rdflib_jsonld.parser import JsonLDParser
    from rdflib_jsonld.serializer import JsonLDSerializer
    from rdflib.plugin import register, Serializer, Parser
    register('json-ld', Parser, 'rdflib_jsonld.parser', 'JsonLDParser')
    register('json-ld', Serializer, 'rdflib_jsonld.serializer', 'JsonLDSerializer')

i.e. registering the Parser and/or Serializer from rdflib_jsonld.

To make a transition package such as you proposed functional, I think it'd at least need to import/expose rdflib.plugins.parsers.jsonld.JsonLDParser and rdflib.plugins.serializers.jsonld.JsonLDSerializer from RDFLib into rdflib_jsonld. Though I wonder if this could cause issues with registering the same plugin twice when the downstream code calls register. Or is the intention here that this new transitionary version of rdflib_jsonld should break downstream code?
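For downstream code that registers the plugins itself, one possible approach is to choose the module paths from the installed rdflib version. This is a rough sketch: `jsonld_plugin_targets` is a hypothetical helper, the module paths are the ones named in this comment, and it does not settle the open question of whether registering the same plugin name twice causes problems.

```python
def jsonld_plugin_targets(rdflib_version):
    """Return (module_path, class_name) pairs for the JSON-LD plugins.

    Assumes the major version is the first dot-separated component of
    the version string, e.g. "5.0.0" or "6.0.1".
    """
    major = int(rdflib_version.split(".")[0])
    if major >= 6:
        # JSON-LD support lives inside rdflib itself.
        return {
            "parser": ("rdflib.plugins.parsers.jsonld", "JsonLDParser"),
            "serializer": ("rdflib.plugins.serializers.jsonld", "JsonLDSerializer"),
        }
    # Older rdflib relies on the separate rdflib-jsonld package.
    return {
        "parser": ("rdflib_jsonld.parser", "JsonLDParser"),
        "serializer": ("rdflib_jsonld.serializer", "JsonLDSerializer"),
    }


print(jsonld_plugin_targets("6.0.0")["parser"])

# With rdflib installed, the result could feed the register() calls above:
#     import rdflib
#     from rdflib.plugin import register, Parser, Serializer
#     targets = jsonld_plugin_targets(rdflib.__version__)
#     register("json-ld", Parser, *targets["parser"])
#     register("json-ld", Serializer, *targets["serializer"])
```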

Another thing to consider is that setuptools supports all versions of python3 and RDFLib now is python >= 3.7. So for anyone using Python 3.6 (which I think is still ~30-40% last I checked) or lower, this solution wouldn't work. They might be able to pin setuptools and keep working with the old version, though that's not always the case, as setuptools often is an external dependency or in cases as outlined below. For our project it wouldn't matter, I just wonder if there could be a solution that keeps old code with properly pinned dependencies for rdflib_jsonld working.

As a bit of an aside on how we're currently affected, in our case it's a bit more tragic than CI failing. We provide interactive environments to users to do analysis in, and as part of that we have Docker images with our application installed using pipx. pipx has its own shared virtual environment that it installs software into and that environment contains setuptools etc. pipx will update the version of these dependencies automatically every 30 days or so, irrespective of the pipx version. So pinning pipx doesn't help. Anyone using pipx to install software will end up with the newest setuptools being used in the next <30 days, which will lead to this issue.
We currently have a workaround which amounts to running ~/.local/pipx/shared/bin/pip install setuptools==57.5.0 to force rollback pipx's setuptools version before doing a pipx command (this might still fail since pipx could reinstall the newest version again as part of any command...).

@nicholascar
Member

Un-archiving is easy - I can just click the button for that one, so I'll do that now.

> I would like to request the rdflib-jsonld package get one more release posted to PyPI,

I think we can manage that in the next few days.

> Having a dependency on rdflib>=6.0.0, so any user of JSON-LD features will still have access to JSON-LD features.

Do you perhaps mean "a dependency on rdflib<=6.0.0", i.e. versions of rdflib less than 6.0.0?

> Causing a DeprecationWarning to be raised on import

Yes I can do this.

@ajnelson-nist
Contributor Author

Re: @Panaetius
On the parser/serializer matter - I'm not sure what the right answer there is. As of rdflib 6.0.0, that code would be completely vestigial, wouldn't it? And, FWIW, I hadn't had to do those register() calls before. With rdflib-jsonld installed in my (virtual) environment, the json-ld parser/serializer modes were always available.
(My own aside - I filed this PR to make some more of the JSON-LD features transparently available.)

On Python 3.6 - I personally think it is fine to encourage people to transition away from 3.6, as 3.6 is reaching end of life in around 3.5 months.

I haven't heard of pipx before. It sounds like you have a significantly more complex technology stack, and would benefit more from rdflib-jsonld not breaking build processes, rather than attempting to pin deeper Python packages.

Re: @nicholascar
Thanks for all that! And, no, I do mean rdflib>=6.0.0, because then importing rdflib-jsonld is guaranteed to import rdflib with JSON-LD features integrated.
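The two spellings of the pin select very different sets of releases. As a rough illustration, the sketch below compares plain dotted version numbers as integer tuples. This is a simplification assumed for the example; pip's real resolution follows PEP 440, and `parse_version` and `satisfies` are hypothetical helpers.

```python
import operator

# Supported comparison operators for this simplified sketch.
OPERATORS = {">=": operator.ge, "<=": operator.le}


def parse_version(text):
    # Handles only plain dotted integers such as "6.0.1".
    return tuple(int(part) for part in text.split("."))


def satisfies(version, op, bound):
    return OPERATORS[op](parse_version(version), parse_version(bound))


for candidate in ("5.0.0", "6.0.0", "6.0.1"):
    print(
        candidate,
        "rdflib>=6.0.0:", satisfies(candidate, ">=", "6.0.0"),
        "rdflib<=6.0.0:", satisfies(candidate, "<=", "6.0.0"),
    )
```

Only the `>=` form guarantees that whatever rdflib gets installed has JSON-LD support built in, which is the property requested above.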

@Panaetius
Contributor

Panaetius commented Sep 9, 2021

> And, FWIW, I hadn't had to do those register() calls before

Well I don't know how others use it, I just know that's how we used it in the past and also what I found in OWL-RL https://github.com/RDFLib/OWL-RL/blob/master/owlrl/__init__.py#L197-L205 so I assumed that's how everyone uses it.

> On Python 3.6 - I personally think it is fine to encourage people to transition away from 3.6

I completely agree, we discontinued support for 3.6 in our project last week for the same reason. When we did that I did some analysis against pypi to check which versions of python were used to download some popular datascience packages in the last month and it was 10-38% of people still using 3.6, depending on the package. So still a sizable portion of users. So I just wanted to mention it so there is a discussion on that 😄

> would benefit more from rdflib-jsonld not breaking build processes, rather than attempting to pin deeper Python packages.

We try to support reproducibility in science, so it's important to us that old stuff works indefinitely. But it's really hard to actually offer that and this (pipx) was a new, interesting way that broke things that we hadn't encountered before. We're working on a solution in our stack, just wanted to say there might be cases where pinning setuptools isn't possible. Mostly I'm trying to think of a solution that could keep old user projects on our platform working, but I think it's a bit beside the discussion here, sorry.

@rchateauneu
Contributor

Many projects cannot easily migrate to the next Python version. It might sometimes take years. Here are some statistics about use of various Python versions: Python version share over time

@nicholascar
Member

> Many projects cannot easily migrate to the next Python version.

OK, so I think it's sensible to keep this package working so people who can't upgrade can pin to rdflib 5.0.0 + rdflib-jsonld, just as some can pin to rdflib 5.0.0 for Python 2.7!

It's correct that you don't need to also manually register JSON-LD parsers/serializers if installing rdflib-jsonld but there are a couple of tricks there and I can't remember them all (of course I'm trying to forget them after merging in rdflib-jsonld into rdflib 6.0.0!).

Perhaps a tombstoning pypi for rdflib-jsonld really is sensible. As promised before, I'll try and look into this. Any help appreciated!

OWL-RL needs care too. I have planned for a couple of years now, to try and update it but haven't found a specific motivator. I'd like to see it:

  • using latest rdflib, including the packaged namespaces
  • using modern Python conventions
  • implementing EL & QL, as well as RL & RDFS

I also want to check RL status: there appear to be rules commented out. Is it actually doing completely what RL profile says it should?

@ajnelson-nist
Contributor Author

ajnelson-nist commented Sep 10, 2021

The OWL-RL matters could probably use separate tickets for discussion on its repository. It at least looks like your first point on it using the latest rdflib is already raised here.

For rdflib-jsonld, I'm eager for it to function again, because almost all of the CI in my community is broken due to either direct or indirect (upstream) dependencies on rdflib-jsonld. It seems (thanks again to @Panaetius ) that the minimal change needed is removing use_2to3 from setup.py, due to setuptools 2769. That would restore builds.
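The minimal change described here amounts to deleting one keyword argument from the `setup()` call. For maintainers who want to check whether a dependency is affected, a rough detector could look like the sketch below. `uses_2to3` is a hypothetical helper, and the two sample `setup.py` strings are illustrative rather than copied from rdflib-jsonld.

```python
import ast


def uses_2to3(setup_py_source):
    """Return True if any call in the source passes a use_2to3 keyword.

    This only sees keywords written literally in a call; it would miss
    options passed through a dictionary or set in setup.cfg.
    """
    tree = ast.parse(setup_py_source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            for keyword in node.keywords:
                if keyword.arg == "use_2to3":
                    return True
    return False


# Illustrative inputs: the same setup() call before and after the fix.
BEFORE = '''
from setuptools import setup
setup(name="example", use_2to3=True)
'''

AFTER = '''
from setuptools import setup
setup(name="example")
'''

print(uses_2to3(BEFORE), uses_2to3(AFTER))
```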

I think it would be reasonable to include the deprecation warning on importing the top module. That would notify users that features have migrated, and so should their code.

It might be a bit more discussion to do the other things I'd suggested, as I hadn't appreciated the preservation angles of Python 3.6 as fully.

So, there might be two "Tombstone" releases. I'll file the PRs to start the first.

ajnelson-nist added a commit to ajnelson-nist/rdflib-jsonld that referenced this issue Sep 10, 2021
The resolution of setuptools 2769 made any package using `use_2to3` to
fail its build.  This patch removes the flag, in support of outroducing
rdflib-jsonld.

References:
* pypa/setuptools#2769
* RDFLib/rdflib#1405

Reported-by: Ralf Grubenmann <[email protected]>
Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to ajnelson-nist/rdflib-jsonld that referenced this issue Sep 10, 2021
The resolution of setuptools 2769 made any package using `use_2to3` to
fail its build.  This patch removes the flag, in support of outroducing
rdflib-jsonld.

The test suite is showing other follow-on patches will be necessary to
fix matters 2to3 had been quietly fixing along the way.

References:
* pypa/setuptools#2769
* RDFLib/rdflib#1405

Reported-by: Ralf Grubenmann <[email protected]>

ajnelson-nist added a commit to ajnelson-nist/rdflib-jsonld that referenced this issue Sep 10, 2021
The resolution of setuptools 2769 made any package using `use_2to3` to
fail its build.  This patch removes the flag, in support of outroducing
rdflib-jsonld.

The test suite is showing other follow-on patches will be necessary to
fix matters 2to3 had been quietly fixing along the way.  However, this
first patch does restore a working call to `pip install .` with
up-to-date setuptools.

References:
* pypa/setuptools#2769
* RDFLib/rdflib#1405

Reported-by: Ralf Grubenmann <[email protected]>

ajnelson-nist added a commit to ajnelson-nist/rdflib-jsonld that referenced this issue Sep 10, 2021
This patch establishes the baseline state of unit tests passing.

Unit test results: 5 tests report a FAIL status, all related to JSON-LD
compaction. I suggest that these tests are out of scope of the mission
of restoring builds functioning, related to removing the use_2to3 flag.

References:
* RDFLib/rdflib#1405

Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to ajnelson-nist/rdflib-jsonld that referenced this issue Sep 10, 2021
The resolution of setuptools 2769 made any package using `use_2to3` to
fail its build.  This patch removes the flag, in support of outroducing
rdflib-jsonld.

The test suite is showing other follow-on patches will be necessary to
fix matters 2to3 had been quietly fixing along the way.  However, this
first patch does restore a working call to `pip install .` with
up-to-date setuptools.

Unit test results: This causes only the same five tests as were
previously failing to fail.

setuptools versions tested:
* 41.2.0
* 58.0.4

References:
* pypa/setuptools#2769
* RDFLib/rdflib#1405

Reported-by: Ralf Grubenmann <[email protected]>
Signed-off-by: Alex Nelson <[email protected]>

ajnelson-nist added a commit to ajnelson-nist/rdflib-jsonld that referenced this issue Sep 10, 2021
The message provided in this patch is a first draft and invited for
revision by a project maintainer.

The context manager is to contain the effect of
`warnings.simplefilter()`, prevent an import of `rdflib-jsonld` from
introducing unexpected behaviors from other applications using the
`warnings` module.

References:
* RDFLib/rdflib#1405
* https://docs.python.org/3/library/warnings.html#temporarily-suppressing-warnings

Signed-off-by: Alex Nelson <[email protected]>
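
The pattern this commit message describes can be sketched as follows. This is not the code from the patch, which is not shown in this thread; `warn_deprecated` and the message text are assumptions for illustration. The point is that `warnings.catch_warnings()` restores the process-wide filter list on exit, so forcing the warning to display does not change how other code's warnings are handled.

```python
import warnings


def warn_deprecated(message):
    # Hypothetical helper: force this one DeprecationWarning to be shown,
    # then restore whatever filters the application had configured.
    with warnings.catch_warnings():
        warnings.simplefilter("always", DeprecationWarning)
        warnings.warn(message, DeprecationWarning, stacklevel=2)


# Snapshot the filters to confirm they are untouched afterwards.
filters_before = list(warnings.filters)
warn_deprecated("rdflib-jsonld is deprecated; JSON-LD support is in rdflib 6.0.0.")
filters_after = list(warnings.filters)

print(filters_before == filters_after)
```

Calling `warnings.simplefilter()` at module level without the context manager would instead leave the new filter in place for the rest of the process.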
@ajnelson-nist
Contributor Author

@nicholascar , PR 105 and PR 106 are ready for your review. Together, they meet the goals of restoring CI, preserving rdflib cohabitation for both versions 5.0.0 and 6.0.0, and adding an import-time DeprecationWarning. Might a release be made with those PRs merged?

Feedback or patches are welcome on how to raise a DeprecationWarning at the time of package installation. I don't have experience doing that with setuptools.

@nicholascar
Member

@ajnelson-nist I've merged those PRs but, I remember now, I don't actually have management permissions for rdflib-jsonld on PyPI so I will ask @niklasl who does to either make a release, 0.6.0, or to give me permissions to do so.

@nicholascar
Member

Actually I do have permissions, a 0.6.0 release is there now: https://pypi.org/project/rdflib-jsonld/

Please let me know if there are any more issues and we can always make a 0.6.1...

@ajnelson-nist
Contributor Author

@nicholascar , thank you for posting the release. CI works again in my community.

I realized I may have accidentally expanded the dependency scope in PR 105. I removed the pin this package had on rdflib>=5.0.0. I'll revert that in one more PR.

I think my other suggestions - ultimately pinning to rdflib>=6.0.0, and deleting all functionality except the DeprecationWarning, are worth a truly-last tombstone release if you think that would be appropriate. I also think it's fine to stop the releases after fixing my unpinning. I leave that to your judgement.

@ajnelson-nist
Contributor Author

PR 107 filed.

@nicholascar , it is fine with me if you would like to close this Issue.

@nicholascar
Member

> ultimately pinning to rdflib>=6.0.0, and deleting all functionality except the DeprecationWarning

Sure, I think this would be fine. Perhaps give it a couple of months? Or would you prefer me to make this next release straight away? I've already merged in your PR and pushed a new release to PyPI to include it: https://pypi.org/project/rdflib-jsonld/0.6.1/

@jpmccu
Contributor

jpmccu commented Sep 15, 2021

To be clear here, starting with current (pre-release) versions of rdflib>=6.0.0, json-ld will be handled with RDFlib alone?

@ajnelson-nist
Contributor Author

I don't have an authoritative reply for Jamie, but my understanding is yes, that has been so since July.

Re: @nicholascar
My own opinion is it should be done immediately(*) so nobody mistakes the repository as still being independently maintained. I think others will have a better opinion of whether and when to create such a release; others may be more impacted than I would be at present.

(*): Apparently, one piece of functionality was lost for at least one user, which has been restored since rdflib 6.0.0 but has not been released. So, maybe the pin should be rdflib>=6.0.1 (or whatever version it will be). See Issue 1412.

@nicholascar
Member

OK, I have expected to have to make an rdflib 6.0.1 release to fix a bunch of things that were broken in the major 6.0.0 release and we have a bunch of updates ready to go, so I'll make that release next (day or two) and then we can pin rdflib>=6.0.1 after that.

@nicholascar
Member

OK, rdflib 6.0.1 is out as of last night. Trying to wrap up rdflib-jsonld now.

@nicholascar
Member

nicholascar commented Sep 18, 2021

rdflib-jsonld 0.6.2 with all functionality removed is now released: https://pypi.org/project/rdflib-jsonld/0.6.2/

Please let me know if there are further problems by opening a new issue.

nikhil pushed a commit to mskcc/rdflib-jsonld that referenced this issue Jan 5, 2024