-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit numpy\h5py max versions due to tensorflow 2.3.1 max supported versions #990
Merged
Merged
Changes from 4 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
64afb43
Limit max numpy version due to an issue on windows
carlogrisetti 0fa0365
Reflect TF 2.3.1 maximum versions
carlogrisetti e279ddf
Incorporated comment
w4nderlust d1b215e
Added platform check for hdf5py
w4nderlust b15dcc9
Exclude neuropod from windows requirements
carlogrisetti b48ef69
Revert platform constraint for h5py and numpy
carlogrisetti c630eca
Fix requirements typo
carlogrisetti File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the platform check needed for
h5py
? Is this related to h5py/h5py#1732?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually added it because of the numpy dependency, but maybe it's overzealous of me (pip should resolve the h5py version that better suits the windows numpy constraint). Wasn't aware of that issue. I believe that shouldn't be an issue as we are saving preprocessed data and not model weights in h5, but the bug behind the reason why weights saving is broken may pop up somewhere in Ludwig too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually those two constraints should be not platform specific: tensorflow 2.3.1 requires thaose constraints independently of platform (and I don't know why pip doesn't know that from the beginning and only tells that in the end)
I found out about this with a
pip install ludwig --upgrade --upgrade-strategy eager
Might be a bug / missing feature on pip, which should correctly map the support/requirements map of each package, but it upgrades both numpy and h5py beyond what's supported by tensorflow 2.3.1, giving a warning in the end.
I thought the quickest solution was to limit the version in requirements, but if you have a cleaner solution, it's of course way better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The platform constraint should be set only for neuropod, since it's a package that's not available under windows, and makes a
pip install ludwig[full]
fail, also ludwig[serve] fails for the same issueThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's weird though, because without those constraints it looks like on unix and mac machines there is no issue, so i guess pip does it's job at figuring out the correct versions for those platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is clear. In that case you shouldn't need to edit anything (apart from neuropod), since dependencies are correctly computed during install (will have to double-check that, but I think they are ok), and you can let pip figure out the ceiling version.
The numpy bug on windows that initially throw me into this is past the maximum supported version in tensorflow (due to the eager update), so if we're not taking eager update into account we can just forget this PR and only fix the neuropod on windows requirement.
Let me check for a correct
pip install ludwig
behavior on requirements under windows, just to be sureThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok nevermind,
pip install ludwig
installs the most up to date version of numpy, regardless of TF requirements, and results in this:tensorflow 2.3.1 requires h5py<2.11.0,>=2.10.0, but you'll have h5py 3.1.0 which is incompatible.
tensorflow 2.3.1 requires numpy<1.19.0,>=1.16.0, but you'll have numpy 1.19.4 which is incompatible.
At this point, your proposal on constraining the version only on windows only works if this issue is only present on windows. If linux\other correctly expand the dependencies and figure out that one of the dependent packages (tensorflow) has a more strict requirement than the base package (ludwig), it's ok. If this issue is also present on linux\others, I stand with my current code edit, limiting the uppermost version directly in ludwig's own requirements.txt
Another way would be to remove the dependency on numpy and h5py alltogether, and rely on the chained dependency of tensorflow, making it choose the most appropriate version for itself.
For example, TF 2.3.1's numpy must be >=1.16.0, whilist Ludwig's requirement is >=1.15.0 (and we could potentially have the same issue in the lower end of the spectrum, having pip thinking it's all ok because numpy's version is 1.15.0, while tensorflow 2.3.1 requirement would be >= 1.16.0).
I know it's a little bit tangled of an explanation, but I hope it's clear :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested on Mac:
With additional suggestion:
I tried following the suggestion of using
--use-feature=2020-resolver
and I got:Which is indeed the behavior one would expect from the dependency resolution. I can test on linux too, and on windows, but I believe it's not a platform issue.
Now te question is what do we want to do about this? My take is currently the following: for the time being I would be ok with putting the same constraints of TF 2.3.1 on both numpy and h5py, although this is not very sustainable moving forward, as we would have to keep track of them every single minor release of TF. So what I would do is to monitor when pip 20.3 is released and remove the constraints as soon as that happens. Should have been release in October, so I believeit won't be long.
What do you think @carlogrisetti and @tgaddair ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good for someone that keeps its system maintained... The user that had an open issue recently on #1000 (heh, what a milestone), was using pip 8.x for example, and although I hope it will be a sporadic aituation, many current and slightly previous pip versions will be around for a lot of time.
The question is: do we want to ease the life of older pips, or reply to anyone having this issue to "enlarge your pip"?
Why did you dismiss the "remove numpy/h5py requirement and just let TF choose its own versions" option I proposed? (It's an actual genuine question) Do we have some narrower requirements for Ludwig itself? (I don't think so)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, people using older pips may keep on posting issues. On the other hand the reson for avoiding this is maintainability. But you are probably right, we can just follow whatever restrictions tensorflow has and work with that, but that would make being compatible with multiple version of tensorflow like we do now (2.2 2.3 and nightly if you look at the tests) more difficult.