Explore how other (meta-)packages handle dependencies #1777

Closed
antonymilne opened this issue Aug 10, 2022 · 3 comments
Comments

@antonymilne
Contributor

Copying relevant parts from #1758...

[@AntonyMilneQB] Let’s investigate how other libraries handle similar situations. For example, I believe the idea for kedro-datasets might have been inspired by how django packages different components (?). @deepyaman mentioned jupyter’s metapackage approach. Again, maybe what we are doing is the best approach, but I would like to feel more confident about this. Just as we missed the possibility of namespace packages in the first place, maybe we’re missing something big here.

[@deepyaman]
Agree that it would make sense to see/learn from the experience of more projects here. If somebody can find/share how Django does this, that would be great, because I haven't found it yet. :) While I think the metapackage approach sounds clean in theory, I wonder if it's overcomplicating things, if Kedro-Framework is essentially required, and Kedro-Datasets is the only additional package. Also, which (if any) of these approaches expect the underlying packages to be independent, and which support packages depending on each other (possibly again going back to the question of avoiding circular dependencies)?

Maybe there is a whole different way of handling the kedro vs. kedro-datasets split which would resolve the question of dependencies, what a user should pip install, how to handle the namespace, etc. e.g. @deepyaman suggested a kedro metapackage in which kedro-framework and kedro-datasets are both namespaced packages underneath that.

We don’t need to commit to implementing the kedro-framework split now if we don’t want to, but I think it would be good to get a feeling for whether this is a route we might want to go down in future, because it influences our current decision on how to handle kedro-datasets. For example, it might convince us that pip install kedro[pandas.CSVDataSet] is good or bad.
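
For illustration only, here is a minimal sketch of what the suggested kedro metapackage could declare. Everything in it is an assumption: kedro-framework is only a proposed name in this thread, the version is a placeholder, and `would_install` is a hypothetical helper rather than anything in Kedro or setuptools.

```python
# Hypothetical metadata for a "kedro" metapackage (sketch, not Kedro's actual setup).
# A metapackage ships no code of its own; it only pins the distributions
# that together provide the `kedro` namespace.
METAPACKAGE = {
    "name": "kedro",
    "version": "0.0.0",  # placeholder version, not a real release
    "packages": [],  # no importable code lives in the metapackage itself
    "install_requires": [
        "kedro-framework",  # proposed name from this thread, assumed here
        "kedro-datasets",
    ],
}


def would_install(meta):
    """Hypothetical helper: list the distributions the metapackage pulls in."""
    return sorted(meta["install_requires"])


print(would_install(METAPACKAGE))
```

Under this layout, each underlying distribution would contribute its own subpackage beneath a shared `kedro` namespace, which with implicit namespace packages means neither of them ships a `kedro/__init__.py`.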

@antonymilne changed the title from "Explore how other packages handle dependencies" to "Explore how other (meta-)packages handle dependencies" on Aug 10, 2022
@deepyaman
Member

Since we're not experts on this subject, maybe it's worth posting a detailed issue to https://github.com/pypa/packaging.python.org/issues, because they may be aware of best practices/examples or at least have an opinion.

@deepyaman
Member

I just thought of another example: dask has dask.distributed, dask.array, etc., and you can pip install dask[complete].
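
As a rough sketch of the extras pattern mentioned here, a single package can expose optional dependency groups plus an aggregate "complete" extra. The group names and requirement lists below are illustrative assumptions, not dask's actual configuration, and `build_extras` is a hypothetical helper.

```python
def build_extras(groups):
    """Hypothetical helper: add a 'complete' extra that unions every group."""
    # Deduplicate and sort each group's requirements for stable output.
    extras = {name: sorted(set(reqs)) for name, reqs in groups.items()}
    # 'complete' installs everything any individual extra would install.
    extras["complete"] = sorted({req for reqs in groups.values() for req in reqs})
    return extras


# Illustrative groups only; the real lists would come from the project.
EXTRAS = build_extras(
    {
        "array": ["numpy"],
        "dataframe": ["numpy", "pandas"],
    }
)

print(EXTRAS["complete"])
```

A mapping like this could be passed to setuptools as `extras_require`, so that `pip install <package>[complete]` pulls in every optional dependency while a plain install stays minimal.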

@merelcht
Member

merelcht commented Sep 15, 2022

We will not pursue this approach further for now, as restructuring Kedro into a metapackage seems like overkill. Further details and discussion can be found in #1758.


4 participants