Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the documentations for kedro-datasets work on kedro #2006

Merged
merged 26 commits into from
Nov 29, 2022
Merged

Conversation

SajidAlamQB
Copy link
Contributor

@SajidAlamQB SajidAlamQB commented Nov 8, 2022

Signed-off-by: SajidAlamQB [email protected]

NOTE: Kedro datasets are moving from kedro.extras.datasets to a separate kedro-datasets package in
kedro-plugins repository. Any changes to the dataset implementations
should be done by opening a pull request in that repository.

Description

Since kedro.extras.datasets is moving to a separate kedro-datasets package, how will our docs work for kedro-datasets and kedro?

This PR will look to implement the solution outlined in the comment on issue: #1651 (comment).

kedro-datasets and kedro docs should remain together on RTD as they do now. The only difference is that the API docs for kedro-datsets will appear under kedro.datasets rather than kedro.extras.datasets. All the document-building takes place on the kedro repo.

To achieve this we need the following flow when docs are built on kedro:

  1. Copy the kedro-datasets files into a datasets folder in kedro (maybe clone just the relevant branch of the repo? Or pip install --no-deps -t into the right directory is maybe easier?)
  2. make build-docs on the kedro repo should then work as it does now to generate API docs for both kedro core and kedro datasets
  3. the kedro-datasets files should not remain on the kedro repo - they're just temporarily copied across for docs generation

Development notes

Notes on how to copy kedro-datasets to kedro:

To copy kedro-datasets main use this command:

pip install --no-deps -t TARGET_DIRECTORY git+https://github.com/kedro-org/kedro-plugins.git#subdirectory=kedro-datasets

To copy kedro-datasets latest release use this command:

pip install --no-deps -t TARGET_DIRECTORY kedro-datasets

After the technical design session, we decided to simplify the design solution by allowing stable and latest in RTD to be the same as the latest version of kedro-datasets. In the near future, we will move away from ReadTheDocs and sphinx, so we are happy to leave this as a stop-gap solution.

  • Created a bash script that copies kedro-datasets latest release to kedro.datasets in kedro. The script is used by build-docs.sh and .readthedocs.yml so that both the CI and ReadTheDocs will build the correct documentation.
  • .readthedocs.yml has hooks/build customisation that allows us to copy over the kedro-datasets before docs are built.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

Signed-off-by: SajidAlamQB <[email protected]>
@SajidAlamQB SajidAlamQB self-assigned this Nov 8, 2022
@SajidAlamQB SajidAlamQB marked this pull request as ready for review November 15, 2022 12:33
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work @SajidAlamQB ! ⭐ The decision made in tech design really simplified the solution. Really well done on investigating all options and implementing it 👍

Copy link
Contributor

@stichbury stichbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I couldn't build the docs on my machine from your branch because of various toolchain issues associated with tools like pyproj on my machine (which is a bit odd, as I've been building OK to date, but could be unrelated).

So I didn't verify the build myself locally (and the CI checks aren't returning success yet), but it looks sound to me from the perspective of what you've implemented. Nice work!

@SajidAlamQB SajidAlamQB merged commit e57a4b9 into main Nov 29, 2022
@SajidAlamQB SajidAlamQB deleted the test-dataset branch November 29, 2022 15:02
@merelcht merelcht linked an issue Nov 30, 2022 that may be closed by this pull request
@stichbury
Copy link
Contributor

@astrojuanlu just tagging you in on this as a record of the decision-making about datasets API docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to make the documentations works for kedro-datasets?
3 participants