Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destroy default project path variable #1955

Merged
merged 39 commits into from
Oct 25, 2022

Conversation

jmholzer
Copy link
Contributor

@jmholzer jmholzer commented Oct 18, 2022

Description

Resolves #1840

A summary of the issue is as follows:

The global variable default_project_path of kedro.ipython allows the line magic %reload_kedro to remember the project path. Global variables such as these decrease code quality and maintainability for a number of reasons and should be removed. The project path can either be specified by the user, or an attempt can be made to find it based on the current working directory if it is not specified (using _find_kedro_project). The default_project_path variable is 100% coupled to the 'in-memory' feature of %kedro_reload and is decoupled from the entire remaining code base. To remove it we could do one of two things:

  1. Decouple the 'in-memory' feature from this variable by changing its implementation.
  2. Remove the 'in-memory' feature from %kedro_reload

@AntonyMilneQB gave 3 suggestions of ways to accomplish option 1. @MerelTheisenQB and @AntonyMilneQB asked for user research as a prerequisite to option 2.

Option 1 does not change anything from the user's perspective, it only changes the implementation of the feature in the code base.

This PR implements option 1, because:

  • The implementation change is relatively straightforward and simplifies maintenance going forward.
  • Removing the 'in-memory' feature would have an impact on some users.

Development notes

This PR:

  1. Refactors kedro/ipython/__init__.py:
    1.1. By delegating the responsibility for resolving a project path to a dedicated function (single responsibility / separation of concerns).
    1.2. By sorting the functions in the file in order of importance.
  2. Removes the default_project_path global variable while retaining the 'in-memory' feature of %reload_kedro, which was re-implemented using the 'local scope' functionality provided by IPython.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes

jmholzer and others added 15 commits October 18, 2022 16:29
Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>
…dro-org/kedro into destroy_default_project_path_variable

Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>
…dro-org/kedro into destroy_default_project_path_variable

Signed-off-by: Jannic Holzer <[email protected]>
@jmholzer jmholzer marked this pull request as ready for review October 19, 2022 18:20
@jmholzer jmholzer requested a review from idanov as a code owner October 19, 2022 18:20
env: str = None,
extra_params: Dict[str, Any] = None,
local_ns: Optional[Dict[str, Any]] = None,
) -> None: # pragma: no cover
"""Function that underlies the %reload_kedro Line magic. This should not be imported
or run directly but instead invoked through %reload_kedro."""

# If a path is provided, set it as default for subsequent calls
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not always true now - if a user is using reload_kedro instead of %reload_kedro. It only remembers the path when it's using the line magic

Copy link
Contributor Author

@jmholzer jmholzer Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, removed this comment in 72509ec

expected = Path("/test").resolve()
assert result == expected

def no_path_no_local_namespace_specified(self, mocker):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be a test?

Suggested change
def no_path_no_local_namespace_specified(self, mocker):
def test_no_path_no_local_namespace_specified(self, mocker):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks! Fixed in 211706c

Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks quite good already as a 1st draft! I left some minor comments there, I think it's nice to keep the memory behaviour.

Removing the global should make the ipython testing easier. Have you manually test the test_ipython.py ? As we skipped these tests it would be great to manually test it.

Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: Jannic Holzer <[email protected]>
@jmholzer
Copy link
Contributor Author

jmholzer commented Oct 20, 2022

Thanks for the review @noklam! 😃 I addressed the comments you left individually.

Removing the global should make the ipython testing easier. Have you manually test the test_ipython.py ? As we skipped these tests it would be great to manually test it.

Yes these tests work, do you know why these are being skipped? Is it due to performance?

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this looks good 👍 I've got some questions specifically about the local_ns stuff

path: str = None,
env: str = None,
extra_params: Dict[str, Any] = None,
local_ns: Optional[Dict[str, Any]] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is local_ns? I'd be in favour of having longer argument names if that means it's more descriptive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local_ns is the argument that IPython passes to magics that are decorated with needs_local_scope, which is the feature that allows this implementation to work. local_ns is a dict containing all declarations in the scope of the IPython session and is persisted between calls to magics / extensions being reloaded. Unfortunately, the name must be local_ns in the function signature of the magic for IPython to use it correctly. This feature of IPython seems poorly documented, or I would share some docs.

That being said, we should use a more descriptive name elsewhere -- local_namespace. I will also change the project_path key of local_ns to _project_path, to avoid users accidentally overwriting this.

These changes are in 838aaf1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining @jmholzer, this makes sense now.

kedro/ipython/__init__.py Show resolved Hide resolved
kedro/ipython/__init__.py Outdated Show resolved Hide resolved
@jmholzer
Copy link
Contributor Author

@jmholzer I think we need to manually test out the IPython stuff in GitPod -> try out the kedro ipython command and check if the variables are instantiated properly.

Hey Nok, the PR is in a state where it's ready for some manual testing. I have manually tested the changes and am happy that the functionality is the same as before for the cases I could think of. It would be great if you could test some of the cases you think of too.

Signed-off-by: Jannic Holzer <[email protected]>
Copy link
Contributor

@SajidAlamQB SajidAlamQB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, just had some lingering questions about local_ns and how it exactly works.

path: str = None,
env: str = None,
extra_params: Dict[str, Any] = None,
local_namespace: Optional[Dict[str, Any]] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still work, I thought it had to be local_ns or is that just for functions with the decorator, needs_local_scope.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understood it, it must be local_ns in the magic_reload_kedro method above, but it's okay if it's a different name in the other methods.

Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done! ⭐
Don't forget to add this change to the release notes 📝

Copy link
Contributor

@SajidAlamQB SajidAlamQB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work on this! Tested it manually all looks good. 🌟

Signed-off-by: Jannic Holzer <[email protected]>
…dro-org/kedro into destroy_default_project_path_variable

Signed-off-by: Jannic Holzer <[email protected]>
@noklam noklam self-requested a review October 24, 2022 10:24
Copy link
Contributor

@noklam noklam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I manually tested the kedro ipython and it works fine.

  • kedro ipython
  • `%reload_kedro # with/without arguments
  • Check if the context, pipeline, session object are created correctly
  • session.run works

Comment on lines 260 to 261
class Context:
_project_path = Path("/test").resolve()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this class really necessary? I feel like it overcomplicates the test setup a bit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context for this change is that I moved to storing the project path in context (an instance of KedroContext), which is a variable in the IPython local namespace. This is the way that Viz does it and it is much safer than storing it in the 'top-level' of the IPython namespace, where a user can accidentally overwrite it.

This means that I had to change the testing implementation, I am using a dummy Context class rather than importing KedroContext and instantiating this, since its __init__ takes a lot of arguments that we do not need here. Do you think I should do the latter instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining! I'm happy to leave it like this, but perhaps add a small comment in the tests with this Context class explaining why it's there. I didn't immediately understand it was the KedroContext and not some completely different context used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks for taking another look :). Made those changes and will merge

@jmholzer jmholzer merged commit f49234c into main Oct 25, 2022
@jmholzer jmholzer deleted the destroy_default_project_path_variable branch October 25, 2022 18:59
nickolasrm pushed a commit to ProjetaAi/kedro that referenced this pull request Oct 26, 2022
* Refactor function order

Signed-off-by: Jannic Holzer <[email protected]>

* Add implementation for _resolve_and_update_project_path

Signed-off-by: Jannic Holzer <[email protected]>

* Remove fixture for default_project_path

Signed-off-by: Jannic Holzer <[email protected]>

* Move check for unresolvable project name to extension load

Signed-off-by: Jannic Holzer <[email protected]>

* Remove 'raises TypeError' from docstring

Signed-off-by: Jannic Holzer <[email protected]>

* Add test coverage for _resolve_project_path

Signed-off-by: Jannic Holzer <[email protected]>

* Fix linting :/

Signed-off-by: Jannic Holzer <[email protected]>

* Add cross-platform support for tests

Signed-off-by: Jannic Holzer <[email protected]>

* More cross-platform path changes

Signed-off-by: Jannic Holzer <[email protected]>

* Linting :s

Signed-off-by: Jannic Holzer <[email protected]>

* Remove comment regarding path update

Signed-off-by: Jannic Holzer <[email protected]>

* Add missing test_* prefix

Signed-off-by: Jannic Holzer <[email protected]>

* Fix path for test on Windows

Signed-off-by: Jannic Holzer <[email protected]>

* Add mocker patch to fix failing test

Signed-off-by: Jannic Holzer <[email protected]>

* Lint with black

Signed-off-by: Jannic Holzer <[email protected]>

* Move unresovable path check to load_ipython_extension

Signed-off-by: Jannic Holzer <[email protected]>

* Modify call to magic('reload_kedro') to simple call to reload_kedro

Signed-off-by: Jannic Holzer <[email protected]>

* Add tests for the case with an unresolvable project

Signed-off-by: Jannic Holzer <[email protected]>

* Lint

Signed-off-by: Jannic Holzer <[email protected]>

* Modify local_ns to local_namespace in _resolve_project_path

Signed-off-by: Jannic Holzer <[email protected]>

* Add test for updating _project_path in local_ns

Signed-off-by: Jannic Holzer <[email protected]>

* Remove unused pylint ignores

Signed-off-by: Jannic Holzer <[email protected]>

* Add description of change

Signed-off-by: Jannic Holzer <[email protected]>

* Store _project_path in 'context' if available

Signed-off-by: Jannic Holzer <[email protected]>

* Disable pylint warnings

Signed-off-by: Jannic Holzer <[email protected]>

* Rename Context() to DummyKedroContext() to make intention clearer

Signed-off-by: Jannic Holzer <[email protected]>

* Add comment explaining purpose of DummyKedroContext class

Signed-off-by: Jannic Holzer <[email protected]>

* Rename DummyKedroContext to MockKedroContext

Signed-off-by: Jannic Holzer <[email protected]>

Signed-off-by: Jannic Holzer <[email protected]>
Signed-off-by: nickolasrm <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Destroy the default_project_path variable in IPython extension
4 participants