Alias Kedro IPython extension to kedro.ipython #1837

Merged
merged 21 commits into from
Sep 9, 2022
Changes from 5 commits
2 changes: 1 addition & 1 deletion MANIFEST.in
@@ -3,5 +3,5 @@ include LICENSE.md
include requirements.txt
include test_requirements.txt
include kedro/framework/project/default_logging.yml
include kedro/extras/extensions/*.png
include kedro/ipython/*.png
recursive-include templates *
4 changes: 3 additions & 1 deletion RELEASE.md
@@ -12,6 +12,7 @@
# Upcoming Release 0.18.3

## Major features and improvements
* The Kedro IPython extension should now be loaded with `%load_ext kedro.ipython`.
* The line magic `%reload_kedro` now accepts keyword arguments, e.g. `%reload_kedro --env=prod`.
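
As a rough illustration of how keyword arguments such as `--env=prod` can be parsed from the string a line magic receives, the sketch below uses the standard library's `argparse`, which IPython's `magic_arguments` helpers build on. The names `build_reload_parser` and `split_params` are hypothetical, and the `key:value` parameter format is an assumption made for the example; this is not the extension's actual implementation.

```python
import argparse
import shlex


def split_params(value):
    """Hypothetical, simplified parser for 'key:value,key2:value2' strings."""
    params = {}
    for item in value.split(","):
        key, _, val = item.partition(":")
        params[key.strip()] = val.strip()
    return params


def build_reload_parser():
    """Build a parser that mirrors the arguments of a reload-style line magic."""
    parser = argparse.ArgumentParser(prog="reload_kedro", add_help=False)
    # Optional positional argument: the project root.
    parser.add_argument("path", nargs="?", default=None)
    parser.add_argument("-e", "--env", default=None)
    parser.add_argument("--params", type=split_params, default=None)
    return parser


# A line magic receives everything after its name as a single string.
line = "--env=prod --params model:xgb,seed:42"
args = build_reload_parser().parse_args(shlex.split(line))
print(args.path, args.env, args.params)
```

Values parsed this way stay as strings; any type conversion would be a separate step.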

## Bug fixes and other changes
@@ -29,6 +30,7 @@
* Relaxed `delta-spark` upper bound to allow compatibility with Spark 3.1.x and 3.2.x.

## Upcoming deprecations for Kedro 0.19.0
* The Kedro IPython extension will no longer be available as `%load_ext kedro.extras.extensions.ipython`; use `%load_ext kedro.ipython` instead.

# Release 0.18.2

@@ -129,7 +131,7 @@ main(
* Added `save_args` to `feather.FeatherDataSet`.

### Jupyter and IPython integration
* The [only recommended way to work with Kedro in Jupyter or IPython is now the Kedro IPython extension](https://kedro.readthedocs.io/en/0.18.0/tools_integration/ipython.html). Managed Jupyter instances should load this via `%load_ext kedro.extras.extensions.ipython` and use the line magic `%reload_kedro`.
* The [only recommended way to work with Kedro in Jupyter or IPython is now the Kedro IPython extension](https://kedro.readthedocs.io/en/0.18.0/tools_integration/ipython.html). Managed Jupyter instances should load this via `%load_ext kedro.ipython` and use the line magic `%reload_kedro`.
* `kedro ipython` launches an IPython session that preloads the Kedro IPython extension.
* `kedro jupyter notebook/lab` creates a custom Jupyter kernel that preloads the Kedro IPython extension and launches a notebook with that kernel selected. There is no longer a need to specify `--all-kernels` to show all available kernels.

4 changes: 2 additions & 2 deletions docs/source/deployment/databricks.md
@@ -230,7 +230,7 @@ Your complete notebook should look similar to this (the results are hidden):

### 9. Using the Kedro IPython Extension

You can interact with Kedro in Databricks through the Kedro [IPython extension](https://ipython.readthedocs.io/en/stable/config/extensions/index.html), `kedro.extras.extensions.ipython`.
You can interact with Kedro in Databricks through the Kedro [IPython extension](https://ipython.readthedocs.io/en/stable/config/extensions/index.html), `kedro.ipython`.

The Kedro IPython extension launches a [Kedro session](../kedro_project_setup/session.md) and makes available the useful Kedro variables `catalog`, `context`, `pipelines` and `session`. It also provides the `%reload_kedro` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) that reloads these variables (for example, if you need to update `catalog` following changes to your Data Catalog).

@@ -239,7 +239,7 @@ The IPython extension can be used in a Databricks notebook in a similar way to h
If you encounter a `ContextualVersionConflictError`, it is likely caused by Databricks using an old version of `pip`. Hence there's one additional step you need to do in the Databricks notebook to make use of the IPython extension. After you load the IPython extension using the below command:

```ipython
In [1]: %load_ext kedro.extras.extensions.ipython
In [1]: %load_ext kedro.ipython
```

You must explicitly upgrade your `pip` version by doing the below:
10 changes: 5 additions & 5 deletions docs/source/tools_integration/ipython.md
@@ -13,12 +13,12 @@ There are reasons why you may want to use a Notebook, although in general, the p

## Kedro IPython extension

The recommended way to interact with Kedro in IPython and Jupyter is through the Kedro [IPython extension](https://ipython.readthedocs.io/en/stable/config/extensions/index.html), `kedro.extras.extensions.ipython`. An [IPython extension](https://ipython.readthedocs.io/en/stable/config/extensions/) is an importable Python module that has a couple of special functions to load and unload it.
The recommended way to interact with Kedro in IPython and Jupyter is through the Kedro [IPython extension](https://ipython.readthedocs.io/en/stable/config/extensions/index.html), `kedro.ipython`. An [IPython extension](https://ipython.readthedocs.io/en/stable/config/extensions/) is an importable Python module that has a couple of special functions to load and unload it.

The Kedro IPython extension launches a [Kedro session](../kedro_project_setup/session.md) and makes available the useful Kedro variables `catalog`, `context`, `pipelines` and `session`. It also provides the `%reload_kedro` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) that reloads these variables (for example, if you need to update `catalog` following changes to your Data Catalog).
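
To make the mechanism concrete, here is a minimal sketch of the shape an IPython extension takes: a module-level `load_ipython_extension` function that registers a line magic and pushes variables into the user namespace. `FakeShell` is a hypothetical stand-in for IPython's interactive shell so the example runs without IPython installed, and the `reload_demo` magic and the pushed values are placeholders rather than what Kedro actually loads.

```python
class FakeShell:
    """Hypothetical stand-in for IPython's interactive shell (illustration only)."""

    def __init__(self):
        self.user_ns = {}
        self.magics = {}

    def push(self, variables):
        # IPython's real shell also exposes push() to inject variables.
        self.user_ns.update(variables)

    def register_magic_function(self, func, magic_kind="line", magic_name=None):
        self.magics[magic_name or func.__name__] = func


def load_ipython_extension(ipython):
    """Entry point that `%load_ext <module>` looks for in an extension module."""

    def reload_demo(line):
        # Placeholder values; a real extension would build these from a project.
        ipython.push({"catalog": {"example": "dataset"}, "project_path": line or "."})

    ipython.register_magic_function(reload_demo, magic_kind="line", magic_name="reload_demo")
    # Populate the namespace once at load time, then again on each magic call.
    reload_demo("")


shell = FakeShell()
load_ipython_extension(shell)
print(sorted(shell.user_ns))
```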

The simplest way to make use of the Kedro IPython extension is through the following commands:
* `kedro ipython`. This launches an IPython shell with the extension already loaded and is equivalent to the command `ipython --ext kedro.extras.extensions.ipython`.
* `kedro ipython`. This launches an IPython shell with the extension already loaded and is equivalent to the command `ipython --ext kedro.ipython`.
* `kedro jupyter notebook`. This creates a custom Jupyter kernel that automatically loads the extension and launches Jupyter Notebook with this kernel selected.
* `kedro jupyter lab`. This creates a custom Jupyter kernel that automatically loads the extension and launches JupyterLab with this kernel selected.

@@ -32,7 +32,7 @@ If these variables are not available then Kedro has not been able to load your p

If the above commands are not available to you (e.g. you work in a managed Jupyter service such as a Databricks Notebook) then equivalent behaviour can be achieved by explicitly loading the Kedro IPython extension with the `%load_ext` line magic:
```ipython
In [1]: %load_ext kedro.extras.extensions.ipython
In [1]: %load_ext kedro.ipython
```

If your IPython or Jupyter instance was launched from outside your Kedro project then you will need to run a second line magic to set the project path so that Kedro can load the `catalog`, `context`, `pipelines` and `session` variables:
@@ -42,7 +42,7 @@ In [2]: %reload_kedro <project_root>
The Kedro IPython extension remembers the project path so that subsequent calls to `%reload_kedro` do not need to specify it:

```ipython
In [1]: %load_ext kedro.extras.extensions.ipython
In [1]: %load_ext kedro.ipython
In [2]: %reload_kedro <project_root>
In [3]: %reload_kedro
```
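
The "remembered path" behaviour can be sketched with a module-level default that is overwritten whenever a path is supplied. This is a simplified illustration of the pattern, assuming a hypothetical helper named `resolve_project_path` and an example directory; it does not bootstrap a Kedro project.

```python
from pathlib import Path

# Module-level default, used when no path is given.
_default_project_path = Path.cwd()


def resolve_project_path(path=None):
    """Return the project path, remembering the last one that was supplied."""
    global _default_project_path
    if path:
        _default_project_path = Path(path).expanduser().resolve()
    return _default_project_path


first = resolve_project_path("/tmp/demo_project")  # sets the default
second = resolve_project_path()  # reuses the remembered path
print(first == second)
```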
@@ -204,7 +204,7 @@ If you are not able to execute `kedro jupyter notebook` or `kedro jupyter lab` t

### Manage Jupyter kernels

Behind the scenes, the `kedro jupyter notebook` and `kedro jupyter lab` commands create a Jupyter kernel named `kedro_<package_name>`. This kernel is identical to the [default IPython kernel](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) but with a slightly customised [kernel specification](https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs) that automatically loads `kedro.extras.extensions.ipython` when the kernel is started. The kernel specification is installed at a user level rather than system-wide.
Behind the scenes, the `kedro jupyter notebook` and `kedro jupyter lab` commands create a Jupyter kernel named `kedro_<package_name>`. This kernel is identical to the [default IPython kernel](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) but with a slightly customised [kernel specification](https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs) that automatically loads `kedro.ipython` when the kernel is started. The kernel specification is installed at a user level rather than system-wide.
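
As an illustration of what "slightly customised" means here, the sketch below appends `--ext kedro.ipython` to the `argv` of a kernel specification. The starting specification is an assumed example of a default IPython kernel spec, and `add_extension_to_kernel_spec` is a hypothetical helper, not a Kedro function.

```python
import json


def add_extension_to_kernel_spec(kernel_spec, extension="kedro.ipython"):
    """Return a copy of the spec whose argv also loads the given extension."""
    updated = dict(kernel_spec)
    updated["argv"] = list(kernel_spec["argv"]) + ["--ext", extension]
    return updated


# Assumed example of a default IPython kernel specification.
default_spec = {
    "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "display_name": "Python 3",
    "language": "python",
}

kedro_spec = add_extension_to_kernel_spec(default_spec)
kedro_spec["display_name"] = "Kedro (spaceflights)"
print(json.dumps(kedro_spec, indent=1))
```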

```{note}
If a Jupyter kernel with the name `kedro_<package_name>` already exists then it is replaced. This ensures that the kernel always points to the correct Python executable. For example, if you change conda environment in a Kedro project then you should re-run `kedro jupyter notebook/lab` to replace the kernel specification with one that points to the new environment.
153 changes: 16 additions & 137 deletions kedro/extras/extensions/ipython.py
@@ -1,139 +1,18 @@
# pylint: disable=import-outside-toplevel,global-statement,invalid-name,too-many-locals
"""
This script creates an IPython extension to load Kedro-related variables in
local scope.
This file and directory exist purely for backwards compatibility of the following:
%load_ext kedro.extras.extensions.ipython
from kedro.extras.extensions.ipython import reload_kedro

Any modifications to the IPython extension should now be made in kedro/ipython/.
The Kedro IPython extension should always be loaded as %load_ext kedro.ipython.
Line magics such as reload_kedro should always be called as line magics rather than
importing the underlying Python functions.
"""
import logging
import sys
from pathlib import Path
from typing import Any, Dict

from kedro.framework.cli.project import PARAMS_ARG_HELP
from kedro.framework.cli.utils import ENV_HELP, _split_params

logger = logging.getLogger(__name__)
default_project_path = Path.cwd()


def _remove_cached_modules(package_name):
    to_remove = [mod for mod in sys.modules if mod.startswith(package_name)]
    # `del` is used instead of `reload()` because: If the new version of a module does not
    # define a name that was defined by the old version, the old definition remains.
    for module in to_remove:
        del sys.modules[module]  # pragma: no cover


def _find_kedro_project(current_dir: Path):  # pragma: no cover
    from kedro.framework.startup import _is_project

    while current_dir != current_dir.parent:
        if _is_project(current_dir):
            return current_dir
        current_dir = current_dir.parent

    return None


def reload_kedro(
    path: str = None, env: str = None, extra_params: Dict[str, Any] = None
):
    """Line magic which reloads all Kedro default variables.
    Setting the path will also make it default for subsequent calls.
    """
    from IPython import get_ipython
    from IPython.core.magic import needs_local_scope, register_line_magic

    from kedro.framework.cli import load_entry_points
    from kedro.framework.project import LOGGING  # noqa # pylint:disable=unused-import
    from kedro.framework.project import configure_project, pipelines
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    # If a path is provided, set it as default for subsequent calls
    global default_project_path
    if path:
        default_project_path = Path(path).expanduser().resolve()
        logger.info("Updated path to Kedro project: %s", default_project_path)
    else:
        logger.info("No path argument was provided. Using: %s", default_project_path)

    metadata = bootstrap_project(default_project_path)
    _remove_cached_modules(metadata.package_name)
    configure_project(metadata.package_name)

    session = KedroSession.create(
        metadata.package_name, default_project_path, env=env, extra_params=extra_params
    )
    context = session.load_context()
    catalog = context.catalog

    get_ipython().push(
        variables={
            "context": context,
            "catalog": catalog,
            "session": session,
            "pipelines": pipelines,
        }
    )

    logger.info("Kedro project %s", str(metadata.project_name))
    logger.info(
        "Defined global variable 'context', 'session', 'catalog' and 'pipelines'"
    )

    for line_magic in load_entry_points("line_magic"):
        register_line_magic(needs_local_scope(line_magic))
        logger.info("Registered line magic '%s'", line_magic.__name__)  # type: ignore


def load_ipython_extension(ipython):
    """
    Main entry point when %load_ext is executed.
    IPython will look for this function specifically.
    See https://ipython.readthedocs.io/en/stable/config/extensions/index.html

    This function is called when users do `%load_ext kedro.extras.extensions.ipython`.
    When users run `kedro jupyter notebook` or `jupyter ipython`, this extension is
    loaded automatically.
    """
    from IPython.core.magic_arguments import argument, magic_arguments, parse_argstring

    @magic_arguments()
    @argument(
        "path",
        type=str,
        help=(
            "Path to the project root directory. If not given, use the previously set "
            "project root."
        ),
        nargs="?",
        default=None,
    )
    @argument("-e", "--env", type=str, default=None, help=ENV_HELP)
    @argument(
        "--params",
        type=lambda value: _split_params(None, None, value),
        default=None,
        help=PARAMS_ARG_HELP,
    )
    def magic_reload_kedro(line: str):
        """
        The `%reload_kedro` IPython line magic. See
        https://kedro.readthedocs.io/en/stable/tools_integration/ipython.html for more.
        """
        args = parse_argstring(magic_reload_kedro, line)
        reload_kedro(args.path, args.env, args.params)

    global default_project_path

    ipython.register_magic_function(magic_reload_kedro, magic_name="reload_kedro")
    default_project_path = _find_kedro_project(Path.cwd())

    if default_project_path is None:
        logger.warning(
            "Kedro extension was registered but couldn't find a Kedro project. "
            "Make sure you run '%reload_kedro <project_root>'."
        )
        return

    reload_kedro(default_project_path)
from ...ipython import load_ipython_extension, reload_kedro
import warnings

warnings.warn(
    "kedro.extras.extensions.ipython should be accessed only using the alias "
    "kedro.ipython. The unaliased name will be removed in Kedro 0.19.0.",
    DeprecationWarning,
)
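
For readers who want to see how a deprecation warning like this behaves, the following sketch emits a `DeprecationWarning` from a hypothetical function (`load_legacy_alias`) and captures it. One point worth checking in practice: Python's default warning filters generally hide `DeprecationWarning` unless it is triggered by code running in `__main__`, so a warning raised at import time inside a library module may not be shown to users unless they adjust their filters.

```python
import warnings


def load_legacy_alias():
    """Hypothetical stand-in for importing via the old module path."""
    warnings.warn(
        "The old import path is deprecated; use the new alias instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller
    )
    return "kedro.ipython"


# Record warnings explicitly so the example does not depend on default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    target = load_legacy_alias()

print(target, [str(w.message) for w in caught])
```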
10 changes: 5 additions & 5 deletions kedro/framework/cli/jupyter.py
@@ -117,7 +117,7 @@ def _create_kernel(kernel_name: str, display_name: str) -> None:
"-f",
"{connection_file}",
"--ext",
"kedro.extras.extensions.ipython"
"kedro.ipython"
],
"display_name": "Kedro (spaceflights)",
"language": "python",
@@ -151,14 +151,14 @@ def _create_kernel(kernel_name: str, display_name: str) -> None:

        kernel_json = Path(kernel_path) / "kernel.json"
        kernel_spec = json.loads(kernel_json.read_text(encoding="utf-8"))
        kernel_spec["argv"].extend(["--ext", "kedro.extras.extensions.ipython"])
        kernel_spec["argv"].extend(["--ext", "kedro.ipython"])
        # indent=1 is to match the default ipykernel style (see
        # ipykernel.write_kernel_spec).
        kernel_json.write_text(json.dumps(kernel_spec, indent=1), encoding="utf-8")

        kedro_extensions_dir = Path(__file__).parents[2] / "extras" / "extensions"
        shutil.copy(kedro_extensions_dir / "logo-32x32.png", kernel_path)
        shutil.copy(kedro_extensions_dir / "logo-64x64.png", kernel_path)
        kedro_ipython_dir = Path(__file__).parents[2] / "ipython"
        shutil.copy(kedro_ipython_dir / "logo-32x32.png", kernel_path)
        shutil.copy(kedro_ipython_dir / "logo-64x64.png", kernel_path)
    except Exception as exc:
        raise KedroCliError(
            f"Cannot setup kedro kernel for Jupyter.\nError: {exc}"
2 changes: 1 addition & 1 deletion kedro/framework/cli/project.py
@@ -124,7 +124,7 @@ def ipython(

    if env:
        os.environ["KEDRO_ENV"] = env
    call(["ipython", "--ext", "kedro.extras.extensions.ipython"] + list(args))
    call(["ipython", "--ext", "kedro.ipython"] + list(args))


@project_group.command()