Add load node line magic documentation #3619

Merged
merged 46 commits into from
Feb 19, 2024
Changes from 41 commits
Commits
177a6cf
Add first draft
lrcouto Jan 31, 2024
c283488
Remoe outdated kedro jupyter convert docs
AhdraMeraliQB Jan 31, 2024
922061b
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Jan 31, 2024
b7ad523
Suggestion: Review edits
AhdraMeraliQB Jan 31, 2024
131049d
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Jan 31, 2024
56c04e6
Update FAQs
Feb 1, 2024
3d73927
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 1, 2024
08f025f
Edit jupyter ipython debug section
lrcouto Feb 2, 2024
2cc5a04
Change link to section that does not exist anymore
lrcouto Feb 2, 2024
524e9a1
Change link to section that does not exist anymore
lrcouto Feb 2, 2024
0f51e39
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 2, 2024
6faaffb
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 2, 2024
9ba8382
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 5, 2024
170d7d0
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 5, 2024
d3f6d33
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 5, 2024
2a2d5ea
Change wording and formatting
lrcouto Feb 5, 2024
0139982
Lint
lrcouto Feb 5, 2024
71cca31
Update docs/source/notebooks_and_ipython/kedro_and_notebooks.md
lrcouto Feb 5, 2024
6edf692
Merge branch 'main' into add-doc-on-ipython-debug
AhdraMeraliQB Feb 6, 2024
70a767d
Merge branch 'main' into add-doc-on-ipython-debug
lrcouto Feb 6, 2024
58ab682
Update docs/source/notebooks_and_ipython/kedro_and_notebooks.md
lrcouto Feb 7, 2024
a81c537
Changes to the wording, remove unnecessary section
lrcouto Feb 7, 2024
333283a
Merge branch 'main' into add-doc-on-ipython-debug
AhdraMeraliQB Feb 8, 2024
5884d5e
Move docs on debugging with hooks to hooks section
Feb 8, 2024
bc6a57b
Add links to main debugging page
Feb 8, 2024
3b77445
Make notebook debugging an independent section
Feb 8, 2024
a347558
Merge main into docs/streamline-debugging-docs
Feb 9, 2024
de7b3f0
Update link in FAQs
Feb 9, 2024
909990f
Group line magics together
Feb 9, 2024
cb98fd3
Merge main into docs/add-load-node-line-magic-docs
Feb 13, 2024
a0bdec4
Add section structure
Feb 13, 2024
b7318af
Lint
Feb 13, 2024
81ee012
Move section about kedro ipython extension to the top
ankatiyar Feb 14, 2024
a4fb7ba
Vale suggestions
ankatiyar Feb 14, 2024
f932efe
Merge branch 'main' into docs/add-load-node-line-magic-docs
AhdraMeraliQB Feb 15, 2024
3f3f52e
Add to faq
ankatiyar Feb 15, 2024
dce4264
Add gif to demonstrate load_node
lrcouto Feb 16, 2024
c187a43
Fix link error
Feb 16, 2024
eb25b80
Rejig the page
Feb 16, 2024
394f902
Add recommended debug workflow with load node line magic
Feb 16, 2024
1ca4640
Merge branch 'main' into docs/add-load-node-line-magic-docs
AhdraMeraliQB Feb 16, 2024
4660492
Fix link after title change
Feb 16, 2024
60dce69
Add clarification on where to find node name when debugging
Feb 19, 2024
de60945
Fix reference
Feb 19, 2024
fc0403a
Use link to paremt section
Feb 19, 2024
60b88bb
Appease Vale
Feb 19, 2024
6 changes: 2 additions & 4 deletions docs/source/development/debugging.md
@@ -1,10 +1,8 @@
# Debugging

:::note

``` {note}
Our debugging documentation has moved. Please see our existing guides:

:::
```

* [Debugging a Kedro project within a notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook) for information on how to launch an interactive debugger in your notebook.
* [Debugging in VSCode](./set_up_vscode.md#debugging) for information on how to set up VSCode's built-in debugger.
3 changes: 2 additions & 1 deletion docs/source/faq/faq.md
@@ -12,10 +12,11 @@
* {doc}`Where can I find the documentation about Kedro-Viz<kedro-viz:kedro-viz_visualisation>`?
* {py:mod}`Where can I find the documentation for Kedro's datasets <kedro-datasets:kedro_datasets>`?

## Working with Jupyter
## Working with Notebooks

[vale] docs/source/faq/faq.md#L15: [Kedro.headings] 'Working with Notebooks' should use sentence-style capitalization. (WARNING)

* [How can I debug a Kedro project in a Jupyter notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook)?
* [How do I connect a Kedro project kernel to other Jupyter clients like JupyterLab](../notebooks_and_ipython/kedro_and_notebooks.md#ipython-jupyterlab-and-other-jupyter-clients)?
* [How can I use the Kedro IPython extension in a notebook where launching a new kernel is not an option](../notebooks_and_ipython/kedro_and_notebooks.md#exploring-the-project-with-the-kedroipython-extension)?

[vale] docs/source/faq/faq.md#L19: [Kedro.pronouns] Avoid first-person singular pronouns such as 'I'. (WARNING)

## Kedro project development

123 changes: 83 additions & 40 deletions docs/source/notebooks_and_ipython/kedro_and_notebooks.md
@@ -10,6 +10,8 @@

We will assume the example project is called `iris`, but you can call it whatever you choose.

## Loading the project with `kedro jupyter notebook`

Navigate to the project directory (`cd iris`) and issue the following command in the terminal to launch Jupyter:

```bash
@@ -32,35 +34,41 @@

### What does `kedro jupyter notebook` do?

The `kedro jupyter notebook` command launches a notebook with a kernel that is [slightly customised](https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs) but almost identical to the [default IPython kernel](https://ipython.readthedocs.io/en/stable/install/kernel_install.html).

This custom kernel automatically makes the following Kedro variables available:
The `kedro jupyter notebook` command launches a notebook with a customised kernel that has been extended to make the following project variables available:

* `catalog` (type `DataCatalog`): [Data Catalog](../data/data_catalog.md) instance that contains all defined datasets; this is a shortcut for `context.catalog`
* `context` (type `KedroContext`): Kedro project context that provides access to Kedro's library components
* `context` (type `KedroContext`): [Kedro project context](../api/kedro.framework.context.rst) that provides access to Kedro's library components
* `pipelines` (type `Dict[str, Pipeline]`): Pipelines defined in your [pipeline registry](../nodes_and_pipelines/run_a_pipeline.md#run-a-pipeline-by-name)
* `session` (type `KedroSession`): [Kedro session](../kedro_project_setup/session.md) that orchestrates a pipeline run

``` {note}
If the Kedro variables are not available within your Jupyter notebook, you could have a malformed configuration file or missing dependencies. The full error message is shown on the terminal used to launch `kedro jupyter notebook`.
```

## How to explore a Kedro project in a notebook
Here are some examples of how to work with the Kedro variables. To explore the full range of attributes and methods available, see the relevant [API documentation](/api/kedro) or use the [Python `dir` function](https://docs.python.org/3/library/functions.html#dir), for example `dir(catalog)`.

### `%run_viz` line magic
## Loading the project with the `kedro.ipython` extension

``` {note}
If you have not yet installed [Kedro-Viz](https://github.com/kedro-org/kedro-viz) for the project, run `pip install kedro-viz` in your terminal from within the project directory.
A quick way to explore the `catalog`, `context`, `pipelines`, and `session` variables in your project within an IPython-compatible environment, such as Databricks notebooks or Google Colab, is to use the `kedro.ipython` extension.

[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L50: [Kedro.words] Use '' instead of 'quick'. (WARNING)
[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L50: [Kedro.Spellings] Did you really mean 'Colab'? (WARNING)
This is tool-independent and useful in situations where launching a Jupyter interactive environment is not possible. You can use the [`%load_ext` line magic](https://ipython.readthedocs.io/en/stable/config/extensions/index.html#using-extensions) to explicitly load the Kedro IPython extension:
```ipython
In [1]: %load_ext kedro.ipython
```

You can display an interactive visualisation of your pipeline directly in your notebook using the `run-viz` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) from within a cell:
If you have launched your interactive environment from outside your Kedro project, you will need to run a second line magic to set the project path.
This is so that Kedro can load the `catalog`, `context`, `pipelines` and `session` variables:

```python
%run_viz
```ipython
In [2]: %reload_kedro <project_root>
```
The Kedro IPython extension remembers the project path so that future calls to `%reload_kedro` do not need to specify it:

![View your project's Kedro Viz inside a notebook](../meta/images/run_viz_in_notebook.png)
```ipython
In [1]: %load_ext kedro.ipython
In [2]: %reload_kedro <project_root>
In [3]: %reload_kedro
```

## Exploring the Kedro project in a notebook
Here are some examples of how to work with the Kedro variables. To explore the full range of attributes and methods available, see the relevant [API documentation](/api/kedro) or use the [Python `dir` function](https://docs.python.org/3/library/functions.html#dir), for example `dir(catalog)`.
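
As a minimal sketch of the `dir` approach, the helper below filters out private names so that only the public attributes and methods remain. `StandInCatalog` and `public_attributes` are hypothetical names used so the snippet runs outside a Kedro session; in a notebook you would pass the real `catalog`, `context`, or `session` object instead.

```python
def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class StandInCatalog:
    """Hypothetical stand-in object; replace with `catalog` inside a Kedro notebook."""

    def load(self, name):
        return f"data for {name}"

    def list(self):
        return ["example_dataset"]


# dir() returns names in alphabetical order, so the result is sorted.
print(public_attributes(StandInCatalog()))
```

In a notebook cell, `public_attributes(catalog)` gives a shorter listing than `dir(catalog)` because dunder and private names are dropped.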

### `catalog`

@@ -195,9 +203,13 @@

You can execute one *successful* run per session, as there's a one-to-one mapping between a session and a run. If you wish to do more than one run, you'll have to run the `%reload_kedro` line magic to get a new `session`.
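
The one-run-per-session rule can be pictured with a toy model. The class below is a hypothetical stand-in, not Kedro's `KedroSession`, and the exception type and message are assumptions made for illustration. It only shows the idea that a session is used up by a successful run and that a fresh session is needed for the next one.

```python
class OneRunSession:
    """Toy stand-in for a session; NOT Kedro's KedroSession."""

    def __init__(self):
        self._run_completed = False

    def run(self, pipeline_fn):
        if self._run_completed:
            # Assumed error type for this sketch only.
            raise RuntimeError("This session already completed a run.")
        result = pipeline_fn()  # if this raises, the session is not marked as used
        self._run_completed = True
        return result


session = OneRunSession()
first = session.run(lambda: "ok")

try:
    session.run(lambda: "again")
    second_run_allowed = True
except RuntimeError:
    second_run_allowed = False

# Stand-in for what `%reload_kedro` provides in a notebook: a new session object.
session = OneRunSession()
third = session.run(lambda: "ok after reload")
```

In a real notebook the equivalent of creating the new object is running `%reload_kedro`, which rebinds the `session` variable.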

#### `%reload_kedro` line magic
## Kedro line magics

[Line magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html) are commands that provide a concise way of performing tasks in an interactive session. Kedro provides several line magic commands to simplify working with Kedro projects in interactive environments.

You can use `%reload_kedro` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) within your Jupyter notebook to reload the Kedro variables (for example, if you need to update `catalog` following changes to your Data Catalog).
### `%reload_kedro` line magic

You can use the `%reload_kedro` line magic within your Jupyter notebook to reload the Kedro variables (for example, if you need to update `catalog` following changes to your Data Catalog).

You don't need to restart the kernel to reload the `catalog`, `context`, `pipelines` and `session` variables.

@@ -209,27 +221,78 @@

For more details, run `%reload_kedro?`.

### `%load_node` line magic

``` {note}
This is still an experimental feature and is currently only available for Jupyter Notebook (>7.0) and Jupyter Lab. If you encounter unexpected behaviour or would like to suggest feature enhancements, add them under [this GitHub issue](https://github.com/kedro-org/kedro/issues/3580).
```

You can load the contents of a node in your project into a series of cells using the `%load_node` line magic.

```ipython
%load_node <my-node-name>
```

Ensure you use the name of your node as defined in the pipeline, not the name of the node function. The line magic will load your node's inputs, imports, and body:

<details>
<summary>Click to see an example.</summary>

![jupyter_ipython_load_node](../meta/images/jupyter_ipython_load_node.gif)

[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L241: [Kedro.Spellings] Did you really mean 'jupyter_ipython_load_node'? (WARNING)

</details>

---

To be able to access your node's inputs, make sure they are explicitly defined in your project's catalog.

You can then run the generated cells to recreate how the node would run in your pipeline. You can use this to explore your node's inputs, behaviour, and outputs in isolation, or for [debugging](#debugging-a-kedro-project-within-a-notebook).
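
To make "imports and body" concrete, here is a rough sketch that splits a module's source into its import statements and the statements inside one function, using only the standard library `ast` module. This is not how `%load_node` is implemented; `MODULE_SOURCE`, `circle_area`, and `split_imports_and_body` are hypothetical names chosen for the illustration.

```python
import ast
import textwrap

# Hypothetical module containing one node function.
MODULE_SOURCE = textwrap.dedent(
    """
    import math

    def circle_area(radius):
        area = math.pi * radius ** 2
        return area
    """
)


def split_imports_and_body(source, function_name):
    """Return (import statements, body statements) for one top-level function."""
    tree = ast.parse(source)
    imports = [
        ast.unparse(node)
        for node in tree.body
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            return imports, [ast.unparse(stmt) for stmt in node.body]
    raise ValueError(f"No top-level function named {function_name!r}")


imports, body = split_imports_and_body(MODULE_SOURCE, "circle_area")
print(imports)
print(body)
```

Pasting the import lines into one cell and the body lines into another mirrors the kind of cell layout described above, with the node's inputs defined beforehand as ordinary variables.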

### `%run_viz` line magic

``` {note}
If you have not yet installed [Kedro-Viz](https://github.com/kedro-org/kedro-viz) for the project, run `pip install kedro-viz` in your terminal from within the project directory.
```

You can display an interactive visualisation of your pipeline directly in your notebook using the `%run_viz` line magic from within a cell:

```python
%run_viz
```

![View your project's Kedro Viz inside a notebook](../meta/images/run_viz_in_notebook.png)

## Debugging a Kedro project within a notebook

You can use the `%debug` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug) to launch an interactive debugger in your Jupyter notebook. Declare it before a single-line statement to step through the execution in debug mode. You can use the argument `--breakpoint` or `-b` to provide a breakpoint.
The following sequence occurs when `%debug` runs immediately after an error occurs:
You can use the built-in [`%debug` line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug) to launch an interactive debugger in your Jupyter notebook. Declare it before a single-line statement to step through the execution in debug mode. You can use the argument `--breakpoint` or `-b` to provide a breakpoint. Alternatively, use the command with no arguments after an error occurs to load the stack trace and begin debugging.

[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L267: [Kedro.Spellings] Did you really mean 'breakpoint'? (WARNING)
[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L267: [Kedro.toowordy] 'Alternatively' is too wordy (WARNING)
[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L267: [Kedro.weaselwords] 'Alternatively' is a weasel word! (WARNING)

The following sequence occurs when `%debug` runs immediately after an error occurs:

[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L269: [Kedro.weaselwords] 'immediately' is a weasel word! (WARNING)
- The stack trace of the last unhandled exception loads.
- The program stops at the point where the exception occurred.
- An interactive shell where the user can navigate through the stack trace opens.

You can then inspect the value of expressions and arguments, or add breakpoints to the code.
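
The information that a post-mortem debugger exposes is also available on the exception object itself. The sketch below, which is not IPython's implementation of `%debug`, walks the traceback to the innermost frame and reads its local variables non-interactively. `failing_node` and `innermost_frame_info` are hypothetical names for this example.

```python
def failing_node(values):
    """Hypothetical node function that fails on an empty input."""
    total = sum(values)
    return total / len(values)


def innermost_frame_info(exc):
    """Return (function name, local variables) for the frame that raised exc."""
    tb = exc.__traceback__
    while tb.tb_next is not None:  # follow the chain to the deepest frame
        tb = tb.tb_next
    frame = tb.tb_frame
    return frame.f_code.co_name, dict(frame.f_locals)


try:
    failing_node([])
except ZeroDivisionError as exc:
    name, local_vars = innermost_frame_info(exc)
    print(name, local_vars)
```

For an interactive session outside IPython, the standard library's `pdb.post_mortem(exc.__traceback__)` opens a debugger at the same frame.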

Here is an example debugging workflow after discovering a node in your pipeline is failing unexpectedly:

[vale] docs/source/notebooks_and_ipython/kedro_and_notebooks.md#L276: [Kedro.weaselwords] 'unexpectedly' is a weasel word! (WARNING)

1. In your notebook, run `%load_node <name-of-failing-node>` to load the contents of the problematic node with the [`%load_node` line magic](#loadnode-line-magic).
2. Run the populated cells to examine the node's behaviour in isolation.
3. If the node fails with an error, use `%debug` to launch an interactive debugging session in your notebook.

<details>
<summary>Click to see an example.</summary>
<summary>Click to see this workflow in action.</summary>

![jupyter_ipython_debug_command](../meta/images/jupyter_ipython_debug_command.gif)

</details>

``` {note}
The `%load_node` line magic is currently only available for Jupyter Notebook (>7.0) and Jupyter Lab. If you are working within a different interactive environment, manually copy over the contents from your project files instead of using `%load_node` to automatically populate your node's contents, and continue from step 2.
```

---

You can set up the debugger to run automatically when an exception occurs by using the `%pdb` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-pdb). This automatic behaviour can be enabled with `%pdb 1` or `%pdb on` before executing a program, and disabled with `%pdb 0` or `%pdb off`.
You can also set up the debugger to run automatically when an exception occurs by using the [`%pdb` line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-pdb). This automatic behaviour can be enabled with `%pdb 1` or `%pdb on` before executing a program, and disabled with `%pdb 0` or `%pdb off`.

<details>
<summary>Click to see an example.</summary>
@@ -265,26 +328,6 @@

You can use the `jupyter kernelspec` set of commands to manage your Jupyter kernels. For example, to remove a kernel, run `jupyter kernelspec remove <kernel_name>`.

### Managed services

If you work within a managed Jupyter service such as a Databricks notebook you may be unable to execute `kedro jupyter notebook`. You can explicitly load the Kedro IPython extension with the `%load_ext` line magic:

```ipython
In [1]: %load_ext kedro.ipython
```

If you launch your Jupyter instance from outside your Kedro project, you will need to run a second line magic to set the project path so that Kedro can load the `catalog`, `context`, `pipelines` and `session` variables:

```ipython
In [2]: %reload_kedro <project_root>
```
The Kedro IPython extension remembers the project path so that future calls to `%reload_kedro` do not need to specify it:

```ipython
In [1]: %load_ext kedro.ipython
In [2]: %reload_kedro <project_root>
In [3]: %reload_kedro
```

### IPython, JupyterLab and other Jupyter clients
