diff --git a/docs/source/development/debugging.md b/docs/source/development/debugging.md index 29c048f74a..87cd4ff743 100644 --- a/docs/source/development/debugging.md +++ b/docs/source/development/debugging.md @@ -1,12 +1,10 @@ # Debugging -:::note - +``` {note} Our debugging documentation has moved. Please see our existing guides: +``` -::: - -* [Debugging a Kedro project within a notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook) for information on how to launch an interactive debugger in your notebook. +* [Debugging a Kedro project within a notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook) for information on how to debug using the `%load_node` line magic and an interactive debugger. * [Debugging in VSCode](./set_up_vscode.md#debugging) for information on how to set up VSCode's built-in debugger. * [Debugging in PyCharm](./set_up_pycharm.md#debugging) for information on using PyCharm's debugging tool. * [Debugging in the CLI with Kedro Hooks](../hooks/common_use_cases.md#use-hooks-to-debug-your-pipeline) for information on how to automatically launch an interactive debugger in the CLI when an error occurs in your pipeline run. diff --git a/docs/source/faq/faq.md b/docs/source/faq/faq.md index 52f6b28ac0..1830183603 100644 --- a/docs/source/faq/faq.md +++ b/docs/source/faq/faq.md @@ -12,10 +12,11 @@ This is a growing set of technical FAQs. The [product FAQs on the Kedro website] * {doc}`Where can I find the documentation about Kedro-Viz`? * {py:mod}`Where can I find the documentation for Kedro's datasets `? -## Working with Jupyter +## Working with Notebooks * [How can I debug a Kedro project in a Jupyter notebook](../notebooks_and_ipython/kedro_and_notebooks.md#debugging-a-kedro-project-within-a-notebook)? 
* [How do I connect a Kedro project kernel to other Jupyter clients like JupyterLab](../notebooks_and_ipython/kedro_and_notebooks.md#ipython-jupyterlab-and-other-jupyter-clients)? +* [How can I use the Kedro IPython extension in a notebook where launching a new kernel is not an option](../notebooks_and_ipython/kedro_and_notebooks.md#loading-the-project-with-the-kedroipython-extension)? ## Kedro project development diff --git a/docs/source/meta/images/jupyter_ipython_load_node.gif b/docs/source/meta/images/jupyter_ipython_load_node.gif new file mode 100644 index 0000000000..0bc8ec6a57 Binary files /dev/null and b/docs/source/meta/images/jupyter_ipython_load_node.gif differ diff --git a/docs/source/meta/images/pipeline_error_logs.png b/docs/source/meta/images/pipeline_error_logs.png new file mode 100644 index 0000000000..1d11b2b5c8 Binary files /dev/null and b/docs/source/meta/images/pipeline_error_logs.png differ diff --git a/docs/source/notebooks_and_ipython/kedro_and_notebooks.md b/docs/source/notebooks_and_ipython/kedro_and_notebooks.md index 66337143ce..3701bf6051 100644 --- a/docs/source/notebooks_and_ipython/kedro_and_notebooks.md +++ b/docs/source/notebooks_and_ipython/kedro_and_notebooks.md @@ -10,6 +10,8 @@ The example adds a notebook to experiment with the retired [`pandas-iris` starte We will assume the example project is called `iris`, but you can call it whatever you choose. +## Loading the project with `kedro jupyter notebook` + Navigate to the project directory (`cd iris`) and issue the following command in the terminal to launch Jupyter: ```bash @@ -32,12 +34,10 @@ We recommend that you save your notebook in the `notebooks` folder of your Kedro ### What does `kedro jupyter notebook` do? 
-The `kedro jupyter notebook` command launches a notebook with a kernel that is [slightly customised](https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs) but almost identical to the [default IPython kernel](https://ipython.readthedocs.io/en/stable/install/kernel_install.html). - -This custom kernel automatically makes the following Kedro variables available: +The `kedro jupyter notebook` command launches a notebook with a customised kernel that has been extended to make the following project variables available: * `catalog` (type `DataCatalog`): [Data Catalog](../data/data_catalog.md) instance that contains all defined datasets; this is a shortcut for `context.catalog` -* `context` (type `KedroContext`): Kedro project context that provides access to Kedro's library components +* `context` (type `KedroContext`): [Kedro project context](../api/kedro.framework.context.rst) that provides access to Kedro's library components * `pipelines` (type `Dict[str, Pipeline]`): Pipelines defined in your [pipeline registry](../nodes_and_pipelines/run_a_pipeline.md#run-a-pipeline-by-name) * `session` (type `KedroSession`): [Kedro session](../kedro_project_setup/session.md) that orchestrates a pipeline run @@ -45,22 +45,30 @@ This custom kernel automatically makes the following Kedro variables available: If the Kedro variables are not available within your Jupyter notebook, you could have a malformed configuration file or missing dependencies. The full error message is shown on the terminal used to launch `kedro jupyter notebook`. ``` -## How to explore a Kedro project in a notebook -Here are some examples of how to work with the Kedro variables. To explore the full range of attributes and methods available, see the relevant [API documentation](/api/kedro) or use the [Python `dir` function](https://docs.python.org/3/library/functions.html#dir), for example `dir(catalog)`. 
- -### `%run_viz` line magic +## Loading the project with the `kedro.ipython` extension -``` {note} -If you have not yet installed [Kedro-Viz](https://github.com/kedro-org/kedro-viz) for the project, run `pip install kedro-viz` in your terminal from within the project directory. +A quick way to explore the `catalog`, `context`, `pipelines`, and `session` variables in your project within an IPython-compatible environment, such as Databricks notebooks, Google Colab, and more, is to use the `kedro.ipython` extension. +This is tool-independent and useful in situations where launching a Jupyter interactive environment is not possible. You can use the [`%load_ext` line magic](https://ipython.readthedocs.io/en/stable/config/extensions/index.html#using-extensions) to explicitly load the Kedro IPython extension: +```ipython +In [1]: %load_ext kedro.ipython ``` -You can display an interactive visualisation of your pipeline directly in your notebook using the `run-viz` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) from within a cell: +If you have launched your interactive environment from outside your Kedro project, you will need to run a second line magic to set the project path. +This is so that Kedro can load the `catalog`, `context`, `pipelines` and `session` variables: -```python -%run_viz +```ipython +In [2]: %reload_kedro ``` +The Kedro IPython extension remembers the project path so that future calls to `%reload_kedro` do not need to specify it: -![View your project's Kedro Viz inside a notebook](../meta/images/run_viz_in_notebook.png) +```ipython +In [1]: %load_ext kedro.ipython +In [2]: %reload_kedro +In [3]: %reload_kedro +``` + +## Exploring the Kedro project in a notebook +Here are some examples of how to work with the Kedro variables. 
To explore the full range of attributes and methods available, see the relevant [API documentation](/api/kedro) or use the [Python `dir` function](https://docs.python.org/3/library/functions.html#dir), for example `dir(catalog)`. ### `catalog` @@ -195,9 +203,13 @@ You can also specify the following optional arguments for `session.run`: You can execute one *successful* run per session, as there's a one-to-one mapping between a session and a run. If you wish to do more than one run, you'll have to run `%reload_kedro` line magic to get a new `session`. -#### `%reload_kedro` line magic +## Kedro line magics + +[Line magics](https://ipython.readthedocs.io/en/stable/interactive/magics.html) are commands that provide a concise way of performing tasks in an interactive session. Kedro provides several line magic commands to simplify working with Kedro projects in interactive environments. + +### `%reload_kedro` line magic -You can use `%reload_kedro` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html) within your Jupyter notebook to reload the Kedro variables (for example, if you need to update `catalog` following changes to your Data Catalog). +You can use the `%reload_kedro` line magic within your Jupyter notebook to reload the Kedro variables (for example, if you need to update `catalog` following changes to your Data Catalog). You don't need to restart the kernel for the `catalog`, `context`, `pipelines` and `session` variables. @@ -209,27 +221,86 @@ You don't need to restart the kernel for the `catalog`, `context`, `pipelines` a For more details, run `%reload_kedro?`. +### `%load_node` line magic + +``` {note} +This is still an experimental feature and is currently only available for Jupyter Notebook (>7.0) and Jupyter Lab. 
If you encounter unexpected behaviour or would like to suggest feature enhancements, add a comment under [this GitHub issue](https://github.com/kedro-org/kedro/issues/3580). +``` + +You can load the contents of a node in your project into a series of cells using the `%load_node` line magic. + +```ipython +%load_node +``` + +Ensure you use the name of your node as defined in the pipeline, not the name of the node function. The line magic will load your node's inputs, imports, and body: + +
+Click to see an example. + +![jupyter_ipython_load_node](../meta/images/jupyter_ipython_load_node.gif) + +
+ +--- + +To be able to access your node's inputs, make sure they are explicitly defined in your project's catalog. + +You can then run the generated cells to recreate how the node would run in your pipeline. You can use this to explore your node's inputs, behaviour, and outputs in isolation, or for [debugging](#debugging-a-kedro-project-within-a-notebook). + +### `%run_viz` line magic + +``` {note} +If you have not yet installed [Kedro-Viz](https://github.com/kedro-org/kedro-viz) for the project, run `pip install kedro-viz` in your terminal from within the project directory. +``` + +You can display an interactive visualisation of your pipeline directly in your notebook using the `%run_viz` line magic from within a cell: + +```python +%run_viz +``` + +![View your project's Kedro Viz inside a notebook](../meta/images/run_viz_in_notebook.png) ## Debugging a Kedro project within a notebook - You can use the `%debug` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug) to launch an interactive debugger in your Jupyter notebook. Declare it before a single-line statement to step through the execution in debug mode. You can use the argument `--breakpoint` or `-b` to provide a breakpoint. -The follow sequence occurs when `%debug` runs immediately after an error occurs: + You can use the built-in [`%debug` line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-debug) to launch an interactive debugger in your Jupyter notebook. Declare it before a single-line statement to step through the execution in debug mode. You can use the argument `--breakpoint` or `-b` to provide a breakpoint. Alternatively, use the command with no arguments after an error occurs to load the stack trace and begin debugging. + + The following sequence occurs when `%debug` runs after an error occurs: - The stack trace of the last unhandled exception loads. - The program stops at the point where the exception occurred. 
- An interactive shell where the user can navigate through the stack trace opens. You can then inspect the value of expressions and arguments, or add breakpoints to the code. +Here is an example debugging workflow after discovering a node in your pipeline is failing: +1. Inspect the logs to find the name of the failing node. We can see below the problematic node is `split_data_node`. +
-Click to see an example. +Click to see the pipeline failure logs. + +![pipeline_error_logs](../meta/images/pipeline_error_logs.png) + +
+ +2. In your notebook, run `%load_node ` to load the contents of the problematic node with the [`%load_node` line magic](#kedro-line-magics). +3. Run the populated cells to examine the node's behaviour in isolation. +4. If the node fails with an error, use `%debug` to launch an interactive debugging session in your notebook. + +
+Click to see this workflow in action. ![jupyter_ipython_debug_command](../meta/images/jupyter_ipython_debug_command.gif)
+``` {note} +The `%load_node` line magic is currently only available for Jupyter Notebook (>7.0) and Jupyter Lab. If you are working within a different interactive environment, manually copy over the contents from your project files instead of using `%load_node` to automatically populate your node's contents, and continue from step 2. +``` + --- -You can set up the debugger to run automatically when an exception occurs by using the `%pdb` [line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-pdb). This automatic behaviour can be enabled with `%pdb 1` or `%pdb on` before executing a program, and disabled with `%pdb 0` or `%pdb off`. +You can also set up the debugger to run automatically when an exception occurs by using the [`%pdb` line magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-pdb). This automatic behaviour can be enabled with `%pdb 1` or `%pdb on` before executing a program, and disabled with `%pdb 0` or `%pdb off`.
Click to see an example. @@ -265,26 +336,6 @@ To ensure that a Jupyter kernel always points to the correct Python executable, You can use the `jupyter kernelspec` set of commands to manage your Jupyter kernels. For example, to remove a kernel, run `jupyter kernelspec remove `. -### Managed services - -If you work within a managed Jupyter service such as a Databricks notebook you may be unable to execute `kedro jupyter notebook`. You can explicitly load the Kedro IPython extension with the `%load_ext` line magic: - -```ipython -In [1]: %load_ext kedro.ipython -``` - -If you launch your Jupyter instance from outside your Kedro project, you will need to run a second line magic to set the project path so that Kedro can load the `catalog`, `context`, `pipelines` and `session` variables: - -```ipython -In [2]: %reload_kedro -``` -The Kedro IPython extension remembers the project path so that future calls to `%reload_kedro` do not need to specify it: - -```ipython -In [1]: %load_ext kedro.ipython -In [2]: %reload_kedro -In [3]: %reload_kedro -``` ### IPython, JupyterLab and other Jupyter clients