Add option to choose between NSYS and NCU profilers #28

Merged
merged 32 commits into from
Mar 20, 2024
881c67f
Add option to give nvcc extra arguments
cosminc98 Jan 23, 2024
50bc8ff
Add test for nvcc options that changes c++ dialect from c++17 to c++14
cosminc98 Jan 23, 2024
595e450
Add make and the english language pack to devcontainer to be able to …
cosminc98 Jan 23, 2024
405c16e
Update documentation config to automatically import the current versi…
cosminc98 Jan 23, 2024
65eca38
Document new --compiler-args argument
cosminc98 Jan 23, 2024
6236fe2
Improve tests coverage by testing for bad arguments and the error out…
cosminc98 Jan 23, 2024
639624b
Add IPython to docs requirements to allow the __version__ import for …
cosminc98 Jan 24, 2024
36fc282
Change devcontainer base image to have the latest CUDA toolkit
cosminc98 Jan 26, 2024
b49062e
Mock the nsight compute tool with a bash script
cosminc98 Jan 26, 2024
c1fbc06
Add test to compile with opencv
cosminc98 Jan 26, 2024
bc91620
Add new page to documentation that contains a new notebook that expla…
cosminc98 Jan 26, 2024
e9f131a
Add autodocstring vscode extension to devcontainer
cosminc98 Jan 27, 2024
b3c015a
Add function that modifies the default profiler/compiler arguments to…
cosminc98 Jan 27, 2024
33801a3
Update pylint exceptions
cosminc98 Jan 27, 2024
a3f4f31
Update contributing instructions
cosminc98 Jan 27, 2024
9663c74
Change version from 1.0.3 to 1.1.0 due to adding features in a backwa…
cosminc98 Jan 27, 2024
aaaa260
Install latest CUDA toolkit on the test runner to pass the OpenCV com…
cosminc98 Jan 27, 2024
28637d5
Install opencv in test runner and update code coverage install
cosminc98 Jan 27, 2024
863cdcf
Add CUDA bin to PATH in test and coverage runners
cosminc98 Jan 27, 2024
2614c92
Add cuda bin to path variable in .bashrc
cosminc98 Jan 27, 2024
27b045b
Update way to set environment variable PATH in github action
cosminc98 Jan 27, 2024
ee9aa3d
Change devcontainer base image back to ubuntu:22.04 to match the envi…
cosminc98 Jan 27, 2024
8d39ce0
Add option to choose between NSYS and NCU profilers
cosminc98 Feb 1, 2024
2c10844
Add tests for choosing the profiler
cosminc98 Feb 1, 2024
5a880c9
Add isort config to help it find local modules so they are not consid…
cosminc98 Feb 2, 2024
26fab4d
Replace experimental-string-processing black formatter config with en…
cosminc98 Feb 2, 2024
ba775f7
Search for profiling tools executable paths when they are required
cosminc98 Feb 2, 2024
bac447e
Install dev dependencies in editable mode
cosminc98 Feb 2, 2024
0908891
Add documentation for using Nsight Systems instead of the default Nsi…
cosminc98 Feb 2, 2024
c3b8524
Fix cuda typo
cosminc98 Feb 12, 2024
4d805bb
Mention Nsight Systems in README.md
cosminc98 Feb 16, 2024
3f8b89c
Merge branch 'master' into feature/profiler-tool-choice
cosminc98 Feb 16, 2024
2 changes: 1 addition & 1 deletion .devcontainer/post_create.sh
@@ -1,7 +1,7 @@
#!/bin/bash

# install developer dependencies
pip install .[dev]
pip install -e .[dev]

# make sure the developer uses pre-commit hooks
pre-commit install
4 changes: 2 additions & 2 deletions README.md
@@ -45,7 +45,7 @@ to own a GPU yourself.
Here are just a few of the things that nvcc4jupyter does well:

- [Easily run CUDA C++ code](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#hello-world)
- [Profile your code with NVIDIA Nsight Compute](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#profiling)
- [Profile your code with NVIDIA Nsight Compute or Nsight Systems](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#profiling)
- [Compile your code with external libraries (e.g. OpenCV)](https://nvcc4jupyter.readthedocs.io/en/latest/notebooks.html#compiling-with-external-libraries)
- [Share code between different programs in the same notebook / split your code into multiple files for improved readability](https://nvcc4jupyter.readthedocs.io/en/latest/usage.html#groups)

@@ -96,7 +96,7 @@ If not using the devcontainer you need to install the package with the
development dependencies and install the pre-commit hook before committing any
changes:
```bash
pip install .[dev]
pip install -e .[dev]
pre-commit install
```

30 changes: 25 additions & 5 deletions docs/source/magics.rst
@@ -21,7 +21,7 @@ Usage
- ``%%cuda``: Compile and run this cell.
- ``%%cuda -p``: Also runs the Nsight Compute profiler.
- ``%%cuda -p -a "<SPACE SEPARATED PROFILER ARGS>"``: Also runs the Nsight Compute profiler.
- ``%%cude -c "<SPACE SEPARATED COMPILER ARGS"``: Passes additional arguments to "nvcc".
- ``%%cuda -c "<SPACE SEPARATED COMPILER ARGS>"``: Passes additional arguments to "nvcc".
- ``%%cuda -t``: Outputs the "timeit" built-in magic results.

Options
@@ -36,15 +36,35 @@ Options
.. _profile:

-p, --profile
Boolean. If set, runs the NVIDIA Nsight Compute profiler whose
output is appended to standard output.
Boolean. If set, runs the NVIDIA Nsight Compute (or NVIDIA Nsight Systems
if changed via the \-\-profiler option) profiler whose output is appended to
standard output.

.. _profiler:

-l, --profiler
String. Can either be "ncu" (the default) to use NVIDIA Nsight Compute
profiling tool, or "nsys" to use NVIDIA Nsight Systems profiling tool.

.. _profiler_args:

.. _profiler_args:

-a, --profiler-args
String. Optional profiler arguments that can be space separated
by wrapping them in double quotes. See all options here:
`Nsight Compute CLI <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
by wrapping them in double quotes. Will be passed to the profiler selected
by the \-\-profiler option. See profiler options here:
`Nsight Compute <https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#command-line-options>`_
or `Nsight Systems <https://docs.nvidia.com/nsight-systems/UserGuide/index.html#command-line-options>`_.

.. _compiler_args:

-c, --compiler-args
String. Optional compiler arguments that can be space separated
by wrapping them in double quotes. They will be passed to "nvcc".
See all options here:
`NVCC Options <https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#nvcc-command-options>`_


.. _compiler_args:

28 changes: 25 additions & 3 deletions docs/source/usage.rst
@@ -225,10 +225,11 @@ Profiling
---------

Another important feature of nvcc4jupyter is its integration with the NVIDIA
Nsight Compute profiler, which you need to make sure is installed and its
executable can be found in a directory in your PATH environment variable.
Nsight Compute / NVIDIA Nsight Systems profilers, which you need to make sure
are installed and the executables can be found in a directory in your PATH
environment variable.

In order to use it and provide the profiler with custom arguments, simply run:
To profile using Nsight Compute with custom arguments:

.. code-block:: c++

@@ -256,6 +257,27 @@ Running the cell above will compile and execute the vector addition code in the
Compute (SM) Throughput % 1.19
----------------------- ------------- ------------

To profile using Nsight Systems with custom arguments:

.. code-block:: c++

%cuda_group_run --group "vector_add" --profiler nsys --profile --profiler-args "profile --stats=true"

Running the cell above will compile and execute the vector addition code in the
"vector_add" group and profile it with Nsight Systems. The output will contain
multiple tables, one of which will look similar to this:

.. code-block::

[5/8] Executing 'cuda_api_sum' stats report

Time (%) Total Time (ns) Num Calls Avg (ns) Med (ns) Min (ns) Max (ns) StdDev (ns) Name
-------- --------------- --------- ------------- ------------- ----------- ----------- ----------- ----------------------
77.3 200,844,276 1 200,844,276.0 200,844,276.0 200,844,276 200,844,276 0.0 cudaMalloc
22.6 58,594,762 2 29,297,381.0 29,297,381.0 29,153,999 29,440,763 202,772.8 cudaMemcpy
0.1 305,450 1 305,450.0 305,450.0 305,450 305,450 0.0 cudaLaunchKernel
0.0 1,970 1 1,970.0 1,970.0 1,970 1,970 0.0 cuModuleGetLoadingMode

Compiler arguments
------------------

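The documented `--profiler` option is chosen per cell. It can presumably also be set once per notebook through `set_defaults`, which this PR extends with a `profiler` parameter. The following is a minimal sketch, assuming an installed nvcc4jupyter build that includes this PR; the guarded import is an addition for illustration so the snippet still runs where the package is missing.

```python
# Sketch only: assumes a nvcc4jupyter build that contains this PR.
try:
    from nvcc4jupyter import Profiler, set_defaults
except ImportError:  # package missing or too old: skip the configuration
    Profiler = None
    set_defaults = None

if set_defaults is not None:
    # Make Nsight Systems the default for later "%%cuda --profile" cells.
    # The profiler arguments are the ones used in the usage.rst example.
    set_defaults(profiler=Profiler.NSYS, profiler_args="profile --stats=true")
```

A cell can still override the default with `--profiler ncu`, since explicit arguments take precedence over the values set here.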
2 changes: 1 addition & 1 deletion nvcc4jupyter/__init__.py
@@ -2,7 +2,7 @@
nvcc4jupyter: CUDA C++ plugin for Jupyter Notebook
"""

from .parsers import set_defaults # noqa: F401
from .parsers import Profiler, set_defaults # noqa: F401
from .plugin import NVCCPlugin, load_ipython_extension # noqa: F401

__version__ = "1.1.0"
42 changes: 36 additions & 6 deletions nvcc4jupyter/parsers.py
@@ -3,32 +3,51 @@
"""

import argparse
from typing import Callable, Optional
from enum import Enum
from typing import Callable, Optional, Type, TypeVar


class Profiler(Enum):
"""Choice between Nsight Compute and Nsight Systems profilers."""

NCU = "ncu"
NSYS = "nsys"


_default_profiler: Profiler = Profiler.NCU
_default_profiler_args: str = ""
_default_compiler_args: str = ""

T = TypeVar("T")


def set_defaults(
compiler_args: Optional[str] = None, profiler_args: Optional[str] = None
profiler: Optional[Profiler] = None,
compiler_args: Optional[str] = None,
profiler_args: Optional[str] = None,
) -> None:
"""
Set the default values for various arguments of the magic commands. These
values will be used if the user does not explicitly provide those arguments
to override this behaviour on a cell by cell basis.

Args:
profiler: If not None, this value becomes the new default profiler.
Defaults to None.
compiler_args: If not None, this value becomes the new default compiler
config. Defaults to "".
config. Defaults to None.
profiler_args: If not None, this value becomes the new default profiler
config. Defaults to "".
config. Defaults to None.
"""

# pylint: disable=global-statement
global _default_profiler
if profiler is not None:
_default_profiler = profiler
global _default_compiler_args
global _default_profiler_args
if compiler_args is not None:
_default_compiler_args = compiler_args
global _default_profiler_args
if profiler_args is not None:
_default_profiler_args = profiler_args

@@ -38,6 +57,11 @@ def str_to_lambda(arg: str) -> Callable[[], str]:
return lambda: arg


def class_to_lambda(arg: str, cls: Type[T]) -> Callable[[], T]:
"""Convert string value to class and then to lambda"""
return lambda: cls(arg)


def get_parser_cuda() -> argparse.ArgumentParser:
"""
%%cuda magic command parser.
@@ -52,8 +76,14 @@ def get_parser_cuda() -> argparse.ArgumentParser:
parser.add_argument("-t", "--timeit", action="store_true")
parser.add_argument("-p", "--profile", action="store_true")

# --profiler-args and --compiler-args values are lambda functions to allow
# the type of the following arguments is a lambda function to allow
# changing the default value at runtime
parser.add_argument(
"-l",
"--profiler",
type=lambda arg: class_to_lambda(arg, cls=Profiler),
default=lambda: _default_profiler,
)
parser.add_argument(
"-a",
"--profiler-args",
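The parser stores a zero-argument callable for `--profiler` rather than a plain value, so a default changed at runtime is picked up when the argument is read. A self-contained sketch of that pattern follows; `Tool`, `_defaults`, `set_default_tool` and `build_parser` are hypothetical names used for illustration, not the PR's code.

```python
import argparse
from enum import Enum


class Tool(Enum):  # hypothetical stand-in for the PR's Profiler enum
    NCU = "ncu"
    NSYS = "nsys"


# Mutable holder for the runtime-changeable default (hypothetical)
_defaults = {"tool": Tool.NCU}


def set_default_tool(tool: Tool) -> None:
    """Hypothetical setter, similar in spirit to set_defaults."""
    _defaults["tool"] = tool


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-l",
        "--profiler",
        # An explicit value is wrapped in a zero-argument callable;
        # the enum conversion happens only when that callable is invoked.
        type=lambda arg: (lambda: Tool(arg)),
        # The default is looked up at call time, not at parser creation.
        default=lambda: _defaults["tool"],
    )
    return parser


parser = build_parser()
before = parser.parse_args([]).profiler()
set_default_tool(Tool.NSYS)  # change the default after the parser exists
after = parser.parse_args([]).profiler()
explicit = parser.parse_args(["-l", "ncu"]).profiler()
print(before, after, explicit)
```

One consequence of this design is that an invalid profiler name raises `ValueError` only when the callable is invoked, not when the arguments are parsed.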
61 changes: 61 additions & 0 deletions nvcc4jupyter/path_utils.py
@@ -0,0 +1,61 @@
"""
Helper functions relating to file paths.
"""

import os
from glob import glob
from typing import List, Optional

CUDA_SEARCH_PATHS: List[str] = [
    "/opt/nvidia/nsight-compute",
    "/usr/local/cuda",
    "/opt",
    "/usr",
]


def is_executable(fpath: str) -> bool:
    """Check if file exists and is executable"""
    return os.path.isfile(fpath) and os.access(fpath, os.X_OK)


def which(name: str) -> Optional[str]:
    """Find an executable by name by searching the PATH directories"""
    for path_dir in os.environ.get("PATH", "").split(os.pathsep):
        exec_path = os.path.join(path_dir, name)
        if is_executable(exec_path):
            return exec_path
    return None


def find_executable(
    name: str, search_paths: Optional[List[str]] = None
) -> Optional[str]:
    """
    Find an executable, either by searching in the directories of the PATH
    environment variable or, if that did not work, by searching recursively
    in the directories given as a parameter.

    Args:
        name: The name of the executable to be found.
        search_paths: If None, only executables that are available from PATH
            will be found. Otherwise, will recursively search these
            directories. Defaults to None.

    Returns:
        The path to the executable if it is found, and None otherwise.
    """
    if search_paths is None:
        search_paths = []

    which_path = which(name)
    if which_path is not None:
        return which_path

    for search_path in search_paths:
        search_path = os.path.abspath(search_path)
        search_path = os.path.join(search_path, f"**/{name}")
        for exec_path in glob(search_path, recursive=True):
            return exec_path

    return None
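The recursive fallback in `find_executable` returns the first glob match without checking it, so a directory that happens to be named `ncu` could be returned. A stricter variant might filter matches through the same test that `is_executable` performs. The sketch below is an illustration under that assumption, not part of the PR; `find_executable_strict` is a hypothetical name and the demo assumes a POSIX system.

```python
import os
import stat
import tempfile
from glob import glob
from typing import List, Optional


def find_executable_strict(name: str, search_paths: List[str]) -> Optional[str]:
    """Recursively search directories, skipping non-executable matches."""
    for search_path in search_paths:
        pattern = os.path.join(os.path.abspath(search_path), "**", name)
        for candidate in sorted(glob(pattern, recursive=True)):
            # Same check as is_executable in the diff above
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


with tempfile.TemporaryDirectory() as root:
    # A directory that merely shares the tool's name must not be returned
    os.makedirs(os.path.join(root, "a", "ncu"))
    # A fake executable placed deeper in the tree
    tool_dir = os.path.join(root, "b", "bin")
    os.makedirs(tool_dir)
    tool_path = os.path.join(tool_dir, "ncu")
    with open(tool_path, "w", encoding="utf-8") as f:
        f.write("#!/bin/sh\nexit 0\n")
    os.chmod(tool_path, os.stat(tool_path).st_mode | stat.S_IXUSR)

    found = find_executable_strict("ncu", [root])
    found_ok = found is not None and os.path.samefile(found, tool_path)
    missing = find_executable_strict("nsys", [root])
```

Sorting the matches is an extra choice made here so that the result does not depend on directory listing order.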
66 changes: 56 additions & 10 deletions nvcc4jupyter/plugin.py
@@ -9,13 +9,20 @@
import subprocess
import tempfile
import uuid
from typing import List, Optional
from typing import Dict, List, Optional

# pylint: disable=import-error
from IPython.core.interactiveshell import InteractiveShell
from IPython.core.magic import Magics, cell_magic, line_magic, magics_class

from . import parsers
from .parsers import (
Profiler,
get_parser_cuda,
get_parser_cuda_group_delete,
get_parser_cuda_group_run,
get_parser_cuda_group_save,
)
from .path_utils import CUDA_SEARCH_PATHS, find_executable

DEFAULT_EXEC_FNAME = "cuda_exec.out"
SHARED_GROUP_NAME = "shared"
@@ -37,14 +44,19 @@ def __init__(self, shell: InteractiveShell):
super().__init__(shell)
self.shell: InteractiveShell # type hint not provided by parent class

self.parser_cuda = parsers.get_parser_cuda()
self.parser_cuda_group_save = parsers.get_parser_cuda_group_save()
self.parser_cuda_group_delete = parsers.get_parser_cuda_group_delete()
self.parser_cuda_group_run = parsers.get_parser_cuda_group_run()
self.parser_cuda = get_parser_cuda()
self.parser_cuda_group_save = get_parser_cuda_group_save()
self.parser_cuda_group_delete = get_parser_cuda_group_delete()
self.parser_cuda_group_run = get_parser_cuda_group_run()

self.workdir = tempfile.mkdtemp()
print(f'Source files will be saved in "{self.workdir}".')

self.profiler_paths: Dict[Profiler, Optional[str]] = {
Profiler.NCU: None,
Profiler.NSYS: None,
}

def _save_source(
self, source_name: str, source_code: str, group_name: str
) -> None:
@@ -135,11 +147,42 @@ def _compile(

return executable_fpath

def _run(
def _get_profiler_path(self, profiler: Profiler) -> str:
"""
Get the path of the executable of a given profiling tool. Searches
the directories of the PATH environment variable and some extra
directories where CUDA is usually installed.

Args:
profiler: The profiler whose executable should be found.

Raises:
RuntimeError: If the profiler executable could not be found.

Returns:
The file path of the executable.
"""
profiler_path = self.profiler_paths[profiler]
if profiler_path is not None:
return profiler_path

profiler_path = find_executable(profiler.value, CUDA_SEARCH_PATHS)
if profiler_path is None:
raise RuntimeError(
f'Could not find the "{profiler.value}" profiling tool.'
" Consider searching for where it is installed and adding its"
" directory to the PATH environment variable."
)

self.profiler_paths[profiler] = profiler_path
return profiler_path

def _run( # pylint: disable=too-many-arguments
self,
exec_fpath: str,
timeit: bool = False,
profile: bool = False,
profiler: Profiler = Profiler.NCU,
profiler_args: str = "",
) -> str:
"""
@@ -150,8 +193,9 @@ def _run(
timeit: If True, returns the result of the "timeit" magic instead
of the standard output of the CUDA process. Defaults to False.
profile: If True, the executable is profiled with NVIDIA Nsight
Compute profiling tool and its output is added to stdout.
Defaults to False.
Compute or NVIDIA Nsight Systems and the profiling output is
added to stdout. Defaults to False.
profiler: The profiling tool to use.
profiler_args: The profiler arguments used to customize the
information gathered by it and its overall behaviour. Defaults
to an empty string.
@@ -173,7 +217,8 @@ def _run(
else:
run_args = []
if profile:
run_args.extend(["ncu"] + profiler_args.split())
profiler_path = self._get_profiler_path(profiler)
run_args.extend([profiler_path] + profiler_args.split())
run_args.append(exec_fpath)
output = subprocess.check_output(
run_args, stderr=subprocess.STDOUT
@@ -194,6 +239,7 @@ def _compile_and_run(
exec_fpath=exec_fpath,
timeit=args.timeit,
profile=args.profile,
profiler=args.profiler(),
profiler_args=args.profiler_args(),
)
except subprocess.CalledProcessError as e:
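The `plugin.py` changes combine two ideas: the profiler path is looked up once and cached per profiler, and the run command is assembled as profiler path, then profiler arguments, then the executable. A rough standalone sketch of both is given below. `Tool` and `CommandBuilder` are hypothetical names, and `shutil.which` stands in for the PR's `find_executable`, so the extra CUDA search directories are not covered.

```python
import shutil
from enum import Enum
from typing import Dict, List, Optional


class Tool(Enum):  # hypothetical stand-in for the PR's Profiler enum
    NCU = "ncu"
    NSYS = "nsys"


class CommandBuilder:
    """Hypothetical helper: caches tool paths and assembles argv lists."""

    def __init__(self) -> None:
        self._paths: Dict[Tool, Optional[str]] = {tool: None for tool in Tool}

    def tool_path(self, tool: Tool) -> str:
        cached = self._paths[tool]
        if cached is not None:
            return cached
        # shutil.which replaces find_executable in this sketch (PATH only)
        found = shutil.which(tool.value)
        if found is None:
            raise RuntimeError(
                f'Could not find the "{tool.value}" profiling tool.'
            )
        self._paths[tool] = found
        return found

    def build(self, exec_fpath: str, tool: Tool, tool_args: str = "") -> List[str]:
        # Same order as in the diff: profiler path, its arguments, executable
        return [self.tool_path(tool)] + tool_args.split() + [exec_fpath]


builder = CommandBuilder()
# Pre-seed the cache so the demo does not need a real Nsight install
builder._paths[Tool.NSYS] = "/opt/fake/nsys"
argv = builder.build("./cuda_exec.out", Tool.NSYS, "profile --stats=true")
print(argv)
```

Because the arguments are split on whitespace, a profiler argument that itself contains spaces cannot be passed this way; `shlex.split` would be one option if that case matters.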