Add more details and add description of existing messaging APIs
kurtamohler committed Jul 14, 2022
1 parent 72797e2 commit 6883a9b
Showing 2 changed files with 284 additions and 39 deletions.
Binary file removed .RFC-0000-template.md.swp
Binary file not shown.
323 changes: 284 additions & 39 deletions RFC-0026-logging-system.md
@@ -1,57 +1,47 @@
# PyTorch Logging System
# New PyTorch Logging System

## **Summary**
Create a message logging system for PyTorch with the following requirements:

### Consistency

* The C++ and Python APIs should match each other as closely as possible.

* All errors, warnings, and other messages generated by PyTorch should be
emitted using the logging system API
emitted using the logging system API.


* The APIs for emitting messages and changing settings should all be consistent
between C++ and Python
### Severity and message classes

* Offer different message severity levels, including at least the following:

- **Info**: Emits a message without creating a warning or error. By default,
this gets printed to stdout
this gets printed to stdout.

- **Warning**: Emits a message as a warning. By default, this will turn into
a Python warning
- **Warning**: Emits a message as a warning. If a warning is never caught,
the warning may get printed to stdout.

- **Error**: Emits a message as an error. By default, this will turn into
a Python error
- **Error**: Emits a message as an error. If an error is never caught, the
application will quit.

- TODO: Should we also have a **Fatal** severity for integration with
Meta's internal logging system? A fatal message terminates the program

* Offer different classes of messages, including at least the following:

- **Default**: A catch-all message class

- **Nondeterministic**: Emitted when `torch.use_deterministic_algorithms(True)`
is set and a nondeterministic operation is called

- **Deprecated**: Emitted when a deprecated function is called
* Offer different message classes under each severity level.

- **Beta**: Emitted when a beta feature is called. See
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/)
- Every message is emitted as an instance of a message class.

- **Prototype**: Emitted when a prototype feature is called. See
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/)
- Each message class has both a C++ class and a Python class, and when a
C++ message is propagated to Python, it is converted to its corresponding
Python class.

- TODO: Should all the classic Python errors and warnings (`TypeError`,
`ValueError`, `NotImplementedError`, `DeprecationWarning`, etc) have their
own message class? Or are those separate from our concept of a message
class, and any message class is allowed to raise any Python exception or
warning type?
- Whenever it makes sense, the Python class should be one of the builtin
Python error/warning classes. For instance, currently in PyTorch, the C++
error class `c10::Error` gets converted to the Python `RuntimeError` class.

* Continue using warning/error APIs that currently exist in PyTorch wherever
possible. For instance, `TORCH_CHECK`, `TORCH_WARN`, and `TORCH_WARN_ONCE`
should continue to be used in C++
* Adding new message classes and severity levels should be easy

- NOTE: These existing APIs don't currently have a concept of message classes,
so that will need to be added

* Creating new message classes and severity levels should be easy
### User-facing configurability

* Ability to turn warnings into errors. This is already possible with the
Python `warnings` module filter, but the PyTorch docs should mention it and
@@ -66,7 +56,7 @@ Create a message logging system for PyTorch with the following requirements:
to a warning, but we wouldn't want to downgrade an error from invalid
arguments given to an operation.

- Disabling warnings in Python should already be possible with the `warnings`
- Disabling warnings in Python is already possible with the `warnings`
module filter. See [documentation](https://docs.python.org/3/library/warnings.html#the-warnings-filter).
There is no similar system in C++ at the moment, and building one is probably
low priority.
@@ -75,7 +65,7 @@ Create a message logging system for PyTorch with the following requirements:
excessive printouts can degrade the user experience. Related to
issue [#68768](https://github.com/pytorch/pytorch/issues/68768)

* Settings to avoid emitting duplicate messages generated by multiple
* Settings to enable/disable emitting duplicate messages generated by multiple
`torch.distributed` ranks. Related to issue
[#68768](https://github.com/pytorch/pytorch/issues/68768)

@@ -85,15 +75,20 @@ Create a message logging system for PyTorch with the following requirements:
- NOTE: Currently `TORCH_WARN_ONCE` does this in C++, but there is no Python
equivalent

- TODO: Should there be a setting to turn a warn-once into a warn-always for
a given message class and vice versa?
- NOTE: `torch.set_warn_always()` currently controls some warnings (maybe
only the ones from C++? I need to find out for sure.)

- TODO: Should there be a setting to turn a warn-once into a warn-always and
vice versa for an entire message class?

* Settings can be changed from Python, C++, or environment variables

- Filtering warnings with Python command line arguments should
remain possible. For instance, the following turns a `DeprecationWarning`
into an error: `python -W error::DeprecationWarning your_script.py`
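
The same filter can also be installed from code. The following is a minimal sketch using only the standard `warnings` module; `legacy_function` is a hypothetical deprecated function invented for this illustration and is not part of PyTorch.

```python
import warnings


def legacy_function():
    # Hypothetical deprecated function, used only for this illustration.
    warnings.warn("legacy_function is deprecated", DeprecationWarning)
    return 42


def call_with_deprecations_as_errors():
    # catch_warnings() restores the previous filter state on exit, so the
    # stricter filter only applies inside this block.
    with warnings.catch_warnings():
        # In-code counterpart of `-W error::DeprecationWarning`.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            legacy_function()
        except DeprecationWarning as exc:
            return f"raised: {exc}"
    return "no error"
```

Scoping the filter with `catch_warnings()` keeps the change local, which may matter for libraries that should not alter their users' global warning settings.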

### Compatibility

* Should integrate with Meta's internal logging system, which is
[glog](https://github.com/google/glog)

@@ -102,12 +97,19 @@ Create a message logging system for PyTorch with the following requirements:
* Must be OSS-friendly, so it shouldn't require libraries (like glog) which may
cause incompatibility issues for projects that use PyTorch

### Other requirements

* Continue using warning/error APIs and message classes that currently exist in
PyTorch wherever possible. For instance, `TORCH_CHECK`, `TORCH_WARN`, and
`TORCH_WARN_ONCE` should continue to be used in C++

* TODO: Determine the requirements for the following concepts:

- Log files (default behavior and any settings)


## **Motivation**

Original issue: [link](https://github.com/pytorch/pytorch/issues/72948)

Currently, it is challenging for PyTorch developers to provide messages that
@@ -116,5 +118,248 @@ act consistently between Python and C++.
It is also challenging for PyTorch users to manage the messages that PyTorch
emits. For instance, if a PyTorch user happens to be calling PyTorch functions
that emit lots of messages, it can be difficult for them to filter out those
messages so that their project's users don't get bombarded with warnings that
they don't need to see.
messages so that their project's users don't get bombarded with warnings and
printouts that they don't need to see.


## **Proposed Implementation**

### Message classes

At least the following message classes should be available. The name of the
C++ class appears first in all the listed entries below, with the Python class
to the right of it.

Each severity level has a default class. All other classes within a given
severity level inherit from the corresponding default class.

NOTE: Most of the error classes below already exist in PyTorch. However,
info classes do not currently exist. Also, only one type of warning currently
exists in C++, and it is not implemented as a C++ class that can be inherited
(as far as I understand).

#### Error message classes:

* **`c10::Error`** - `RuntimeError`
- Default error class. Other error classes inherit from it.

* **`c10::IndexError`** - `IndexError`
- Emitted when attempting to access an element that is not present in
a list-like object.

* **`c10::ValueError`** - `ValueError`
- Emitted when a function receives an argument with correct type but
incorrect value.

* **`c10::TypeError`** - `TypeError`
- Emitted when a function receives an argument with incorrect type.

* **`c10::NotImplementedError`** - `NotImplementedError`
- Emitted when a feature that is not implemented is called.

* **`c10::LinAlgError`** - `torch.linalg.LinAlgError`
- Emitted from the `torch.linalg` module when there is a numerical error.

* **`c10::NondeterministicError`** - `torch.NondeterministicError`
- Emitted when `torch.use_deterministic_algorithms(True)` and
`torch.set_deterministic_debug_mode('error')` are set, and a
nondeterministic operation is called.


#### Warning message classes:

* **`c10::UserWarning`** - `UserWarning`
- Default warning class. Other warning classes inherit from it.

* **`c10::BetaWarning`** - `torch.BetaWarning`
- Emitted when a beta feature is called. See
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/).

* **`c10::PrototypeWarning`** - `torch.PrototypeWarning`
- Emitted when a prototype feature is called. See
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/).

* **`c10::NondeterministicWarning`** - `torch.NondeterministicWarning`
- Emitted when `torch.use_deterministic_algorithms(True)` and
`torch.set_deterministic_debug_mode('warn')` are set, and a
nondeterministic operation is called.

* **`c10::DeprecationWarning`** - `DeprecationWarning`
- Emitted when a deprecated function is called.
- TODO: `DeprecationWarning`s are ignored by default in Python, so we may
actually want to use a different Python class for this.


#### Info message classes:

* **`c10::Info`** - `torch.Info`
- Default info class. Other info classes inherit from it.
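
On the Python side, the proposed hierarchy could be sketched roughly as follows. The class names mirror the proposal above, but these are stand-alone stand-ins and not existing `torch` classes; the `Info` base class and the `severity_of` helper are assumptions made only for this illustration.

```python
class NondeterministicError(RuntimeError):
    """Stand-in for the proposed torch.NondeterministicError."""


class BetaWarning(UserWarning):
    """Stand-in for the proposed torch.BetaWarning."""


class PrototypeWarning(UserWarning):
    """Stand-in for the proposed torch.PrototypeWarning."""


class NondeterministicWarning(UserWarning):
    """Stand-in for the proposed torch.NondeterministicWarning."""


class Info:
    """Stand-in for the proposed torch.Info (assumed to be a plain class)."""

    def __init__(self, message):
        self.message = message


def severity_of(message_class):
    # Hypothetical helper: derive the severity level from the class hierarchy.
    # Warning must be tested first because it is itself an Exception subclass.
    if issubclass(message_class, Warning):
        return "warning"
    if issubclass(message_class, Exception):
        return "error"
    if issubclass(message_class, Info):
        return "info"
    raise TypeError(f"{message_class!r} is not a message class")
```

One thing this sketch makes visible: the builtin Python classes do not mirror the C++ inheritance. `c10::IndexError` inherits from `c10::Error`, but Python's `IndexError` is not a subclass of `RuntimeError`, so `except RuntimeError` in Python would not catch every error class listed above.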


### Error APIs

The APIs for raising errors all share a similar form. They check a boolean
condition, the `cond` argument in the following signatures, and throw an error
if that condition is false.

The error APIs also each have a variable length argument list, `...` in C++ and
`*args` in Python. When an error is raised, these arguments are concatenated
into a string, and the string becomes the body of the error message. If
possible, a developer who writes these error messages should try to include
enough information so that a user could understand why the error happened and
what to do about it. If that goal is not possible, the message should at least
contain some useful information to lead the user in the right direction.

The error APIs are listed below, with the C++ signature on the left and the
corresponding Python signature on the right.

**`TORCH_CHECK(cond, ...)`** - `torch.check(cond, *args)`
- C++ error: `c10::Error`
- Python error: `RuntimeError`
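
As a rough illustration of the intended Python behavior, a `check` function could look like the sketch below. `torch.check` is only proposed in this RFC, so this is a stand-alone stand-in; joining the arguments with `str()` is an assumption about how the concatenation would work, and `safe_divide` is a hypothetical caller.

```python
def check(cond, *args):
    """Stand-in for the proposed torch.check(cond, *args)."""
    if not cond:
        # Concatenate all message arguments into the body of the error.
        raise RuntimeError("".join(str(arg) for arg in args))


def safe_divide(a, b):
    # Hypothetical caller: the message says what went wrong and with which value.
    check(b != 0, "Expected a nonzero divisor, but got ", b)
    return a / b
```

Unlike the C++ macro, a Python function evaluates its message arguments even when `cond` is true, so expensive message formatting would be paid on every call.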

TODO: Add the rest of these and also add sections for warnings and info.

### Other details

* At the moment in PyTorch, the Python `warnings` module is being publicly
included in `torch` as `torch.warnings`. This should probably be removed or
renamed to `_warnings` to avoid confusion.


# PyTorch's current messaging API

The rest of this document contains details about the current messaging API in
PyTorch. This is included to give better context about what will change and
what will stay the same in the new messaging system.

PyTorch already has APIs that make many aspects of message logging easy for
developers working on PyTorch itself. Messages can be either printouts,
warnings, or errors.

Errors are created with the standard `raise` statement in Python
([documentation](https://docs.python.org/3/tutorial/errors.html#raising-exceptions)).
In C++, PyTorch offers macros for creating errors (which are listed later in
this document). When a C++ function propagates to Python, any errors that were
generated get converted to Python errors.

Warnings are created with `warnings.warn` in Python
([documentation](https://docs.python.org/3/library/warnings.html)). In C++,
PyTorch offers macros for creating warnings (which are listed later in this
document). When a C++ function propagates to Python, any warnings that were
generated get converted to Python warnings.
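
Because these warnings end up as ordinary Python warnings, users can already manage them with the standard `warnings` filter. The sketch below uses only the standard library; `noisy_operation` is a hypothetical function standing in for any PyTorch call that emits a `UserWarning`.

```python
import warnings


def noisy_operation():
    # Hypothetical function that emits a warning with warnings.warn.
    warnings.warn("this operation is slow", UserWarning)
    return "done"


def count_warnings(suppress):
    # record=True collects emitted warnings in a list instead of printing them.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if suppress:
            # Filters added later take precedence, so this overrides "always".
            warnings.filterwarnings("ignore", category=UserWarning)
        noisy_operation()
    return len(caught)
```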

Printouts (or what is called "Info" severity messages in the new system) are
created with just `print` in Python and `std::cout` in C++.

PyTorch's C++ warning/error macros are declared in
[`c10/util/Exception.h`](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h).

## PyTorch C++ Errors

In C++, there are several different types of errors that can be used, but
PyTorch developers typically don't deal with these error classes directly.
Instead, they use macros that offer a concise interface for raising different
error classes.

### C++ error macros

Each of the error macros evaluates a boolean conditional expression, `cond`. If
the condition is false, the error is raised, and whatever extra arguments are
in `...` get concatenated into the error message with `operator<<`.

| Macro | C++ Error class |
| ---------------------------------------- | ------------------------------ |
| `TORCH_CHECK(cond, ...)` | `c10::Error` |
| `TORCH_CHECK_WITH(error_t, cond, ...)` | caller specifies `error_t` arg |
| `TORCH_CHECK_LINALG(cond, ...)` | `c10::LinAlgError` |
| `TORCH_CHECK_INDEX(cond, ...)` | `c10::IndexError` |
| `TORCH_CHECK_VALUE(cond, ...)` | `c10::ValueError` |
| `TORCH_CHECK_TYPE(cond, ...)` | `c10::TypeError` |
| `TORCH_CHECK_NOT_IMPLEMENTED(cond, ...)` | `c10::NotImplementedError` |

There is some documentation on error macros [here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L344-L362).

The reason why C++ preprocessor macros are used, rather than function calls, is
to ensure that the compiler can optimize for the `cond == true` branch. In
other words, if an error does not get raised, overhead is minimized.

### C++ error classes

The primary error class in C++ is `c10::Error`. Documentation and declaration
are
[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L21-L28).
`c10::Error` is a subclass of `std::exception`.

There are other error classes which are child classes of `c10::Error`, defined
[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L195-L236).

When these errors propagate to Python, they are each converted to a different
Python error class:

| C++ error class | Python error class |
| ------------------------------- | -------------------------- |
| `c10::Error` | `RuntimeError` |
| `c10::IndexError` | `IndexError` |
| `c10::ValueError` | `ValueError` |
| `c10::TypeError` | `TypeError` |
| `c10::NotImplementedError` | `NotImplementedError` |
| `c10::EnforceFiniteError` | `ExitException` |
| `c10::OnnxfiBackendSystemError` | `ExitException` |
| `c10::LinAlgError` | `torch.linalg.LinAlgError` |
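
The part of the table that maps to builtin Python classes can be modeled as a simple lookup. This is only a toy model for illustration: the real translation happens in C++, and the `translate` helper and its fallback to `RuntimeError` for unlisted classes are assumptions made here.

```python
# Subset of the table above, limited to builtin Python error classes.
CPP_TO_PYTHON_ERROR = {
    "c10::Error": RuntimeError,
    "c10::IndexError": IndexError,
    "c10::ValueError": ValueError,
    "c10::TypeError": TypeError,
    "c10::NotImplementedError": NotImplementedError,
}


def translate(cpp_class_name, message):
    # Hypothetical helper: build the Python error for a C++ error class name.
    # Falling back to RuntimeError for unknown names is an assumption.
    python_class = CPP_TO_PYTHON_ERROR.get(cpp_class_name, RuntimeError)
    return python_class(message)
```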


## PyTorch C++ Warnings

When warnings propagate from C++ to Python, they are converted to a Python
`UserWarning`. Whatever is in `...` will get concatenated into the warning
message using `operator<<`.

* `TORCH_WARN(...)`
- [Definition](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L515-L530)

* `TORCH_WARN_ONCE(...)`
- [Definition](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L557-L562)
- This macro only generates a warning the first time it is encountered during
run time.
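
Python has no counterpart to this macro in PyTorch today. A warn-once helper could be sketched as follows; `warn_once` and `_already_warned` are hypothetical names, not existing PyTorch APIs. The sketch keys on the message text and category, which is a simplification chosen for this example.

```python
import warnings

# Messages that have already been emitted, keyed by (category, message).
_already_warned = set()


def warn_once(message, category=UserWarning):
    """Hypothetical helper: emit a given warning only the first time."""
    key = (category, message)
    if key in _already_warned:
        return False
    _already_warned.add(key)
    # stacklevel=2 attributes the warning to the caller of warn_once.
    warnings.warn(message, category, stacklevel=2)
    return True
```

The return value reports whether the warning was actually emitted, which makes the helper easy to test. The standard `warnings` filter also offers a `"once"` action, which might be an alternative worth evaluating.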


## Implementation details

### C++ to Python Error Translation

`c10::Error` and its subclasses are translated into their corresponding Python
errors [in `CATCH_CORE_ERRORS`](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/torch/csrc/Exceptions.h#L54-L100).

However, not all of the `c10::Error` subclasses in the table above appear here.
I'm not sure yet why they are missing.

`CATCH_CORE_ERRORS` is included within the `END_HANDLE_TH_ERRORS` macro that
every Python-bound C++ function uses for handling errors. For instance,
`THPVariable__is_view` uses the error handling macro
[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/tools/autograd/templates/python_variable_methods.cpp#L76).


#### `torch::PyTorchError`

There's also an extra error class in `CATCH_CORE_ERRORS`,
`torch::PyTorchError`. I'm not sure yet why it exists and how it differs from
`c10::Error`. `torch::PyTorchError` has several subclasses:

* `torch::IndexError`
* `torch::TypeError`
* `torch::ValueError`
* `torch::NotImplementedError`
* `torch::AttributeError`
* `torch::LinAlgError`


### C++ to Python Warning Translation

The conversion of warnings from C++ to Python is described [here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/torch/csrc/Exceptions.h#L25-L48).


## Misc Notes

[PyTorch Developer Podcast - Python exceptions](https://pytorch-dev-podcast.simplecast.com/episodes/python-exceptions)
explains how C++ errors/warnings are converted to Python. TODO: listen to it
again and take notes.
