Add more details and add description of existing messaging APIs

pytorch · Jul 14, 2022 · 6883a9b
1 parent 72797e2
commit 6883a9b

Showing 2 changed files with 284 additions and 39 deletions.
diff --git a/.RFC-0000-template.md.swp b/.RFC-0000-template.md.swp
diff --git a/RFC-0026-logging-system.md b/RFC-0026-logging-system.md
@@ -1,57 +1,47 @@
-# PyTorch Logging System
+# New PyTorch Logging System
 
 ## **Summary**
 Create a message logging system for PyTorch with the following requirements:
 
+### Consistency 
+
+* The C++ and Python APIs should match each other as closely as possible.
+
 * All errors, warnings, and other messages generated by PyTorch should be
-  emitted using the the logging system API
+  emitted using the logging system API.
+
 
-* The APIs for emitting messages and changing settings should all be consistent
-  between C++ and Python
+### Severity and message classes
 
 * Offer different message severity levels, including at least the following:
 
   - **Info**: Emits a message without creating a warning or error. By default,
-    this gets printed to stdout
+    this gets printed to stdout.
 
-  - **Warning**: Emits a message as a warning. By default, this will turn into
-    a Python warning
+  - **Warning**: Emits a message as a warning. If a warning is never caught,
+    the warning may get printed to stderr.
 
-  - **Error**: Emits a message as an error. By default, this will turn into
-    a Python error
+  - **Error**: Emits a message as an error. If an error is never caught, the
+    application will quit.
 
   - TODO: Should we also have a **Fatal** severity for integration with
     Meta's internal logging system? A fatal message terminates the program
 
-* Offer different classes of messages, including at least the following:
-
-  - **Default**: A catch-all message class
-
-  - **Nondeterministic**: Emitted when `torch.use_deterministic_algorithms(True)`
-    is set and a nondeterministic operation is called
-
-  - **Deprecated**: Emitted when a deprecated function is called
+* Offer different message classes under each severity level.
 
-  - **Beta**: Emitted when a beta feature is called. See
-    [PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/)
+  - Every message is emitted as an instance of a message class.
 
-  - **Prototype**: Emitted when a prototype feature is called. See
-    [PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/)
+  - Each message class has both a C++ class and a Python class, and when a
+    C++ message is propagated to Python, it is converted to its corresponding
+    Python class.
 
-  - TODO: Should all the classic Python errors and warnings (`TypeError`,
-    `ValueError`, `NotImplementedError`, `DeprecationWarning`, etc) have their
-    own message class? Or are those separate from our concept of a message
-    class, and any message class is allowed to raise any Python exception or
-    warning type?
+  - Whenever it makes sense, the Python class should be one of the builtin
+    Python error/warning classes. For instance, currently in PyTorch, the C++
+    error class `c10::Error` gets converted to the Python `RuntimeError` class.
 
-* Continue using warning/error APIs that currently exist in PyTorch wherever
-  possible. For instance, `TORCH_CHECK`, `TORCH_WARN`, and `TORCH_WARN_ONCE`
-  should continue to be used in C++
+* Adding new message classes and severity levels should be easy
 
-  - NOTE: These existing APIs don't currently have a concept of message classes,
-    so that will need to be added
-
-* Creating new message classes and severity levels should be easy
+### User-facing configurability
 
 * Ability to turn warnings into errors. This is already possible with the
   Python `warnings` module filter, but the PyTorch docs should mention it and
@@ -66,7 +56,7 @@ Create a message logging system for PyTorch with the following requirements:
     to a warning, but we wouldn't want to downgrade an error from invalid
     arguments given to an operation.
 
-  - Disabling warnings in Python should already be possible with the `warnings`
+  - Disabling warnings in Python is already possible with the `warnings`
     module filter. See [documentation](https://docs.python.org/3/library/warnings.html#the-warnings-filter).
     There is no similar system in C++ at the moment, and building one is probably
     low priority.
@@ -75,7 +65,7 @@ Create a message logging system for PyTorch with the following requirements:
     excessive printouts can degrade the user experience. Related to
     issue [#68768](https://github.com/pytorch/pytorch/issues/68768)
 
-* Settings to avoid emitting duplicate messages generated by multiple
+* Settings to enable/disable emitting duplicate messages generated by multiple
   `torch.distributed` ranks. Related to issue
   [#68768](https://github.com/pytorch/pytorch/issues/68768)
 
@@ -85,15 +75,20 @@ Create a message logging system for PyTorch with the following requirements:
   - NOTE: Currently `TORCH_WARN_ONCE` does this in C++, but there is no Python
     equivalent
 
-  - TODO: Should there be a setting to turn a warn-once into a warn-always for
-    a given message class and vice versa?
+  - NOTE: `torch.set_warn_always()` currently controls some warnings (maybe
+    only the ones from C++? I need to find out for sure.)
+
+  - TODO: Should there be a setting to turn a warn-once into a warn-always and
+    vice versa for an entire message class?
 
 * Settings can be changed from Python, C++, or environment variables
 
   - Filtering warnings with Python command line arguments should
     remain possible. For instance, the following turns a `DeprecationWarning`
     into an error: `python -W error::DeprecationWarning your_script.py`
 
+### Compatibility
+
 * Should integrate with Meta's internal logging system, which is
   [glog](https://github.com/google/glog)
 
@@ -102,12 +97,19 @@ Create a message logging system for PyTorch with the following requirements:
 * Must be OSS-friendly, so it shouldn't require libraries (like glog) which may
   cause incompatibility issues for projects that use PyTorch
 
+### Other requirements
+
+* Continue using warning/error APIs and message classes that currently exist in
+  PyTorch wherever possible. For instance, `TORCH_CHECK`, `TORCH_WARN`, and
+  `TORCH_WARN_ONCE` should continue to be used in C++
+
 * TODO: Determine the requirements for the following concepts:
 
   - Log files (default behavior and any settings)
 
 
 ## **Motivation**
+
 Original issue: [link](https://github.com/pytorch/pytorch/issues/72948)
 
 Currently, it is challenging for PyTorch developers to provide messages that
@@ -116,5 +118,248 @@ act consistently between Python and C++.
 It is also challenging for PyTorch users to manage the messages that PyTorch
 emits. For instance, if a PyTorch user happens to be calling PyTorch functions
 that emit lots of messages, it can be difficult for them to filter out those
-messages so that their project's users don't get bombarded with warnings that
-they don't need to see.
+messages so that their project's users don't get bombarded with warnings and
+printouts that they don't need to see.
+
+
+## **Proposed Implementation**
+
+### Message classes
+
+At least the following message classes should be available. The name of the
+C++ class appears first in all the listed entries below, with the Python class
+to the right of it.
+
+Each severity level has a default class. All other classes within a given
+severity level inherit from the corresponding default class.
+
+NOTE: Most of the error classes below already exist in PyTorch. However,
+info classes do not currently exist. Also, only one type of warning currently
+exists in C++, and it is not implemented as a C++ class that can be inherited
+(as far as I understand).
+
+#### Error message classes:
+
+* **`c10::Error`** - `RuntimeError`
+  - Default error class. Other error classes inherit from it.
+
+* **`c10::IndexError`** - `IndexError`
+  - Emitted when attempting to access an element that is not present in
+    a list-like object.
+
+* **`c10::ValueError`** - `ValueError`
+  - Emitted when a function receives an argument with correct type but
+    incorrect value.
+
+* **`c10::TypeError`** - `TypeError`
+  - Emitted when a function receives an argument with incorrect type.
+
+* **`c10::NotImplementedError`** - `NotImplementedError`
+  - Emitted when a feature that is not implemented is called.
+
+* **`c10::LinAlgError`** - `torch.linalg.LinAlgError`
+  - Emitted from the `torch.linalg` module when there is a numerical error.
+
+* **`c10::NondeterministicError`** - `torch.NondeterministicError`
+  - Emitted when `torch.use_deterministic_algorithms(True)` and
+    `torch.set_deterministic_debug_mode('error')` are set, and a
+    nondeterministic operation is called.
+
+
+#### Warning message classes:
+
+* **`c10::UserWarning`** - `UserWarning`
+  - Default warning class. Other warning classes inherit from it.
+
+* **`c10::BetaWarning`** - `torch.BetaWarning`
+  - Emitted when a beta feature is called. See
+    [PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/).
+
+* **`c10::PrototypeWarning`** - `torch.PrototypeWarning`
+  - Emitted when a prototype feature is called. See
+    [PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/).
+
+* **`c10::NondeterministicWarning`** - `torch.NondeterministicWarning`
+  - Emitted when `torch.use_deterministic_algorithms(True)` and
+    `torch.set_deterministic_debug_mode('warn')` are set, and a
+    nondeterministic operation is called.
+
+* **`c10::DeprecationWarning`** - `DeprecationWarning`
+  - Emitted when a deprecated function is called.
+  - TODO: `DeprecationWarning`s are ignored by default in Python, so we may
+    actually want to use a different Python class for this.
+
+
+#### Info message classes:
+
+* **`c10::Info`** - `torch.Info`
+  - Default info class. Other info classes inherit from it.
+
+
+### Error APIs
+
+The APIs for raising errors all share a similar form. They check a boolean
+condition, the `cond` argument in the following signatures, and throw an error
+if that condition is false.
+
+The error APIs also each have a variable length argument list, `...` in C++ and
+`*args` in Python. When an error is raised, these arguments are concatenated
+into a string, and the string becomes the body of the error message. If
+possible, a developer who writes these error messages should try to include
+enough information so that a user could understand why the error happened and
+what to do about it. If that goal is not possible, the message should at least
+contain some useful information to lead the user in the right direction.
+
+The error APIs are listed below, with the C++ signature on the left and the
+corresponding Python signature on the right.
+
+**`TORCH_CHECK(cond, ...)`** - `torch.check(cond, *args)`
+  - C++ error: `c10::Error`
+  - Python error: `RuntimeError`
+
+TODO: Add the rest of these and also add sections for warnings and info.
+
+### Other details
+
+* At the moment in PyTorch, the Python `warnings` module is publicly exposed
+  in `torch` as `torch.warnings`. This should probably be removed or renamed to
+  `_warnings` to avoid confusion.
+
+
+# PyTorch's current messaging API
+
+The rest of this document contains details about the current messaging API in
+PyTorch. This is included to give better context about what will change and
+what will stay the same in the new messaging system.
+
+At the moment, PyTorch has some APIs in place that simplify many aspects of
+message logging for developers working on PyTorch. Messages can be printouts,
+warnings, or errors.
+
+Errors are created with the standard `raise` statement in Python
+([documentation](https://docs.python.org/3/tutorial/errors.html#raising-exceptions)).
+In C++, PyTorch offers macros for creating errors (which are listed later in
+this document). When an error generated in C++ propagates to Python, it is
+converted to a Python error.
+
+Warnings are created with `warnings.warn` in Python
+([documentation](https://docs.python.org/3/library/warnings.html)). In C++,
+PyTorch offers macros for creating warnings (which are listed later in this
+document). When a warning generated in C++ propagates to Python, it is
+converted to a Python warning.
+
+Printouts (called "Info" severity messages in the new system) are created
+with `print` in Python and `std::cout` in C++.
+
+PyTorch's C++ warning/error macros are declared in
+[`c10/util/Exception.h`](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h).
+
+## PyTorch C++ Errors
+
+In C++, there are several different types of errors that can be used, but
+PyTorch developers typically don't deal with these error classes directly.
+Instead, they use macros that offer a concise interface for raising different
+error classes.
+
+### C++ error macros
+
+Each of the error macros evaluates a boolean conditional expression, `cond`. If
+the condition is false, the error is raised, and whatever extra arguments are
+in `...` get concatenated into the error message with `operator<<`.
+
+| Macro                                    | C++ Error class                |
+| ---------------------------------------- | ------------------------------ |
+| `TORCH_CHECK(cond, ...)`                 | `c10::Error`                   |
+| `TORCH_CHECK_WITH(error_t, cond, ...)`   | caller specifies `error_t` arg |
+| `TORCH_CHECK_LINALG(cond, ...)`          | `c10::LinAlgError`             |
+| `TORCH_CHECK_INDEX(cond, ...)`           | `c10::IndexError`              |
+| `TORCH_CHECK_VALUE(cond, ...)`           | `c10::ValueError`              |
+| `TORCH_CHECK_TYPE(cond, ...)`            | `c10::TypeError`               |
+| `TORCH_CHECK_NOT_IMPLEMENTED(cond, ...)` | `c10::NotImplementedError`     |
+
+There is some documentation on error macros [here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L344-L362).
+
+The reason why C++ preprocessor macros are used, rather than function calls, is
+to ensure that the compiler can optimize for the `cond == true` branch. In
+other words, if an error does not get raised, overhead is minimized.
+
+### C++ error classes
+
+The primary error class in C++ is `c10::Error`. Documentation and declaration
+are
+[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L21-L28).
+`c10::Error` is a subclass of `std::exception`.
+
+There are other error classes which are child classes of `c10::Error`, defined
+[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L195-L236).
+
+When these errors propagate to Python, they are each converted to a different
+Python error class:
+
+| C++ error class                 | Python error class         |
+| ------------------------------- | -------------------------- |
+| `c10::Error`                    | `RuntimeError`             |
+| `c10::IndexError`               | `IndexError`               |
+| `c10::ValueError`               | `ValueError`               |
+| `c10::TypeError`                | `TypeError`                |
+| `c10::NotImplementedError`      | `NotImplementedError`      |
+| `c10::EnforceFiniteError`       | `ExitException`            |
+| `c10::OnnxfiBackendSystemError` | `ExitException`            |
+| `c10::LinAlgError`              | `torch.linalg.LinAlgError` |
+
+
+## PyTorch C++ Warnings
+
+When warnings propagate from C++ to Python, they are converted to a Python
+`UserWarning`. Whatever is in `...` will get concatenated into the warning
+message using `operator<<`.
+
+* `TORCH_WARN(...)`
+  - [Definition](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L515-L530)
+
+* `TORCH_WARN_ONCE(...)`
+  - [Definition](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L557-L562)
+  - This macro only generates a warning the first time it is encountered during
+    run time.
+
+
+## Implementation details
+
+### C++ to Python Error Translation
+
+`c10::Error` and its subclasses are translated into their corresponding Python
+errors [in `CATCH_CORE_ERRORS`](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/torch/csrc/Exceptions.h#L54-L100).
+
+However, not all of the `c10::Error` subclasses in the table above appear here.
+I'm not sure yet what's up with that.
+
+`CATCH_CORE_ERRORS` is included within the `END_HANDLE_TH_ERRORS` macro that
+every Python-bound C++ function uses for handling errors. For instance,
+`THPVariable__is_view` uses the error handling macro
+[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/tools/autograd/templates/python_variable_methods.cpp#L76).
+
+
+#### `torch::PyTorchError`
+
+There's also an extra error class in `CATCH_CORE_ERRORS`,
+`torch::PyTorchError`. I'm not sure yet why it exists and how it differs from
+`c10::Error`. `torch::PyTorchError` has several subclasses:
+
+* `torch::IndexError`
+* `torch::TypeError`
+* `torch::ValueError`
+* `torch::NotImplementedError`
+* `torch::AttributeError`
+* `torch::LinAlgError`
+
+
+### C++ to Python Warning Translation
+
+The conversion of warnings from C++ to Python is described [here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/torch/csrc/Exceptions.h#L25-L48).
+
+
+## Misc Notes
+
+[PyTorch Developer Podcast - Python exceptions](https://pytorch-dev-podcast.simplecast.com/episodes/python-exceptions)
+explains how C++ errors/warnings are converted to Python. TODO: listen to it
+again and take notes.
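
As a reading aid for the "Error APIs" section of the RFC text above, here is a minimal Python sketch of how the proposed `torch.check(cond, *args)` could behave: the condition is tested, and on failure the extra arguments are concatenated into the message of a `RuntimeError`. The standalone `check` function, its `error_type` keyword, and the `check_index` helper are hypothetical names used only for illustration. The RFC specifies only the `torch.check(cond, *args)` signature and its `RuntimeError` mapping, and nothing below should be read as an existing PyTorch API.

```python
def check(cond, *args, error_type=RuntimeError):
    """Hypothetical stand-in for the proposed torch.check(cond, *args).

    Raises `error_type` when `cond` is false. The extra arguments are
    converted to strings and concatenated to form the error message,
    mirroring how the C++ macros build messages with operator<<.
    """
    if not cond:
        raise error_type("".join(str(a) for a in args))


def check_index(cond, *args):
    """Hypothetical analogue of TORCH_CHECK_INDEX: raises IndexError."""
    check(cond, *args, error_type=IndexError)


if __name__ == "__main__":
    size = -1
    try:
        check(size >= 0, "expected size >= 0, but got ", size)
    except RuntimeError as err:
        print(err)  # prints "expected size >= 0, but got -1"
```

The `error_type` keyword plays the role that `TORCH_CHECK_WITH(error_t, cond, ...)` plays in C++. Whether the Python API should expose one function with a class argument or one function per error class is left open by the RFC.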