doc: confine mention of raising in a last paragraph
Signed-off-by: Paul-Elliot <[email protected]>
panglesd committed Jan 21, 2022
1 parent d5a58c0 commit fdad824
Showing 1 changed file with 75 additions and 24 deletions.
99 changes: 75 additions & 24 deletions doc/ppx-for-plugin-authors.rst
@@ -363,55 +363,96 @@ whole ``metaquot`` extension point. E.g. you can write:
Handling errors
---------------

To report errors well, it is necessary to understand the different options that are offered to you, how they differ, and when to use each. The possibilities are the following:

.. |raise_errorf| replace:: ``Location.raise_errorf``
.. _raise_errorf: https://ocaml-ppx.github.io/ppxlib/ppxlib/Ppxlib/Location/index.html#val-raise_errorf

.. |error_extensionf| replace:: ``Location.error_extensionf``
.. _error_extensionf: https://ocaml-ppx.github.io/ppxlib/ppxlib/Ppxlib/Location/index.html#val-error_extensionf

- Embedding an error node in the AST. Error nodes are special extension nodes ``[%ocaml.error error_message]``, which are interpreted as errors. Error nodes make valid AST, so the resulting AST will be passed to the upcoming rewriters, and later errors can be reported as well. They can be created with |error_extensionf|_.
- Raising a located exception. A located exception is an exception containing both an error message and a location in the file. The purpose of raising an exception is to stop the execution of the current rewriter, as well as rewriters that would be executed later. It should be used when the ppx fails with an unrecoverable error, and no meaningful AST can be returned. Raising a located exception can be done through the |raise_errorf|_ function.
.. |pexp_extension| replace:: ``pexp_extension``
.. _pexp_extension: https://ocaml-ppx.github.io/ppxlib/ppxlib/Ppxlib/Ast_builder/Default/index.html#val-pexp_extension

.. |Ast_builder| replace:: ``Ast_builder``
.. _Ast_builder: https://ocaml-ppx.github.io/ppxlib/ppxlib/Ppxlib/Ast_builder/index.html

In most cases, it is better to embed error nodes in the AST. It often allows for finer-grained error reporting, lets further rewriting happen, and makes features such as type analysis or "jump to definition" available in a wider scope. Raising a located exception should only be done when no meaningful AST can be produced, for instance when there is a hard failure of the ppx.
To give a good user experience when reporting errors or failures in a ppx, it is necessary to include as much of the generated content as possible. Most IDE tools, such as Merlin, rely on the AST for their features, such as displaying types, jumping to definitions or showing the list of errors.

It is important for whole file transformation authors to know that the AST they get might contain error nodes from previous rewriters. In order to treat these errors with respect, they should never dump some part of the AST without thinking twice about the risk of losing errors.
A common way to report an error is to throw an exception. However, this method interrupts the execution flow of the ppxlib driver without giving it an AST to produce, which prevents Merlin from working.

In the next sections, we distinguish the context-free transformations and the whole file transformations. Recall that whole file transforms are transformations which take the whole AST as input, and return a rewritten AST; and context-free transformations are transformations that can be made locally to an extension node or a node with an attribute: extenders and derivers.
Instead, it is better to always return a valid AST, as complete as possible, but with "error extension nodes" at every place where valid code generation was impossible. Error extension nodes are special extension nodes ``[%ocaml.error error_message]``, which make a valid AST but are interpreted later as errors, for instance by the compiler or Merlin. Like all extension nodes, they can be put anywhere in the AST, from structure items to expressions or patterns.

For instance, suppose a rewriter is supposed to define a new record type, but there is an error in the generation of the type of one field. In order to have the most complete AST as output, the rewriter can still define the type and all of its fields, putting an extension node in place of the type of the faulty field:

Reporting errors in a whole file rewriter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
..
- not include the type at all, outputting only:
In ppxlib, you can register whole file transformations. They take as input the AST, and output a rewritten AST. When there are several registered whole file transformations, the transformations are applied sequentially, in alphabetical order of the names.
.. code:: ocaml
When an error happens during a whole file transformation, there are two possibilities. Either the error still allows the rewriter to output a meaningful AST, with holes filled with error nodes. For instance, if a variable has to be defined, but the value is undefined due to some error, the rewriter can append ``let variable = [%ocaml.error "Some error happened"]`` somewhere in the AST without harming further rewriting. The AST is then passed to the next rewriter, and the ppxlib driver is not even aware of the error.
[%ocaml.error "type long_record was not defined due to foo"]
Or there is an unrecoverable error that prevents the rewriter from returning an AST. For instance, the whole AST has to be generated from the content of a JSON file, but the file is missing. Then, the ppxlib driver should stop the rewriting chain and return the error. This behaviour can be achieved by raising a located exception: an exception which will stop the rewriting execution flow to return the error together with the original AST. In this case, the final AST returned by the driver is an error node deduced from the located exception, followed by the last valid AST: the one passed to the raising transformation.
- define the type, but not its implementation:

..
When an exception is raised during one of the whole file transforms ``A``, it is unclear what the driver should do. If the driver stops there and displays the error to the user, no other errors can be shown, even those unrelated to the rewriting, as the AST is lost. If the driver registers the error and passes the unrewritten AST to the next rewriter, the user might get many errors caused by the failure of ``A`` that point in the wrong direction.
.. code:: ocaml
In summary, the best practice for a whole file rewriter is to return a valid AST whenever the severity of the failure allows it, containing error extension nodes to fill holes due to failed rewriting. When this is impossible, the rewriter should throw a located error.
type long_record = [%ocaml.error "type long_record was not implemented due to foo"]
- define and implement the type, but not its faulty field:

Reporting errors in a context-free transformation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code:: ocaml
The situation for context-free transformations is similar to that of whole file transformations, with one difference: all context-free rewriters are applied in parallel during a single AST traversal. This traversal happens before the whole file transformations, and is considered by the driver as a single whole file transformation in terms of error reporting.
type long_record = { field_1: int ; field_2: [%ocaml.error "field_2 could not be implemented due to foo"]}
Thus, for the whole file transformations to happen, it is required that all context-free rewriters return without raising. Moreover, if a context-free rewriter raises, the driver will stop and output the error together with the last valid AST, which is the one before any context-free rewriting. This means that if a context-free transformation raises, all other context-free rewriting won’t happen.
Ppxlib provides a function in its API to create error extension nodes: |error_extensionf|_. This function creates an extension node, which then has to be turned into the right kind of node using functions such as |pexp_extension|_ from the ``Default`` module of |Ast_builder|_.
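
As a minimal sketch of this two-step construction, the helpers below first build the extension with ``Location.error_extensionf`` and then wrap it with a builder from ``Ast_builder.Default``. The helper names ``error_expr`` and ``error_type`` are hypothetical and chosen only for illustration.

.. code:: ocaml

    open Ppxlib

    (* Hypothetical helper: an expression that stands in for code that
       could not be generated, carrying the error message. *)
    let error_expr ~loc (msg : string) : expression =
      let ext = Location.error_extensionf ~loc "%s" msg in
      Ast_builder.Default.pexp_extension ~loc ext

    (* Hypothetical helper: the same idea for a type expression, e.g. the
       type of a record field that could not be generated. *)
    let error_type ~loc (msg : string) : core_type =
      let ext = Location.error_extensionf ~loc "%s" msg in
      Ast_builder.Default.ptyp_extension ~loc ext

A helper such as ``error_type`` could produce the type of ``field_2`` in the record example above, while the other fields are generated normally.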

This situation is really unwanted, so even more than in whole file transformations, embedding errors in extension nodes using |error_extensionf|_ should be preferred to raising as much as possible.
As the ppxlib driver gets a valid AST, it can’t detect whether the ppx failed or not. This AST will be passed as input to the next rewriter, which means that further transformations might get an AST containing error nodes. It is therefore important for authors of whole file transformations to treat these with respect: for instance, never to discard some part of the AST without thinking twice about the risk of losing errors, which could result in a state that is very hard to debug.
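
The sketch below shows one way a whole file transformation might follow this advice. It assumes a hypothetical rewriter named ``my_ppx`` whose code generation has failed: every incoming structure item is kept, including any error nodes left by earlier rewriters, and an error structure item is appended instead of raising.

.. code:: ocaml

    open Ppxlib

    (* Hypothetical whole file transformation. The incoming items are
       returned untouched, so error nodes from previous rewriters are
       not lost; the failure of this rewriter is reported as an extra
       error node at the end of the structure. *)
    let impl (str : structure) : structure =
      (* Illustrative choice of location: the last item, if there is one. *)
      let loc =
        match List.rev str with
        | item :: _ -> item.pstr_loc
        | [] -> Location.none
      in
      let ext =
        Location.error_extensionf ~loc "my_ppx: could not generate code"
      in
      str @ [ Ast_builder.Default.pstr_extension ~loc ext [] ]

    let () = Driver.register_transformation ~impl "my_ppx"

A real rewriter would place the error node where the missing code belongs and point ``loc`` at the input that caused the failure.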


..
to understand the different options that are offered to you, their differences and when to use each. The possibilities are the following:

- Embedding an error node in the AST. Error nodes are special extension nodes ``[%ocaml.error error_message]``, which are interpreted as errors. Error nodes make valid AST, so the resulting AST will be passed to the upcoming rewriters, and later errors can be reported as well. They can be created with |error_extensionf|_...
- Raising a located exception. A located exception is an exception containing both an error message and a location in the file. The purpose of raising an exception is to stop the execution of the current rewriter, as well as rewriters that would be executed later. It should be used when the ppx fails with an unrecoverable error, and no meaningful AST can be returned. Raising a located exception can be done through the |raise_errorf|_ function...

In most cases, it is better to embed error nodes in the AST. It often allows for finer-grained error reporting, lets further rewriting happen, and makes features such as type analysis or "jump to definition" available in a wider scope. Raising a located exception should only be done when no meaningful AST can be produced, for instance when there is a hard failure of the ppx...

It is important for whole file transformation authors to know that the AST they get might contain error nodes from previous rewriters. In order to treat these errors with respect, they should never dump some part of the AST without thinking twice about the risk of losing errors...

In the next sections, we distinguish the context-free transformations and the whole file transformations. Recall that whole file transforms are transformations which take the whole AST as input, and return a rewritten AST; and context-free transformations are transformations that can be made locally to an extension node or a node with an attribute: extenders and derivers...

..
it will prevent further rewriting, as the whole but the context-free rewriting will still finish. An error node, generated from the exception, will be placed in the right place in the AST: replacing the extension node for extenders, and just after the attributed structure item for deriver.

..
Unlike in the case of whole file rewriting, when a context-free rewriter throws an exception, the whole AST is not lost: the transformation happens locally in an extension node, or its output is inserted after an attributed node. Therefore, when doing the context-free rewriting pass, the ppxlib driver can catch located exceptions and automatically embed an error node with the same location and message, at the right place in the AST.
Reporting errors in a whole file rewriter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In ppxlib, you can register whole file transformations. They take as input the AST, and output a rewritten AST. When there are several registered whole file transformations, the transformations are applied sequentially, in alphabetical order of the names.

When an error happens during a whole file transformation, there are two possibilities. Either the error still allows the rewriter to output a meaningful AST, with holes filled with error nodes. For instance, if a variable has to be defined, but the value is undefined due to some error, the rewriter can append ``let variable = [%ocaml.error "Some error happened"]`` somewhere in the AST without harming further rewriting. The AST is then passed to the next rewriter, and the ppxlib driver is not even aware of the error.

However, although throwing a located exception does not harm error reporting at other places of the program, it does not allow multiple error reporting for the same rewriter. Thus, ``Location.raise_errorf`` should only be used as a practical tool when no other error could possibly be reported in the rewriting of a node.
Or there is an unrecoverable error that prevents the rewriter from returning an AST. For instance, the whole AST has to be generated from the content of a JSON file, but the file is missing. Then, the ppxlib driver should stop the rewriting chain and return the error. This behaviour can be achieved by raising a located exception: an exception which will stop the rewriting execution flow to return the error together with the original AST. In this case, the final AST returned by the driver is an error node deduced from the located exception, followed by the last valid AST: the one passed to the raising transformation.

..
When an exception is raised during one of the whole file transforms ``A``, it is unclear what the driver should do. If the driver stops there and displays the error to the user, no other errors can be shown, even those unrelated to the rewriting, as the AST is lost. If the driver registers the error and passes the unrewritten AST to the next rewriter, the user might get many errors caused by the failure of ``A`` that point in the wrong direction.
In summary, the best practice for a whole file rewriter is to return a valid AST whenever the severity of the failure allows it, containing error extension nodes to fill holes due to failed rewriting. When this is impossible, the rewriter should throw a located error.


Reporting errors in a context-free transformation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The situation for context-free transformations is similar to that of whole file transformations, with one difference: all context-free rewriters are applied in parallel during a single AST traversal. This traversal happens before the whole file transformations, and is considered by the driver as a single whole file transformation in terms of error reporting.

Thus, for the whole file transformations to happen, it is required that all context-free rewriters return without raising. Moreover, if a context-free rewriter raises, the driver will stop and output the error together with the last valid AST, which is the one before any context-free rewriting. This means that if a context-free transformation raises, all other context-free rewriting won’t happen.

This situation is really unwanted, so even more than in whole file transformations, embedding errors in extension nodes using |error_extensionf|_ should be preferred to raising as much as possible.

..
it will prevent further rewriting, as the whole but the context-free rewriting will still finish. An error node, generated from the exception, will be placed in the right place in the AST: replacing the extension node for extenders, and just after the attributed structure item for deriver.
..
Unlike in the case of whole file rewriting, when a context-free rewriter throws an exception, the whole AST is not lost: the transformation happens locally in an extension node, or its output is inserted after an attributed node. Therefore, when doing the context-free rewriting pass, the ppxlib driver can catch located exceptions and automatically embed an error node with the same location and message, at the right place in the AST.
However, although throwing a located exception does not harm error reporting at other places of the program, it does not allow multiple error reporting for the same rewriter. Thus, ``Location.raise_errorf`` should only be used as a practical tool when no other error could possibly be reported in the rewriting of a node.

A documented example
^^^^^^^^^^^^^^^^^^^^
@@ -481,3 +522,13 @@ The first limitation is that the deriver cannot work on non record types. Howeve
])
|> List.concat
In case of panic
^^^^^^^^^^^^^^^^

In some rare cases, it might happen that a whole file rewriter is not able to output a meaningful AST. In this case, its author might be tempted to raise a located error: an exception that includes the location of the error. This is historically what the ppxlib examples suggested, but it is now discouraged in most cases, as it prevents Merlin features from working well.

If such an exception is uncaught, the binary will exit with an error code and the exception will be pretty-printed, including the location. When the driver is spawned with the ``-embed-errors`` or ``-as-ppx`` flags, the driver will look for located errors. If it catches one, it will stop its chain of rewriting at this point, and output an AST consisting of the located error followed by the last valid AST: the one passed to the raising rewriter.

Raising should be avoided even more in context-free rewriters, in favour of outputting a single error node when finer-grained reporting is not needed or possible. As the whole context-free rewriting is done in one traversal of the AST, a single raise will cancel both the context-free pass and upcoming rewriters, and the AST prior to the context-free pass will be output together with the error.

The function provided by the API to raise located errors is |raise_errorf|_.
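
For completeness, here is a minimal sketch of this last-resort use. It assumes a hypothetical rewriter ``my_ppx`` whose whole output depends on an external file; the helper name ``check_exists`` and the error message are illustrative only.

.. code:: ocaml

    open Ppxlib

    (* Hypothetical last-resort check: without the file, no meaningful
       AST can be produced, so a located exception is raised. *)
    let check_exists ~loc (path : string) : unit =
      if not (Sys.file_exists path) then
        Location.raise_errorf ~loc "my_ppx: cannot find %s" path

Whenever a partial AST can still be produced, an error node built with ``Location.error_extensionf`` remains the preferred option.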
