Fix #782: give implementations more flexibility in error handling #795

mihnita · 2024-05-16T20:00:50Z

No description provided.

…andling

aphillips · 2024-05-16T20:22:56Z

spec/errors.md

+In all cases, when encountering an error during formatting,
+a message formatter MUST provide some representation of the message,
+or MUST provide an informative error or errors.
+An implementation MAY provide both.


The "in all cases" text with the "MUST or MUST" is a bit unusual in terms of standards. Perhaps:

Suggested change

In all cases, when encountering an error during formatting,

a message formatter MUST provide some representation of the message,

or MUST provide an informative error or errors.

An implementation MAY provide both.

When an error is encountered during formatting,

a message formatter can provide an informative error (or errors)

or some representation of the message or both.

ACK. I am OK with the suggestion, and I trust you on what standards do :-)

But isn't the final form a bit too mild?
The result is "can A or B or both"
Does it leave an open door to do neither A nor B?

Would something like this be more in line with what standards do?: "MUST A or B or both"
Which would preserve (at least some) of the "strength" of the original text?

@macchiati I think maybe a better pseudo-model would look like the below. This ignores the try/catch pattern, which would look like your example, to show some of the other ways APIs might be provided:

MessageFormatter mf = MessageFormatter.withMessage(pattern); int result = mf.format(parameters); // some APIs return a status code, e.g. ICU4C String output = mf.toString(); // fallback message if errors MessageFormatter.Part[] parts = mf.toParts(); // let's not forget formatToParts if (mf.hasErrors()) { log.error(mf.getErrors()); // you can get the errors out too }

@eemeli is correct that any of the three models he lists would be permitted by the proposed text. I think I agree that we should make a design doc, even if it is a small one, since I note that some programming languages (such as Java) use exception throwing as the primary error mechanism. But some do not. We need to enable those languages also.

I think it is probably incorrect to only have a fallback message with no way to find out if the formatting call failed in some way.

So I'd suggest something like:

Suggested change

In all cases, when encountering an error during formatting,

a message formatter MUST provide some representation of the message,

or MUST provide an informative error or errors.

An implementation MAY provide both.

Programming languages and runtime environments differ widely,

so this specification does not define the exact mechanism for reporting errors.

When an error or errors are encountered during formatting,

an implementation MUST provide a means for the caller to discover the errors.

For example, it might throw an exception or it might return a status code signaling an error.

In some environments, callers will expect the formatting call to return

a representation of the formatted message

in addition to any error handling or signaling mechanism.

When an error or errors are encountered during formatting,

an implementation SHOULD provide a means for the caller to obtain

some representation of the formatted message.

Such a representation MUST follow the fallback representation defined in this specification.

Implementations MAY provide both an error signal and a fallback representation.

Implementations MAY allow callers to ignore errors

(such as by disabling exception throwing or providing non-erroring APIs).

I'd be okay with any of: 1) @mihnita's proposed change 2) @aphillips's proposed change so long as the change below is made 3) just changing the existing spec text's "MUST"s to "MAY"s, or 4) some combination thereof.

The change to @aphillips's proposed change above is:

- an implementation MUST provide a means for the caller to discover the errors. + an implementation MUST provide a means for the caller to discover the occurrence of an error.

I think that change supports the constraints of something like ICU4X in Rust where it only reports 1 error if there are many errors encountered, due to the constraint that they don't want to allocate but allocation would be required if they had to report any & all errors, according to the feedback that we asked them to give. And I think it better reflects @aphillips's original intent when he wrote above

way to find out if the formatting call failed in some way

I say this also because the hypothetical Java-ish pseudocode above also includes the line

log.error(mf.getErrors())

which implies getting multiple errors, but that's probably unnecessary/undesirable for many places in which Rust is used, and is not supported by ICU4X in particular.

an implementation SHOULD provide a means for the caller to obtain
some representation of the formatted message

This PR, plus maybe at least this bit of editing, is an improvement over the original. One key thing to note is that the most idiomatic way to implement in many languages will be more than one formatter function. For example in rust where you can return a Result<message, error>, where you only get either message or error but not both, providing "a means for the caller to obtain some representation" will look like

trait Message { fn format(&self, args: Arguments) -> Result<String, Error>; fn formatFallback(&self, args: Arguments) -> String; }

The usage would be to try with message.format() first using normal idiomatic error handling to determine if it worked or not. In the event an error was returned instead of a formatted string, then the programmer has the option to use the infallible formatFallback variant that does whatever it needs to do to return something (even if it is just the message source or even the id) but is unable to return any error info at the same time.

I'd be fine with a change that separated the two current MUST statements from each other to more explicitly allow an implementation like @alerque's above to be conformant.

As I see it, the two key requirements we are and should be imposing are:

You can always get some formatted representation of a valid message.

For a message with errors, you can get at least one of the errors.

Note in particular the use of the word "valid" there; that spec-defined qualifier is intentional, as we probably should not require messages with syntax or data model errors to still be nominally formattable.

@eemeli noted about valid messages. It should be pointed out that invalid messages would fail before one starts the formatting process. (Of course it is possible to visualize a method/function that parses and formats a message all in one go, but under the hood the implementation will fail the message before it gets to the formatting steps in our spec)

Note that the fallback representation of syntax and data model errors (these errors define what it means to be a "valid" message) is the logo string, specifically:

If the message being formatted has any Syntax Errors or Data Model Errors, the result of pattern selection MUST be a pattern resolving to a single fallback value using the message's fallback string defined in the formatting context or if this is not available or empty, the U+FFFD REPLACEMENT CHARACTER �.

@echeran noted:

I think that change supports the constraints of something like ICU4X in Rust where it only reports 1 error if there are many errors encountered, due to the constraint that they don't want to allocate but allocation would be required if they had to report any & all errors, according to the feedback we asked them to give. And I think it better reflects @aphillips's original intent when he wrote above...

I would probably make a change to your proposal. An implementation that doesn't want to return a structure containing all of the errors might reasonably be "fast fail" (stop trying to format the message and return the first error it encounters rather than trying to "collect them all": it's not clear that an implementation can always recover to find all of the errors anyway or that removing an error might not result in a different outcome).

- an implementation MUST provide a means for the caller to discover the occurrence of an error. + an implementation MUST provide a means for the caller to discover than an error or errors have occurred. + Implementations are not required to enumerate all possible errors.

@alerque I agree that separate functions are one possibility (another example that somewhere I call out is ICU4J sometimes has "throwing" and "non-throwing" format methods).

Note that fn format($self, args: Arguments) -> Result<String,Error>; could return both String and Error for every call, with Error::NONE being the "good result", saving the expense of formatting twice. When there is an error, the string can be the fallback string. I don't think we should require this behavior. Just calling it out.

As a user, I'd be unamused by the need to write tons of logic around my format calls. One of the nice things about message format is the ability to get rid of tons of fiddly code related to the arguments and their formatting. I tend to want my calls to MF2 to be as sleek as possible. In most cases I'm going to try to solve all of my errors in testing, so I can be "free to fly" in production.

Note that fn format($self, args: Arguments) -> Result<String,Error>; could return both String and Error for every call, with Error::NONE being the "good result"

No it can't, that's impossible. Rust results are either a successful something (type specified) or an error, but never both.

The only way around that is to re-roll your own error handling from scratch by making your own types (with a tuple struct or whatever) with results and errors inside them—and eschewing all of Rust's error handling patterns. And that's crazy. This is the whole reason this issue came up because the way the spec is written would force people to write non-ergonomic and very much non-idiomatic code in many languages just to be spec compliant.

As a user, I'd be unamused by the need to write tons of logic around my format calls.

That's exactly what an end user would have to do in many languages if the API followed the patterns the spec is currently trying to mandate and/or keep getting suggested here as possible patterns. Maybe in some languages that is an ergonomic/idiomatic possibility, but not Rust and not many other languages I know of. The message format shouldn't be mandating this aspect of an implementation API. It can make suggestions so that robust usage is possible, but shouldn't try to foist a particular error handling pattern onto languages that may or may not actually work that way.

@alerque Never mind then. I have no desire to create special work for Rust implementers (the whole point of my suggestions is to allow idiomatic implementation everywhere)

eemeli

This is going to need a design doc to explain the change.

At the moment, the spec is providing exactly one choice for what an MF2 formatter is expected to do when a runtime error occurs: Provide some representation of the message, and separately report the error or errors.

The proposed change would be to remove this requirement completely, and allow any of the following to occur:

A message with fallback contents is provided, and errors are swallowed.
A message with fallback contents is provided, and errors are also emitted.
No message is provided, and an error is emitted.

I don't think that's a good idea, and I think this is something on which it's appropriate for us to be opinionated.

macchiati · 2024-05-16T23:01:36Z

Someone could design an API like (in pseudocode). I'd argue that such an API should be allowed to be conformant. In particular, what the formatter is expected to do can depend on what the caller tells it to do. try { * x = format(X, parameters);* // do something with it }catch (MFException e) { if (showMessageAnyway) { * x = formatWithFallback(X, parameters);* } else if (showMissingMessage) { x = reportMissingMessage(Xid, parameters); } else { logMessage("...", X, parameters) throw e; } }

…

On Thu, May 16, 2024 at 2:38 PM Eemeli Aro ***@***.***> wrote: ***@***.**** requested changes on this pull request. This is going to need a design doc to explain the change. At the moment, the spec is providing exactly one choice for what an MF2 formatter is expected to do when a runtime error occurs: Provide some representation of the message, and separately report the error or errors. The proposed change would be to remove this requirement completely, and allow any of the following to occur: 1. A message with fallback contents is provided, and errors are swallowed. 2. A message with fallback contents is provided, and errors are also emitted. 3. No message is provided, and an error is emitted. I don't think that's a good idea, and I think this is something on which it's appropriate for us to be opinionated. — Reply to this email directly, view it on GitHub <#795 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACJLEMFXUWJLNTHYX2LDLYLZCURL3AVCNFSM6AAAAABH233XSKVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDANRRHEZTENRUGM> . You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

aphillips · 2024-08-26T17:14:04Z

closed by #816 in 2024-08-26

Fix unicode-org#782: give implementations more flexibility in error h…

4ba26ca

…andling

mihnita requested review from aphillips, eemeli and echeran May 16, 2024 20:00

aphillips reviewed May 16, 2024

View reviewed changes

eemeli requested changes May 16, 2024

View reviewed changes

aphillips added normative errors Issues related to the errors section of the spec LDML46 LDML46 Release (Tech Preview - October 2024) labels May 17, 2024

aphillips mentioned this pull request May 20, 2024

Create a design document for error handling #796

Closed

echeran mentioned this pull request Jun 3, 2024

Add design doc for error handling #804

Merged

alerque mentioned this pull request Jul 17, 2024

{^} ACTION REQUIRED -- ballot of error handling #830

Closed

aphillips mentioned this pull request Jul 17, 2024

Discussion issue for #830: balloting of error handling #831

Closed

aphillips closed this Aug 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #782: give implementations more flexibility in error handling #795

Fix #782: give implementations more flexibility in error handling #795

mihnita commented May 16, 2024

aphillips May 16, 2024

mihnita May 16, 2024 •

edited

Loading

aphillips May 17, 2024

echeran May 17, 2024

alerque May 18, 2024

eemeli May 18, 2024

aphillips May 18, 2024

alerque May 18, 2024 •

edited

Loading

aphillips May 18, 2024

eemeli left a comment

macchiati commented May 16, 2024 via email

aphillips commented Aug 26, 2024

-In all cases, when encountering an error during formatting,
-a message formatter MUST provide some representation of the message,
-or MUST provide an informative error or errors.
-An implementation MAY provide both.
+Programming languages and runtime environments differ widely,
+so this specification does not define the exact mechanism for reporting errors.
+When an error or errors are encountered during formatting,
+an implementation MUST provide a means for the caller to discover the errors.
+For example, it might throw an exception or it might return a status code signaling an error.
+In some environments, callers will expect the formatting call to return
+a representation of the formatted message
+in addition to any error handling or signaling mechanism.
+When an error or errors are encountered during formatting,
+an implementation SHOULD provide a means for the caller to obtain
+some representation of the formatted message.
+Such a representation MUST follow the fallback representation defined in this specification.
+Implementations MAY provide both an error signal and a fallback representation.
+Implementations MAY allow callers to ignore errors
+(such as by disabling exception throwing or providing non-erroring APIs).

Fix #782: give implementations more flexibility in error handling #795

Fix #782: give implementations more flexibility in error handling #795

Conversation

mihnita commented May 16, 2024

aphillips May 16, 2024

Choose a reason for hiding this comment

mihnita May 16, 2024 • edited Loading

Choose a reason for hiding this comment

aphillips May 17, 2024

Choose a reason for hiding this comment

echeran May 17, 2024

Choose a reason for hiding this comment

alerque May 18, 2024

Choose a reason for hiding this comment

eemeli May 18, 2024

Choose a reason for hiding this comment

aphillips May 18, 2024

Choose a reason for hiding this comment

alerque May 18, 2024 • edited Loading

Choose a reason for hiding this comment

aphillips May 18, 2024

Choose a reason for hiding this comment

eemeli left a comment

Choose a reason for hiding this comment

macchiati commented May 16, 2024 via email

aphillips commented Aug 26, 2024

mihnita May 16, 2024 •

edited

Loading

alerque May 18, 2024 •

edited

Loading