Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complex message start syntax is too strict #610

Closed
eemeli opened this issue Jan 18, 2024 · 7 comments · Fixed by #854
Closed

Complex message start syntax is too strict #610

eemeli opened this issue Jan 18, 2024 · 7 comments · Fixed by #854
Labels
resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. Seek-Feedback-in-Preview Issue should be something we seek feedback on in the tech preview period syntax Issues related with MF Syntax

Comments

@eemeli
Copy link
Collaborator

eemeli commented Jan 18, 2024

While working on a presentation about the JS implementation, I encountered the following error when testing some of the code I'd written for the slides:

const msg = `
  .local $currency = {EUR}
  .input {$price :number style=currency currency=$currency}
  {{Total price: {$price}}}`;
const mf = new MessageFormat(msg, 'en');
// Syntax parse error: parse-error at [the first { of '{{Total...']

After spending quite a while debugging the implementation code to determine the error here, I came to the conclusion that my parser was correct, the above is indeed a syntax error.

The problem with this message is that the leading whitespace before the first . means that it isn't special, and all of the declaration syntax parses as a valid simple pattern, up until the {{ pattern start.

I mentioned this problem on our Slack channel, where @stasm was also unable to initially see the problem. I think if we can't see it, other developers won't either, and that this will become a significant stumbling block.

We should consider amending the ABNF to allow whitespace before the . at the beginning of a complex message.

@eemeli eemeli added syntax Issues related with MF Syntax blocker-candidate The submitter thinks this might be a block for the Technology Preview Agenda+ Requested for upcoming teleconference labels Jan 18, 2024
@aphillips
Copy link
Member

If we do something to fix this, we'll need to confront the problem of pattern initial whitespace. The simplest fix would be to change the ABNF to allow non-significant whitespace at the start of the message. Here's the current ABNF:

message = simple-message / complex-message
simple-start-char = content-char / s / "@" / "|"

And here's the modifications:

message = [s] (simple-message / complex-message)
simple-start-char = content-char / "@" / "|"

The primary side-effect of this change is that, if you want spaces at the start of a simple messages, you must quote the pattern.

A different way to address this would be to allow optional whitespace at the start of complex-message. The challenge here is that the message parser has to look ahead until it sees a non-space character. If that character is . or if it is { followed by { (i.e. {{), the message is complex and the whitespace is discarded. Otherwise the whitespace is retained.

complex-message = [s] *(declaration [s]) complex-body

I think @eemeli's experience is not invalid, since we play loosely with whitespace elsewhere. I might have a slight preference for the second (lookahead) solution. Let's discuss in the 2024-01-22 call.

@aphillips aphillips added the LDML45 LDML45 Release (Tech Preview) label Jan 19, 2024
@gibson042
Copy link
Collaborator

I remember raising this concern multiple times, including a flavor of it in the summary it at #512 (comment) — relevant excerpt from linked comment:

I feel strongly that iterative updates to a message which change its classification (e.g., adding or removing declarations) should not affect treatment of leading or trailing whitespace—cf. #474 (comment)

 logOutMessage = ```
-{%input username}
-
-Log out {$username}?
+Log out?
 ```;

And as far as I can tell, the concern was ultimately rejected, albeit without the requested explicit acknowledgment. Is it to be reconsidered?

@mihnita
Copy link
Collaborator

mihnita commented Jan 22, 2024

I also think the current .foo is a problem.

Looking at the spec:

message = simple-message / complex-message
simple-message = [simple-start pattern]
simple-start = simple-start-char / text-escape / placeholder
simple-start-char = content-char / s / "@" / "|"
text-escape     = backslash ( backslash / "{" / "}" )
content-char      = %x00-08        ; omit HTAB (%x09) and LF (%x0A)
                  / %x0B-0C        ; omit CR (%x0D)
                  / %x0E-19        ; omit SP (%x20)
                  / %x21-2D        ; omit . (%x2E)
                  / %x2F-3F        ; omit @ (%x40)
                  / %x41-5B        ; omit \ (%x5C)
                  / %x5D-7A        ; omit { | } (%x7B-7D)
                  / %x7E-D7FF      ; omit surrogates
                  / %xE000-10FFFF

How can I say ".input is in important keyword in MF2", and render the .input?

The current solution is to wrap the whole string in {{...}}
And that makes it a complex message.

At this point we can all say "I told you so!" :-)
But we will all use this problem as an argument for our own old proposal:
tinker with trimming spaces, start in code mode, special marker to enter code mode, etc.

But trying to look at this with fresh eyes, I think the way we enter "complex mode" is too ... complex :-)
Now a complex message starts with .foo OR {{. If not, it is a simple message.

I had a hard time to figure out from the ebnf if a message is complex or not.
Is ".foo bar" complex?
The ebnf fragment quoted above is not enough to determine that.
I also need to look at the definition of placeholder.
And after going through all of that, I decide it is not simple.

Compare for example with some initial marker for complex messages:

message = "{!}" complex-message / simple-message

I can almost instantly see if a message is complex or not.
Same as seeing #! at the beginning of a file I know it is some kind of script, and what kind of script.

Note: I'm not saying that's the way to go (initial marker).
Maybe it is, maybe not.

What I'm saying is that there are more options than trim leading spaces and quote.
"optional whitespace at the start of complex-message" only makes the detection of complex messages harder.
It is already hard.

@mihnita
Copy link
Collaborator

mihnita commented Jan 22, 2024

We should consider amending the ABNF to allow whitespace before the . at the beginning of a complex message.

I agree it's a problem, I don't like jumping to this solution as the only one.

@aphillips aphillips added Future Deferred for future standardization Seek-Feedback-in-Preview Issue should be something we seek feedback on in the tech preview period out-of-scope? and removed blocker-candidate The submitter thinks this might be a block for the Technology Preview Agenda+ Requested for upcoming teleconference LDML45 LDML45 Release (Tech Preview) Future Deferred for future standardization labels Jan 22, 2024
@aphillips
Copy link
Member

In the 2024-01-22 call we agreed to place this out of scope for LDML45 and to track this issue during tech preview. I'm using the label out-of-scope? for now to represent that choice.

@eemeli
Copy link
Collaborator Author

eemeli commented Jun 19, 2024

Dropping the out-of-scope? label as that was about the LDML54 release.

@aphillips aphillips added the resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. label Aug 9, 2024
@aphillips
Copy link
Member

We resolved to accept this feedback. #854 will close.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
resolve-candidate This issue appears to have been answered or resolved, and may be closed soon. Seek-Feedback-in-Preview Issue should be something we seek feedback on in the tech preview period syntax Issues related with MF Syntax
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants