[CS2] Fix #2998: Object literal with inconsistent indentation should throw an error #4630

GeoffreyBooth · 2017-08-04T19:35:09Z

Closes #2998. Now code like its example:

known_colors =
   45:              'cyan'
   52:              'sepia'
  100:              'olive'
  124:              'red'

will throw an unexpected indentation error. Not sure if my error check is too specific, but it catches this case at least.

…ld throw an error

GeoffreyBooth · 2017-08-05T20:17:30Z

@helixbass do you mind reviewing this? It’s similar to several recent PRs of yours.

connec · 2017-08-07T16:44:54Z

test/error_messages.coffee

+  assertErrorFormat '''
+    one =
+      '>!': 3
+    six: -> 10


It's a shame that this breaks, it might not be a useful syntax but disallowing it breaks the syntax expectations imo.

Is there some way we can refine the condition to allow it if the second objectsstarts at the same indentation as the first object?

allow it if the second object starts at the same indentation as the first object?

What do you mean?

a = '>!': 3 six: -> 10

? That would be just one object?

I don’t see the purpose of a freestanding object literal that isn’t part of an assignment or return or throw. Why construct such a variable if it never gets used?

What I mean is:

a = # indent: 0 - block starts at indent 0 '>!': 3 # indent: 2 - sets the indentation for everything in the objects bad: 1 # indent: 1 - invalid, everything in the objects needs to be indent 2 six: -> 10 # indent: 0 - new expression foo = -> a = # indent: 2 - block starts at indent 2 '>!': 3 # indent: 4 - sets the indentation for everything in the objects bad: 1 # indent: 3 - invalid, everything in the objects needs to be indent 4 six: -> 10 # indent: 2 - new expression

Here's a contrived example that uses it in a visible way:

o = null f = -> o = '>!': 3 six: -> 10 result = f() # { six: -> 10 } o # { '>!': 3 }

But my main issue is that it should be allowed for the same reason the following are:

1 'hello' -> foo etc

I guess the question is what’s more likely to happen: your “contrived” example, with a function that returns six: -> 10, or a slightly modified version of the original example:

known_colors = 1: 'cyan' 2: 'sepia' 100: 'olive' 124: 'red'

In other words, it could’ve been happenstance that the original poster’s “left-indented” keys didn’t stretch back to the same indentation level as the variable name, as my example here does. If we adopt the rule you suggest, then this example presents the same issue as the original example did.

Whereas if we get more strict about indentation in object literals, your contrived example would simply throw a compiler error, which would be easily resolved by putting return before six: or by wrapping the returned object in braces.

helixbass · 2017-08-11T17:35:25Z

@GeoffreyBooth I'd looked at #2998 previously and was skeptical of a clean solution. If it's worth "fixing", I think the solution should really check that (all of) the colons are in the same column to try and limit it to just this one case. And probably also that the mismatched indentation level isn't the actual previous indentation level, eg this edge case should not be an error, it should continue to compile into two separate objects (sort of like your example in the discussion with @connec):

x =
  a: 1
abc: 2

Not sure quite how hard either of those requirements are so I'd kind of written this one off as not worth fixing

GeoffreyBooth · 2017-08-12T22:44:25Z

@helixbass I don’t think a “colon column check” should be our determinant. For example, I can see someone writing this:

known_colors =
  1: 'cyan'
  2: 'sepia'
100: 'olive'
1243943949: 'red'

For me the bottom line is that the compiler should never compile any of the examples in this thread into two objects, e.g.:

var known_colors;

known_colors = {
  1: 'cyan',
  2: 'sepia'
};

({
  100: 'olive',
  1243943949: 'red'
});

This is just always wrong, and even worse than invalid JavaScript since it doesn’t throw a runtime error. I think it’s worth fixing, as it may not come up often but it can be very confusing to debug when it does.

GeoffreyBooth · 2017-08-14T00:43:03Z

I tweaked that test so that its original intent (I think) is preserved, despite the new requirement that freestanding unassigned object literals now can’t follow object literals.

@connec I think this will just mean that going forward, people will have to use braces or a keyword like return or throw before inline object literals that follow other object literals. To update your example:

f = ->
  o = 
    '>!': 3
  return six: -> 10

I would rather the compiler throw an error, forcing someone to adapt (like by inserting return in this example) than the compiler outputting two objects some of the time, despite what the user expects. That seems much safer to me. I also don’t want to rely on possibly-invalid assumptions that if a new object literal starts at the same indentation as the previous object literal started, the user intends to produce a new object; or that if the colons all line up (or don’t) that that has some significance. It’s very easy to work around this error if the user encounters it, so problem syntaxes are more concerning to me than not limiting the scope of this fix sufficiently.

helixbass · 2017-08-14T08:44:47Z

@GeoffreyBooth the example from the last comment compiles (against this PR) without explicit return. It only seems to fail to compile if it's at the top level. That observed behavior is what my comment was based on - I would've worded it more strongly had it been modifying existing behavior elsewhere besides at the top level

It would be a really bad idea to merge this if it were going to change behavior elsewhere besides at the top level. And I think it's quite unadvisable even at the top level. You're speculating heavily about use cases that would now break. And you know I'll recommend not introducing changes that will break existing code unless it's truly necessary. In this case, I think the justification for introducing it is quite weak. Outdenting a block of code almost anywhere will cause parsing to change (and not necessarily throw a compile-time or runtime error), one won't get far writing Coffeescript without learning this. So the only thing that distinguishes this "bug" from any other case of not indenting your code the way you need to indent it for it to work is the colons-lining-up aspect since it's kind of understandable that someone could assume that they could format their code that way

So the scope of a "fix" should just be to try and fail at compile-time in a helpful way if we're sure that they made this particular mistake. This shouldn't stop working:

x =
  a: 1
bcd: 2

Only this should:

x =
  a: 1
 bc: 2

And even that makes me nervous. It's a very specific use-case raised in #2998 and while understandable, it's also obviously a case of indenting one's code incorrectly. I can't justify introducing virtually any sort of breaking change to accommodate it. As this PR stands, it's way too broad/breaking. And even if it could be limited to lined-up colons and the outdented line being at an in-between indentation level, I'm still not sure it's worth potentially breaking legitimate code

connec · 2017-08-14T10:40:07Z

Personally I think we should we should move to more strict indentation rules, which would close this bug.

This would still be allowed:

x =
  a: 1
bcd: 2

But the hope would be that, in the case of lining up colons, there would be more a problematic entry that would alert the user:

x =
  a: 1
 bc: 2 # ParseError: inconsistent indentation (expected 2 spaces, saw 1)
def: 3

GeoffreyBooth · 2017-08-14T12:58:56Z

But the hope would be that, in the case of lining up colons, there would be more a problematic entry that would alert the user

Unfortunately I don’t see how there could be a foolproof way to detect such a thing. Consider:

x =
    a: 1
  bcd: 2
efghi: 3

or ditto if tabs are used to indent.

GeoffreyBooth · 2017-08-14T12:59:12Z

So is the consensus that we would rather allow the current behavior rather than throwing an error when an “object-ish” line follows an outdent? As in, I should close the original issue as invalid/expected behavior? @lydell or @jashkenas, do you have any thoughts?

connec · 2017-08-14T13:33:45Z

Unfortunately I don’t see how there could be a foolproof way to detect such a thing. Consider:
x =
    a: 1
  bcd: 2
efghi: 3

In that case I would expect an error on line 3: ParseError: inconsistent indentation (expected 4 spaces, saw 2).

connec · 2017-08-14T13:35:50Z

I think the fix in the PR is the wrong way to address it. I'd rather tighten up the the syntax to try and catch such things than add a bunch of special-case handling. To me, the approach taken here seems more like a feature of a linter or something similar.

GeoffreyBooth · 2017-08-14T18:56:42Z

I’d rather tighten up the the syntax to try and catch such things than add a bunch of special-case handling.

My original concern with this PR was that it was too narrowly focused, that it might not catch all cases of consecutive implicit objects that were intended to be single objects; but based on all the feedback, now I’m feeling like that’s a plus, as the effects of this PR should be minimal. If you want to make broader changes to how indentation is parsed, I feel like that’s a goal for another PR. Is there a reason not to approve this one, to close this specific edge case, and leave broader topics for future efforts?

All the code in the rewriter looks rather hacky. This PR doesn’t stand out in that regard. Inside the “implicit object” rewriting, when there’s a : token it looks back three tokens (before the key, in/outdentation, and line break) to see if a generated } token exists there, which would mean that an implicit object was closed on the previous line; and if so, an error is thrown. So I’m relying on a lot of already existing code for the processing of implicit objects; I’m really just checking for two implicit objects in a row, which is exactly what the original issue was and what this PR aims to prevent.

helixbass · 2017-08-14T22:36:08Z

@GeoffreyBooth I still don't think this PR does a very precise job of handling this "edge case" and so don't think it should be approved. I really don't think you should be conflating the case raised in #2998 with "all instances of consecutive, differently-indented implicit objects". I wrote some code today that contained an assignment to an implicit object followed by another (implicitly returned) implicit object. From my understanding, you want this PR to disallow that (even though I don't think it currently does). That just doesn't make sense

Like I explained, #2998 is only worth noticing differently than any other instances of someone incorrectly indenting their code b/c the colons lining up makes it seems like a weird/ugly bug. But it's really not. Your attempted fix even broke an existing test and you decided that the test's syntax shouldn't be allowed and changed the meaning of the test. @connec gave a similar opinion that this PR isn't a great way to tackle the problem (and I'm still questioning how much of a problem it really is)

The fact that you keep thinking that the effects of this PR are too narrow/only tackle this specific edge case worries me. I believe that the scope of the problem as raised in #2998 is extremely narrow (just the colon-aligned case) and that this PR is throwing a much larger net at it that has implications for breaking existing code and disallowing things that there's no legitimate grounds for considering to be invalid syntactically

GeoffreyBooth · 2017-08-14T22:51:33Z

Okay, so aside from the specifics of this PR, should we close #2998 as expected behavior? I don’t think there’s any way to throw an error for that edge case that doesn’t break legitimate use cases that have been given as examples above. Looking at indentation level or colon column each present their own problems, and don’t provide solutions in my opinion.

helixbass · 2017-08-14T23:23:20Z

@GeoffreyBooth that was my takeaway, that as much as it seems like it would be nice to give a helpful error message for that one case, there's probably not a clean enough way to detect and specially handle it. Agreed that there are problematic aspects both to implementation and soundness trying to wrangle just these cases by using indentation level or colon column. So ya maybe going ahead and closing/calling it a wontfix makes sense or could spur further insight if someone else feels really strongly about it being handled

lydell · 2017-08-15T05:41:41Z

After all this exploration and discussion, I guess it's fine to close #2998 as an unfortunate edge case.

jashkenas · 2017-08-18T16:37:43Z

Yes, I agree that y'all called this one right. It really is an incorrect indentation — and we really can't save people from indenting incorrectly.

It seems like the best aid for this problem is to be using a nice text editor that shows little lines on your indentation levels for you ... so that you'd immediately see that you're creating objects at two levels.

Fix jashkenas#2998: Object literal with inconsistent indentation shou…

97a7b2f

…ld throw an error

connec reviewed Aug 7, 2017

View reviewed changes

Restore (and revise) string-key objects newline test

39d00e1

GeoffreyBooth mentioned this pull request Aug 15, 2017

Lexer: Object literal with inconsistent indentation parsed as multiple objects #2998

Closed

GeoffreyBooth closed this Aug 15, 2017

GeoffreyBooth deleted the indentation-mismatch-in-implicit-object branch August 15, 2017 09:22

GeoffreyBooth mentioned this pull request Aug 23, 2017

Invalid JS: Multiline function calls sometimes compile invalidly #3453

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CS2] Fix #2998: Object literal with inconsistent indentation should throw an error #4630

[CS2] Fix #2998: Object literal with inconsistent indentation should throw an error #4630

GeoffreyBooth commented Aug 4, 2017

GeoffreyBooth commented Aug 5, 2017

connec Aug 7, 2017

GeoffreyBooth Aug 7, 2017

connec Aug 7, 2017

GeoffreyBooth Aug 7, 2017

helixbass commented Aug 11, 2017

GeoffreyBooth commented Aug 12, 2017

GeoffreyBooth commented Aug 14, 2017

helixbass commented Aug 14, 2017

connec commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

connec commented Aug 14, 2017

connec commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

helixbass commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

helixbass commented Aug 14, 2017

lydell commented Aug 15, 2017

jashkenas commented Aug 18, 2017

[CS2] Fix #2998: Object literal with inconsistent indentation should throw an error #4630

[CS2] Fix #2998: Object literal with inconsistent indentation should throw an error #4630

Conversation

GeoffreyBooth commented Aug 4, 2017

GeoffreyBooth commented Aug 5, 2017

connec Aug 7, 2017

Choose a reason for hiding this comment

GeoffreyBooth Aug 7, 2017

Choose a reason for hiding this comment

connec Aug 7, 2017

Choose a reason for hiding this comment

GeoffreyBooth Aug 7, 2017

Choose a reason for hiding this comment

helixbass commented Aug 11, 2017

GeoffreyBooth commented Aug 12, 2017

GeoffreyBooth commented Aug 14, 2017

helixbass commented Aug 14, 2017

connec commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

connec commented Aug 14, 2017

connec commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

helixbass commented Aug 14, 2017

GeoffreyBooth commented Aug 14, 2017

helixbass commented Aug 14, 2017

lydell commented Aug 15, 2017

jashkenas commented Aug 18, 2017