Binding expressions #1210

Open
eernstg opened this issue Sep 9, 2020 · 85 comments
Labels
feature (Proposed language feature that solves one or more problems), field-promotion (Issues related to addressing the lack of field promotion), nnbd (NNBD related issues)

Comments

@eernstg
Member

eernstg commented Sep 9, 2020

Binding expressions are expressions that introduce a local variable v whose name is either taken from the expression itself or specified explicitly. The variable is introduced into the enclosing scope, which is limited to the nearest enclosing statement (e.g., an enclosing if statement or loop, or an enclosing expression statement). The variable is in scope before the binding expression, but it is an error to refer to it there, and it is promoted based on the treatment of the binding expression.

NB: A variant of this proposal using postfix @ is available in this comment below.

The inspiration for this mechanism is #1201 'if-variables', where the conciseness of introducing a variable with an existing name was promoted, and #1191, 'binding type cast and type check', where the ability to introduce a new variable was associated with a general expression.

[Edit Oct 15 2020: Now assuming the proposal from @lrhn that each composite statement introduces a new scope. Nov 24 2021: Mention proposal about using @ rather than var. Nov 25 2021: Make the scope restriction a bit more tight, limiting it to the enclosing expression statement if that's the nearest enclosing statement, and similarly for other statements that include an expression, e.g., <returnStatement>.]

Examples and Motivation

In general, a binding expression can be recognized by having var and :.

It may or may not introduce a new name explicitly: var x: ... introduces the name x, and var:... does not introduce a name explicitly. When a binding expression does not introduce a name explicitly, a name is obtained from the rest of the binding expression. This means that names can be introduced very concisely, and they can be "well-known" in the context because the name is already being used there for some purpose.

A binding expression can be used to "snapshot" the value of a subexpression of a given expression:

void main() {
  int i = 42;
  // '42 has 6 bits':
  print('$i has ${i.var:bitLength} bit${bitLength != 1? 's':''}');
  print('$i has ${i.var length:bitLength} bit${length != 1? 's':''}');
}

The construct var:bitLength works as a selector (such that the getter bitLength is invoked), and it also introduces a local variable named bitLength whose value is the result returned by that getter invocation. The construct var length:bitLength gives the new variable the name length, and otherwise works the same. This works for method invocations as well:

void main() {
  var s = "Hello, world!";
  var s2 = s.var:substring(7).toUpperCase() + substring;
  print(s2); // 'WORLD!world!'.
}
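For comparison, here is a minimal sketch of what the two examples above would correspond to in today's Dart, where each snapshot needs a separate declaration before the expression that uses it. The function names describeBits and shoutThenEcho are made up for the illustration and are not part of the proposal.

```dart
// Today's Dart: the snapshot is declared first, then used.
String describeBits(int i) {
  final bitLength = i.bitLength; // What `i.var:bitLength` would bind.
  return '$i has $bitLength bit${bitLength != 1 ? 's' : ''}';
}

String shoutThenEcho(String s) {
  final substring = s.substring(7); // What `s.var:substring(7)` would bind.
  return substring.toUpperCase() + substring;
}

void main() {
  print(describeBits(42)); // 42 has 6 bits
  print(shoutThenEcho('Hello, world!')); // WORLD!world!
}
```

The binding expression would fold the declaration into the expression itself, which is the conciseness this proposal is after.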

In case of name clashes, it is possible to produce names based on templates where $ plays the role of the "default" name:

class Link {
  Link next;
  Link(this.next);
}

void main() {
  Link link = someExpression..var $1:next.var $2:next;
  if (next1 == next2) { /* Cycle detected */ }
}

Apart from selectors and cascade sections, binding expressions can also bind the value of a complete expression to a new variable:

class C {
  int? i = 42;

  void f1() {
    if (var:i != null) { // Snapshot instance variable `i`, create local `i`, scoped to the `if`.
      Expect.isTrue(i.isEven); // Local `i` is promoted by test.
      this.i = null; // Assignment to instance variable; `i = null` is an error.
      Expect.isTrue(i.isEven); // The local variable still has the value 42.
    } else {
      // Local `i` is in scope, with declared type `int?` from initialization.
      Expect.isTrue(i?.isEven);
    }
  }
}

The binding expression var:i introduces a new local variable named i into the scope of the if statement (the if statement is considered to be enclosed by a new scope, and the variable goes into that scope). It is an error to refer to that local variable before the binding expression, so if we have foo(var v1:e1, var v2:e2), e2 can refer to v1, but e1 cannot refer to v2.
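As a rough point of reference, the effect of if (var:i != null) can be approximated in today's Dart with an explicit block and a final local variable. This is only a sketch of the intended behavior, not a specified desugaring; the method name snapshotIsEven is made up for the illustration.

```dart
class C {
  int? i = 42;

  bool snapshotIsEven() {
    // The block stands in for the new scope around the `if` statement.
    {
      final i = this.i; // Snapshot of the instance variable, type `int?`.
      if (i != null) {
        this.i = null; // Updates the instance variable only.
        return i.isEven; // The local `i` is promoted to `int`.
      } else {
        return false;
      }
    }
  }
}

void main() {
  final c = C();
  print(c.snapshotIsEven()); // true
  print(c.i); // null
  print(c.snapshotIsEven()); // false
}
```

Note that the initializer has to be written as this.i here, whereas var:i would rely on the special lookup in the enclosing scope.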

A binding expression always introduces a final local variable. The main reason for this is that it is highly error prone to have a local variable whose name is the same as an instance variable, which serves as a proxy for the instance variable because it has the same value (at least initially), and then assignment to the local variable is interpreted to be an assignment to the instance variable.

A binding expression is intended to be a "small" syntactic construct. In particular, parentheses must be used whenever the given expression uses operators (var x: (a + b)) or other constructs with low precedence. It is basically intended to snapshot the value of an expression of the form <primary> <selector>*, that is, a receiver with a chain of member invocations, or even smaller things like a single identifier.

In return for this high precedence, we get the effect that expressions like var:i is T or var:i as T parse such that the binding expression is var:i, which is a useful construct because it introduces a variable i and possibly promotes it to T.

The conflict between two things named i is handled by a special lookup rule: For the construct var:i, the fresh variable i is introduced into the current scope, and the initialization of that variable is done using the i which is looked up in the enclosing scope. In particular, if i is an instance, static, or global variable, var:i will snapshot its value and provide access to that value via the newly introduced local variable with the same name.

This implies several things: It is an error to create a new variable using a binding expression if the resulting name clashes with a declaration in the current scope. (The template based naming that makes $1 expand to next1 and such helps create similar, non-clashing names.) Also, in the typical case where a binding expression introduces a name i for a local variable which is also the name of an instance variable, every access to the instance variable in that scope must use this.i. This may serve as a hint to readers: If a function uses this.i = 42 then it may be because the name i is introduced by a binding expression.

Update Oct 15: @lrhn's proposal that every composite statement should introduce a scope containing just that statement is assumed in this proposal, and it is extended to wrap every <expressionStatement>, <returnStatement>, and a few others, in a new scope as well. This implies that when a binding expression introduces a variable and it is added to the current scope, it will be scoped to the enclosing statement S, which may include nested statements if S is a composite statement like a loop. If a binding expression occurs in a composite statement then it will introduce a variable which is available in that whole composite statement (e.g., also the else branch of an if statement), but only there.

Grammar

The grammar is updated as follows:

<postfixExpression> ::= // Modified rule.
  <assignableExpression> <postfixOperator> |
  <expressionBinder>? <primary> <selector>*

<unconditionalAssignableSelector> ::= // Modified rule.
  '[' <expression> ']' |
  '.' <expressionBinder>? <identifier>

<assignableSelector> ::= // Modified rule.
  <unconditionalAssignableSelector> |
  '?.' <expressionBinder>? <identifier> |
  '?' '[' <expression> ']'
 
<cascadeSelector> ::= // Modified rule.
  '[' <expression> ']' |
  <expressionBinder>? <identifier>

<expressionBinder> ::= // New rule.
  'var' <identifier>? ':'

The overall rule is that a binding expression can be recognized by having var and :, which makes it different from a local variable declaration (with var and =), thus helping both readers and parsers to disambiguate. Apart from the fact that both = and : are used to provide values for a variable in Dart (e.g., for variable initializers and named parameters), the rationale for using : is that it makes var:x introduce a variable named x and obtain its value using the x that we would get if this new variable had not been created. Various other syntaxes seem less suggestive.

Static Analysis

Every form of binding expression e introduces a final local variable into the current scope of e. Below we just specify the name and type of the variable, and the scope and finality are implied.

This proposal includes the following change to the scoping rules: Each statement S is immediately enclosed in a new scope if S is derived from one of the following: <forStatement>, <ifStatement>, <whileStatement>, <doStatement>, <switchStatement>, <expressionStatement>, <returnStatement>, <assertStatement>, <yieldStatement>, <yieldEachStatement>. For instance, print('Hello!'); is treated as { print('Hello!'); } and if (b) S is treated as { if (b) S }.
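The practical consequence of this rule can be sketched in today's Dart by writing the extra block by hand. The helper lookUp and the function classify are hypothetical and only serve the illustration.

```dart
// Hypothetical data source, standing in for any nullable expression.
int? lookUp(String key) => key == 'answer' ? 42 : null;

String classify(String key) {
  final String result;
  {
    // The extra block plays the role of the implicit scope around the `if`.
    final value = lookUp(key);
    if (value != null) {
      result = 'found ${value.isEven ? 'even' : 'odd'} value $value';
    } else {
      result = 'nothing for $key'; // `value` is in scope here as well.
    }
  }
  // `value` is no longer in scope here.
  return result;
}

void main() {
  print(classify('answer')); // found even value 42
  print(classify('other')); // nothing for other
}
```

With the proposed rule, a variable bound in the condition would be visible in both branches of the if statement and nowhere after it, without the explicit braces.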

A local variable declaration D is treated such that variables introduced by a binding expression in D are in scope in that local variable declaration, and not outside D. This cannot be specified as a syntactic desugaring step, but it can be specified in a similar manner as the scoping for local variables introduced by an initializing formal parameter.

The main rationale for making the variable final in all cases is that it is highly error prone to save the value of an existing variable x (for instance, an instance variable) in a local variable whose name is also x, and then later assign a new value to x under the assumption that it is an update to that other variable. In this case the assignment must use this.x = e because the local variable x is in scope and is final.

If a variable x is introduced by a binding expression e then it is a compile-time error to refer to x in the current scope of e at any point which is textually before the end of e.

A binding expression of the form var x: e1 introduces a variable named x with the static type of e1 as its declared type.

A binding expression of the form var:e1 is a compile-time error unless e1 is an identifier.

A binding expression of the form var:x where x is an identifier introduces a variable named x with the static type of x in the enclosing scope as the declared type.

Note that x is looked up in the enclosing scope, not the current scope. It would be a compile-time error to look it up in the current scope (because that's a reference to the variable itself before its defining expression ends), and it is likely to be useful to look it up in the enclosing scope because the intended variable/getter would often be a non-local variable.

A binding expression of the form e1?.var x:m<types>(arguments) where ? and <types> may be omitted introduces a variable named x whose declared type is the static type of e1.m<types>(arguments) when this selector does not participate in null shorting, and the static type of e1?.m<types>(arguments) when it participates in null shorting.

A binding expression of the form e1?.var:m<types>(arguments) where ? and <types> may be omitted introduces a variable named m whose declared type is the static type of e1.m<types>(arguments) when this selector does not participate in null shorting, and the static type of e1?.m<types>(arguments) when it participates in null shorting.

If the previous two cases are not applicable, a binding expression of the form e1?.var x:m where ? may be omitted introduces a variable named x whose declared type is the static type of e1.m when this selector does not participate in null shorting, and the static type of e1?.m when it participates in null shorting; and a binding expression of the form e1.var:m introduces a variable named m, with the same static types as the previous variant.

A binding expression of the form e?..var x:m<types>(arguments) where ? and <types> may be omitted introduces a variable named x whose declared type is the static type of e0.m<types>(arguments) when this cascade section does not participate in null shorting, and the static type of e0?.m<types>(arguments) when it participates in null shorting, where e0 is the receiver of the cascade.

A binding expression of the form e?..var:m<types>(arguments) where ? and <types> may be omitted introduces a variable named m whose declared type is the same as in the corresponding situation in the previous case.

If the previous two cases are not applicable, a binding expression of the form e?..var x:m where ? may be omitted introduces a variable named x whose declared type is the static type of e0.m when this cascade section does not participate in null shorting, and the static type of e0?.m when it participates in null shorting, where e0 is the receiver of the cascade. Similarly, a binding expression of the form e?..var:m where ? may be omitted introduces a variable named m with the same declared type as the corresponding case.

In all these cases, the expression that provides the declared type of the newly introduced variable is known as the initializing expression for the variable.

In all these cases, and in expressions with control flow (conditional expressions, logical operators like && and ||, and so on), it is a compile-time error to evaluate a variable v which has been introduced by a binding expression e unless e is guaranteed to have been evaluated at the location where the evaluation of v occurs.

A special mechanism is provided for the case where a selector has a name that clashes with an existing name in the current scope: If a $ occurs in the specified name then the name of the newly introduced variable replaces $ by the name that would be used if no name had been specified, capitalizing it unless $ is the first character.

For example, a.var $1:next.var $2:next introduces a variable named next1 and another variable named next2; a.var first$:bar.var second$:bar introduces variables firstBar and secondBar; and var x$x: 'Hello' is an error.
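The naming rule can be sketched as a small string function. This is an illustration only: the function name expandTemplate is made up, and the sketch assumes that only the first $ in the template matters (the text above does not say how several $ characters would be handled).

```dart
/// Expands a name template as described above (a sketch, not a spec).
///
/// [defaultName] is the name that would be used if no name had been
/// specified, or null if the expression provides no such name.
String expandTemplate(String template, String? defaultName) {
  final index = template.indexOf(r'$');
  if (index < 0) return template; // No `$`: the name is used as written.
  if (defaultName == null || defaultName.isEmpty) {
    throw ArgumentError('`\$` needs a default name, but there is none');
  }
  // Capitalize the default name unless `$` is the first character.
  final name = index == 0
      ? defaultName
      : defaultName[0].toUpperCase() + defaultName.substring(1);
  return template.replaceRange(index, index + 1, name);
}

void main() {
  print(expandTemplate(r'$1', 'next')); // next1
  print(expandTemplate(r'first$', 'bar')); // firstBar
  print(expandTemplate('length', 'bitLength')); // length
}
```

Under this reading, var x$x: 'Hello' is an error because a string literal offers no default name to substitute for $.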

If a variable x is introduced by a binding expression e then let e0[[e]] be a notation for the enclosing expression e0 with a "hole" that contains e. Promotion is then applied to x as if it had occurred in an expression of the form e0[[x]].

For example, var x: e is T promotes x to type T iff x is T would have promoted x.
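In today's Dart the same promotion requires a separate final local, as in the following sketch. The class Box and the function render are hypothetical names chosen for the illustration.

```dart
class Box {
  Object content; // Public, non-final field: not promotable by itself.
  Box(this.content);
}

String render(Box box) {
  final x = box.content; // What `var x: box.content` would bind.
  if (x is int) {
    return 'int with ${x.bitLength} bits'; // `x` is promoted to `int`.
  }
  return 'not an int: $x';
}

void main() {
  print(render(Box(42))); // int with 6 bits
  print(render(Box('hi'))); // not an int: hi
}
```

With a binding expression, the test could presumably be written as if (var x: box.content is int), with x promoted inside the if statement in the same way.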

Dynamic Semantics

A local variable x introduced into a scope by a binding expression e is initialized to the value of the initializing expression of x at the time where e is evaluated. If the initializing expression of x is not evaluated (due to null shorting), x is initialized to null.
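The null shorting case can be approximated in today's Dart as follows. The function name report is made up for the illustration; the local variable stands in for what s?.var:length would introduce.

```dart
String report(String? s) {
  // The declared type is that of `s?.length`, that is, `int?`. When `s` is
  // null the getter is never invoked and the variable holds null.
  final length = s?.length;
  return length == null ? 'no length' : 'length $length';
}

void main() {
  print(report(null)); // no length
  print(report('abc')); // length 3
}
```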

Discussion

About Generated Names

The mechanism that generates a name from a template may be considered arbitrary: It depends on $, it uses capitalization (which is only fit for camelCasedNaming), and it is unique in Dart in that it creates a name based on a textual transformation. It was included because it seems likely that the use of binding expressions will create name clashes where there is a genuine need for creating several similar names.

The syntax is quite bulky: a.b.var firstName: b.var secondName: b makes it difficult to see the core expression a.b.b.b. It may be helpful to format this kind of construct with plenty of whitespace, such that the introduction of new variables is emphasized:

void main() {
  print(a.b
      .var firstName: b
      .var secondName: b);
}

Alternative Syntax: Use @ and @name:

In this comment @jodinathan proposed using @ rather than var as the syntactic element that initiates the variable declaration:

// As proposed above:

void main() {
  int i = 42;
  // '42 has 6 bits':
  print('$i has ${i.var:bitLength} bit${bitLength != 1? 's':''}');
  print('$i has ${i.var length:bitLength} bit${length != 1? 's':''}');
}

// With the `@` based proposal:
void main() {
  int i = 42;
  // '42 has 6 bits':
  print('$i has ${i.@bitLength} bit${bitLength != 1? 's':''}');
  print('$i has ${i.@length:bitLength} bit${length != 1? 's':''}');
}

One (admittedly subjective) benefit is that the visually disruptive space after var is avoided, and another one is that @ can be used before any <selector>, e.g., myList@firstElement[0] would correspond to var firstElement = myList[0].

One (similarly subjective) drawback is that there is no hint in the syntax itself about the fact that a variable is being declared. For instance, a.@b would not be seen as a construct that declares a variable named b by any developer who hasn't been told explicitly that this is exactly what that @ does. However, if it's used frequently enough then it probably doesn't matter much whether or not we can guess what it means the very first time we see it, no explanations given.

@eernstg eernstg added the feature Proposed language feature that solves one or more problems label Sep 9, 2020
@lrhn
Member

lrhn commented Sep 11, 2020

It's a very interesting concept.

It is necessarily limited to selectors because it has to link to an identifier to make the name optional.
That means that it doesn't work for foo(bar(42), ... same value as bar(42)...) unless you can also write foo(var b: bar(42), b) as a non-selector. Then we are really back to making variable declarations valid expressions, evaluating to their values like a normal assignment, and we could just write foo(var b = bar(42), b).

I'm not sold on the syntax. The : seems like it separates more than the . so foo.var x:bar seems more like the var x binds to the previous foo than to the bar.
(Maybe that's a different approach: Suffix .var or .var x binds the previous expression. Then we'd have print('$i has ${i.bitLength.var} bit${bitLength != 1? 's':''}'); or foo(bar(36).var, bar). If the previous expression is a named function call or variable, we bind to that name. Probably an issue if you do foo.var and introduce a new variable named foo in the same scope, though, unless we really restrict the scope to downstream from the declaration, not just to the entire same block).

@eernstg
Member Author

eernstg commented Sep 11, 2020

It's a very interesting concept.

Thanks, it builds on nice sources of inspiration, too. ;-)

It is necessarily limited to selectors

That's actually not true: Any binding expression can be named (var x: e), and it is also possible to use an unnamed binding expression when the expression is an identifier, as in var:x. In that case the x is introduced into the current scope, and the initializing expression which is also x is looked up in the enclosing scope.

The construct is deliberately limited to "small" constructs (<primary> <selector>*), such that it fits in with tests like var:x is T (meaning (var:x) is T). This means that we need parentheses for var x: (a + b), but my hunch is that it's better to use a regular local variable declaration for "big" expressions anyway.

foo(bar(42), ... same value as bar(42)...)

That would be handled as foo(var b: bar(42), ...b...), as you mentioned.

we could just write foo(var b = bar(42), b).

I think it would be difficult to parse a <localVariableDeclaration> as an expression unambiguously, or it would have to have a very low precedence. That's one of the reasons why I'm using var along with :, and giving the whole construct a rather high precedence (making it very similar to null shorting in binding power).

The : seems like it separates more than the .

Yes, that is definitely an issue. I think we can at least reduce the reading difficulty in a few ways: We could use a standard formatting where .var is forced to go on a new line, except for the shortest constructs like a.var:b. So we basically use a newline if there is a binding expression that needs a space, and then the eye can catch on to .var and understand what is going on.

I'm also thinking that var will light up in a different color in most IDEs, so a.var:b will self-announce the variable to some extent, which makes it more readable in the case where we don't have .var first on a line.

Suffix .var or .var x binds the previous expression

It would be interesting to study the implications of that design. However, I tend to like the fact that var x: expression and receiver.var x:getter are a bit more like var x = expression and var x = receiver.getter, respectively, than expression.var x and receiver.getter.var x are.

restrict the scope to downstream from the declaration

I don't think we should do that: I prefer your approach from the binding type checks/tests, where the new variable is added to the current scope. I think it's going to be considerably more error prone to have two variables named x in the same visual scope (as shown by braces and indentation), and then suddenly we switch over from one to the other because of a binding expression which is hidden deep in an expression.

@leafpetersen
Member

@eernstg I would like to explore what some actual code looks like under these various proposals. Here's an example piece of code that I migrated, which I think suffers from the lack of field promotion. This is from the first patchset from this CL - I subsequently refactored it to use local variables, the result of which can be seen in the final patchset of that CL. This was a fairly irritating refactor, and I don't particularly like the result. Would you mind taking a crack at showing how this code would look under your proposal?

// The type of `current` is `Node`, and the type of the `.left` and `.right` fields is `Node?`.
while (true) {
  comp = _compare(current.key, key);
  if (comp > 0) {
    if (current.left == null) break;
    comp = _compare(current.left!.key, key);
    if (comp > 0) {
      // Rotate right.
      Node tmp = current.left!;
      current.left = tmp.right;
      tmp.right = current;
      current = tmp;
      if (current.left == null) break;
    }
    // Link right.
    right.left = current;
    right = current;
    current = current.left!;
  } else if (comp < 0) {
    if (current.right == null) break;
    comp = _compare(current.right!.key, key);
    if (comp < 0) {
      // Rotate left.
      Node tmp = current.right!;
      current.right = tmp.left;
      tmp.left = current;
      current = tmp;
      if (current.right == null) break;
    }
    // Link left.
    left.right = current;
    left = current;
    current = current.right!;
  } else {
    break;
  }
}

@eernstg
Member Author

eernstg commented Sep 14, 2020

Sure, here we go:

  // Context, based on CL:
  // Comparator<K> get _compare: Instance getter of enclosing class.
  // K key: parameter to enclosing function.
  // int comp: local variable.
  // Node current: local variable.
  // It seems that `right` and `left` are local variables (the code won't work if it's the fields).
  while (true) {
    comp = _compare(current.key, key);
    if (comp > 0) {
      if (var oldLeft: current.left == null) break; // Snapshot `current.left` as `oldLeft`.
      comp = _compare(oldLeft.key, key);            // `current.left!` -> `oldLeft`.
      if (comp > 0) {
        // Rotate right.
        current.left = oldLeft.right;               // Replace `tmp` by `oldLeft`: it already has that value.
        oldLeft.right = current;
        current = oldLeft;
        if (current.left == null) break;            // NB: Not `oldLeft`, it's based on another `current`.
      }
      // Link right.
      right.left = current;                         // Not sure how `right` was initialized?
      right = current;
      current = current.left!;                      // Snapshot of `current.left` not helpful here.
    } else if (comp < 0) {
      if (var oldRight: current.right == null) break;
      comp = _compare(oldRight.key, key);
      if (comp < 0) {
        // Rotate left.
        current.right = oldRight.left;
        oldRight.left = current;
        current = oldRight;
        if (current.right == null) break;
      }
      // Link left.
      left.right = current;                         // Again, not sure how `left` was initialized.
      left = current;
      current = current.right!;
    } else {
      break;
    }
  }

It could have been slightly more concise if I had used the existing names left and right for the new local variables (that could be done with if (current.var:left == null) break;), but that would clash with the names left and right which are used already in the code.

I can see that SplayTree has fields named left and right, and the post-migration code has local variables left and right in the enclosing method (with declared type Node?, and with different treatment such that they are promoted to Node as needed). I just preserved the behavior as given in this example code, and treated left and right as local variables with type Node, such that the given code does not have an error at right.left = current; and similar statements.

I think it's worth noting that this code is updating the field that we're snapshotting, which is basically the worst case for using a snapshot. I used the names oldLeft and oldRight in order to emphasize that there is a difference between the snapshot and the syntactic expression which was used to get the snapshot (otherwise the names currentLeft/currentRight would have been a natural choice).

So there would presumably be lots of situations where we are actually just using the value of a field (and not updating the same field), and we just want to remember the outcome of null tests and type tests, and there are no name clashes. In that case we could of course use shorter names like left and right.

@AKushWarrior

Honestly, I think this syntax is far more versatile than the one in #1201 . Since : is a clear definition of binding expression assignment, it's usable virtually everywhere, and is clear in most cases.

To address @lrhn's concerns about getter binding expressions such as foo.var x:bar, perhaps there could be a lint, "Prefer enclosing vague binding expressions in parentheses", which corrects code to, e.g., foo.(var x: bar)? Would allowing for the potential parentheses there be a structural issue? It seems to me that that correction eliminates the ambiguity as to what order things are evaluated in; I don't think that code conflicts with anything, because parentheses directly after a dot are currently not supported.

Also, @eernstg, instead of typing out oldLeft in your example, wouldn't it make more sense to use $Old? Isn't the point of the $ syntax to demarcate the similarity in meaning while preserving sanity for later usages?

@lrhn
Member

lrhn commented Sep 16, 2020

I think the parentheses still separate the foo. from the bar. Maybe if it was foo.(var x:)bar or foo.(var x =)bar.

If we allow var x: as a prefix operation in general (so 1 + var x: e is also valid), and we allow prefix operations in selectors (#1216), so foo.~bar is valid and equivalent to (~foo.bar), then we might get used to reading things like foo.var x:bar, since it's consistent with other operations.
(But we could also use foo.var x = bar then, the var ensures that the = isn't ambiguous).

We probably need the var (or final), which means no types: foo.int x:bar/foo.int x = bar. I'd be worried about parsing that, although it's not impossible that it's doable.

@AKushWarrior

AKushWarrior commented Sep 16, 2020

@lrhn

I think the parentheses still separates the foo. from the bar. Maybe if it was foo.(var x:)bar or foo.(var x =)bar.

foo.(var x :)bar.getterOfBar or foo.(var x =)bar.getterOfBar, makes sense to me, though it would be more unclear without those parentheses:

if foo.~bar is valid and equivalent to (~foo.bar), then we might get used to reading things like foo.var x:bar, since it's consistent with other operations.

foo.var x: bar.getterOfBar is still ambiguous because foo.var almost looks like a type. Of course, IDEs would probably light up the var, so it might be okay, but that looks kind of hellacious to read while browsing a GitHub repository. That's the primary motivation behind the parentheses idea; it's also not unheard of to use prepended parentheses to perform an action while maintaining the value of the variable (Java casts come to mind), so it's kind of semantically familiar.

(But we could also use foo.var x = bar then, the var ensures that the = isn't ambiguous).

I do think that there's value in having a distinct binding operator (e.g. colon) as opposed to equals, because binding assignments are clearly different from a normal assignment and should be visually distinct. On the other hand, it might also be syntactically confusing, because : in other languages is used to denote a type. It might be easier to stick with an equals sign because of the confusion factor, or perhaps use something like := instead.

We probably need the var (or final), which means no types: foo.int x:bar/foo.int x = bar. I'd be worried about parsing that, although it's not impossible that it's doable.

As to the whole parsing types thing, I agree that it's an issue that likely can't be solved without making distinct class and property names mandatory, which is definitely a verbose anti-pattern. It might take some advanced contextual analysis to distinctly allow foo.BarType x = bar.barGetter without also saying BarType is not a valid property of foo.

It might (?) be more feasible if BarType x = was couched in parentheses: because then you could explicitly search for those parentheses before a getter/method, and do static type analysis from there.

@AKushWarrior

@tatumizer see #1201 for a better idea of what this is meant to solve. It is an enhanced assignment operator, but it has potential to be a lot more.

@AKushWarrior

AKushWarrior commented Sep 17, 2020

Ah, okay I think I misunderstood your original point. So my understanding is that the whole point is to allow adding var in more places; there is disagreement over how to do that. The argument for some syntactical distinction is that this binding kind of assignment is actually different; it also represents an object, where normal assignment usually represents nothing.

    String baseData = "";
    var x = baseData.(var y:)length > 0; //This version
    var a = baseData.var b = length > 0; //Using current assignment syntax

Take this example. If you use a separate operator, you can nest binding assignments virtually anywhere; the first pretty clearly displays what's going on. The second uses the current assignment syntax, which brings up a few issues:

  • How do you parse that and realize that the second = is meant to be a binding assignment, where the first is a typical assignment?
  • What order does the call resolve in?
  • How is a newcomer supposed to identify why the same operator sometimes results in an object and other times results in a void?

Of course, that's not to say that this syntax doesn't have some issues. I personally like it better just for clarity's sake.

@AKushWarrior

AKushWarrior commented Sep 17, 2020

void main() {
  String baseData = "";
  var length;
  var x = (length=baseData.length).bitLength > 0;
  print (length); // prints 0
}  

Okay... but that's not the point of the issue. The idea is to not have to say (var length = baseData.length). Instead, you could say the cleaner baseData.(var length:)length;. You then don't have to deal with nesting and can chain these together.

The two are strictly different, as well:
In someOp((var length = baseData.length));, you perform someOp on the created copy length of baseData.length, not the original object.
In someOp(baseData.(var length:)length) the copy is created, and then you perform someOp on the original object.

@AKushWarrior

Your example can be more consistently treated as an extension of cascade operator - again, we can just allow adding "var" there:

someOp(baseData..var length=length); // note the double dot

Cascade returns the original object.

yes it does, but... why do more work and introduce parsing complexities to make code harder to read?

@AKushWarrior

why do more work and introduce parsing complexities to make code harder to read?

The Law of Parsimony. Why introduce new syntax where the old one suffices?

True. It's a judgement call and probably personal preference. Ultimately, it'll be up to the Dart devs, not us; I think that we've probably exhausted this particular argument now.

@eernstg
Member Author

eernstg commented Sep 17, 2020

@tatumizer wrote:

Whatever this new construct aspires to achieve, can be done using existing dart syntax ..
What is missing here is just a way to declare bitLength variable inline

Right, that is the topic of this discussion: Can we find a good way to introduce local variables which is more flexible than <localVariableDeclaration>. This proposal allows an expression to introduce a local variable.

we can allow adding "var" in more places

That's what basically all these proposals (#1201, #1191, and this one) do. This proposal uses : rather than =, but otherwise the syntax is quite similar to <localVariableDeclaration>.

The use of : disambiguates the construct; basically, = is much more ambiguous when it occurs in an expression than : because it is used for a larger number of things already. The use of : to bind a variable to a value is known from named parameters already, and (to me) it seems reasonable to use the form var:x to say that "x is a new local variable, and it's initialized by the meaning of x that we get if we skip that new local variable".

What I propose is to allow an inline declaration instead of a separate "length" declaration:

void main() {
  String baseData = "";
  var x = (var length=baseData.length).bitLength > 0;
  print (length);
}  

This is a slight extension of the existing syntax, which doesn't require much explanation.

Indeed. However, this proposal also aims to allow concise forms where the name of the new variable is taken from the initializing expression or selector. For example, the above could be written as follows:

void main() {
  String baseData = "";
  var x = baseData.var:length.bitLength > 0;
  print(length);
}

So the point is not that it is impossible to allow for <localVariableDeclaration>-ish constructs in some new locations, the point is that this proposal does nearly exactly that, with some twists that allow for a concise form in many cases.

Why introduce new syntax where the old one suffices?

I'm proposing to use : rather than = in order to disambiguate (for parsers and human beings), and in order to support the conciseness of forms like a.var:b and var:x. I think var=x looks less meaningful than var:x.

I would expect the named form to be used for whole expressions (with selectors I'd expect the nameless form to be much more common), like var x: e, and they can of course occur as (var x: e) anywhere a <primary> can occur (which is almost anywhere).

It might look slightly more "normal" if we were to use (var x = e), but that would be somewhat misleading because the binding expression has different properties. In particular, the binding expression allows for the variable to be promoted, as in var:x is T or (var x: e) is T.
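For comparison, here is a minimal sketch in current Dart of the snapshot-then-promote pattern that var:x is T would abbreviate. The Holder class and lengthOrZero method are hypothetical names used only for illustration. The sketch assumes the binding expression behaves like a final local initialized from the field.

```dart
// Hypothetical example: a public, non-final field is not promoted by `is`,
// so it is first copied into a final local variable, which can be promoted.
class Holder {
  Object? x;
  Holder(this.x);

  int lengthOrZero() {
    final x = this.x; // roughly the variable that `var:x` would introduce
    if (x is String) {
      return x.length; // the local `x` is promoted to String here
    }
    return 0;
  }
}

void main() {
  print(Holder('abc').lengthOrZero()); // prints 3
  print(Holder(42).lengthOrZero()); // prints 0
}
```

The extra declaration line is what the binding expression is meant to remove. The promotion itself already works this way for local variables.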

@eernstg
Member Author

eernstg commented Sep 17, 2020

@AKushWarrior wrote:

perhaps .. lint which corrected code to, e.g., foo.(var x: bar)?

I think the grammar could rather easily be adjusted to allow parentheses along these lines, and it might improve the readability in some cases. However, it's not obvious that they would appear to be very "natural" if they apply to the selector syntactically: foo.(var x: bar)(42) would make the method invocation look funny, and if we make it foo.(var x: bar(42)) then we need to consider what it would mean to say foo.(var x: bar<int>(42).baz[0]).qux, and so on.

@lrhn wrote:

Maybe if it was foo.(var x:)bar

That would eliminate the complexities associated with parentheses around a non-singleton sequence of selectors.

But I still suspect that it's equally easy to learn to look further for the selector if the eye hits .var, because the : is right there after the identifier.

if .. 1 + var x :e .. is also valid,

1 + var x:e is in fact valid in the proposal when e can be derived from <primary> <selector>*. So we can have 1 + var x: foo.bar(42).and?.so.on, but it must be 1 + var x: (2 + 3) in order to bind x to the value of 2 + 3.

But we could also use foo.var x = bar then, the var ensures that the = isn't ambiguous

True, but that only works when var is used on a selector, not for usages on expressions. For instance, if var x = (2 + 3) + 1 occurs as an expression statement then we can't see whether it's a <localVariableDeclaration> or a <bindingExpression>.

I think it makes sense to use : everywhere for binding expressions because it removes a lot of ambiguity, both for parsers and for human beings.

@lrhn
Member

lrhn commented Sep 17, 2020

If the inline foo.var x:bar().baz binds x to foo.bar(), then should a leading var x:foo.bar().baz bind x to foo or to foo.bar().baz?

The latter seems to be what you are suggesting, but it feels inconsistent (but then, if we have inline prefix operators, then they should probably all work that way, and then it is consistent).

Also, why let foo.var x:bar().baz(x) bind x to foo.bar() and not just foo.bar? What if foo.bar is a function-typed getter?
What if I want foo.var x: bar()!.baz(x) to bind to bar()!? There is no way to control the range of the binding except by special-casing "selectors" (and the fact that .foo(args) is parsed as two selectors instead of one is an accident of grammar design, it's not treated that way by the semantics).

So, I'd still prefer a suffix binding to an inline prefix binding, if we want to avoid parentheses, or just a prefix declaration if we don't care about that:

1 + (var x = foo.bar()).baz(x)  // general declaration-as-an-expression, no selector special-casing
// or
1 + foo.bar().var x.baz(x)  // and then `1 + foo.bar() as<var x>.baz(x)` is very close.

@eernstg
Member Author

eernstg commented Sep 17, 2020

should a leading var x:foo.bar().baz bind x to foo or to foo.bar().baz?

As specified, x is bound to foo.bar().baz, because it parses a <primary> <selector>* after the :. If you want to bind it to foo then you'd need to have (var x:foo).bar().baz.

What if I want foo.var x: bar()!.baz(x) to bind to bar()!?

True, there's no specialized support for doing that. Again, I'm prioritizing a likely useful choice and conciseness, so there is no new syntactic device for specifying less common choices. But we could use (var x: foo.bar()!).baz(x).

Of course, there's a need to use a more verbose rewrite if we have null-shorting in the part that we wish to bind to the new variable. As usual, I gave priority to the interpretation that I found most likely to be useful, so the binding selector participates in null-shorting and the type of the new variable is nullable when its selector can be null shorted.
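The nullability rule can be illustrated with a rough analogue in current Dart, assuming the new variable is typed like an ordinary local initialized with a null-shorting expression. Foo, bar, and lengthViaSnapshot are hypothetical names.

```dart
// Hypothetical example: when the receiver may be null, the snapshot of a
// null-shorting call has a nullable type even though `bar` returns String.
class Foo {
  String bar() => 'hello';
}

int? lengthViaSnapshot(Foo? foo) {
  final x = foo?.bar(); // static type is String? because of `?.`
  return x?.length; // the rest of the chain must handle null as well
}

void main() {
  print(lengthViaSnapshot(Foo())); // prints 5
  print(lengthViaSnapshot(null)); // prints null
}
```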

@ds84182

ds84182 commented Sep 17, 2020

One thing I'm worried about here is the use of var, which implies that it can be "rebound" mid expression: [var x:123, x++, x++, x++]

But that does not seem to be the case. Would val (like Kotlin) or let (like Rust) make more sense here? Also related: #136

@AKushWarrior

But that does not seem to be the case.

Why not? I don't see anything contradicting this.

@ds84182

ds84182 commented Sep 18, 2020

@AKushWarrior

A binding expression always introduces a final local variable. The main reason for this is that it is highly error prone to have a local variable whose name is the same as an instance variable, which serves as a proxy for the instance variable because it has the same value (at least initially), and then assignment to the local variable is interpreted to be an assignment to the instance variable.

@AKushWarrior

void main() {
  var x=123;
  var y:123;
}

Are these declarations equivalent? If not, what's the difference?

The latter is an antipattern; the latter also represents an object, whereas the former represents void.

@AKushWarrior

@AKushWarrior

A binding expression always introduces a final local variable. The main reason for this is that it is highly error prone to have a local variable whose name is the same as an instance variable, which serves as a proxy for the instance variable because it has the same value (at least initially), and then assignment to the local variable is interpreted to be an assignment to the instance variable.

Ah. I think that maybe final is good instead of var then, because we already have that keyword.

@eernstg
Member Author

eernstg commented Sep 18, 2020

@AKushWarrior wrote:

Ah. I think that maybe final is good instead of var then

I chose var because it is shorter (and there's no end to the amount of pain inflicted by long keywords ;-). It would be more consistent to use final, because "that's what it means, anyway!". But any misunderstanding with respect to this will be resolved at compile-time because any attempt to assign to such a variable will be an error, so it's not likely to cause bugs.

@eernstg
Member Author

eernstg commented Sep 18, 2020

@tatumizer wrote:

void main() {
    var x=123;
    var y:123;
}

Are these declarations equivalent? If not, what's the difference?

I agree with @AKushWarrior's remarks on this, but also: y is final; and the binding expression restricts the value to be derived from <primary> <selector>*, so you can't just switch to var y: e in general, and if you use var y:a + b; where a is derived from <primary> <selector>* (e.g., it could be an identifier) then you'll initialize y to the value of a, not a + b.

The binding expression is really intended to be used with "small" initializing terms (expressions or selectors), inside a bigger expression, and if you want to create a variable and bind it to a "big" expression then a <localVariableDeclaration> is the natural choice.
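The precedence point (var y:a + b binds y to a, not to a + b) can be mimicked in current Dart with an assignment expression on a pre-declared final local. This is only an analogy under that assumption. The function name boundAndTotal is hypothetical.

```dart
// Hypothetical analogy: `(y = a) + b` stores `a` in `y` and then adds `b`,
// which mirrors how `var y:a + b` is described as parsing: `(var y: a) + b`.
List<int> boundAndTotal(int a, int b) {
  final int y; // assigned exactly once below
  final total = (y = a) + b;
  return [y, total];
}

void main() {
  print(boundAndTotal(2, 5)); // prints [2, 7]
}
```

To bind the whole sum instead, the proposal requires parentheses around the initializing expression, as in var y: (a + b).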

@eernstg
Member Author

eernstg commented Sep 18, 2020

@lrhn wrote, about suffix forms:

1 + foo.bar() as<var x>.baz(x) is very close

It's definitely an interesting variant of the proposal to use a suffix form.

If we allow 'as' '<' 'var' identifier? '>' as a selector then it could bind x (so foo.bar()() as<var x>.baz would give x the value of foo.bar()()), and it could use the nearest name as the default name (so foo.bar()() as<var>.baz would bind bar to that value).

But we'd need some extra disambiguation in order to bind the suffix form to any other expression. Perhaps it would just require parentheses (in which case it's still a selector, so we don't have any other syntactic forms at all than the selector), like (a + b) as<var x>, in the cases where the expression isn't derivable from <primary> <selector>* or the similar case for cascades.

It should be possible. It's not obvious to me that one is much better than the other:

class C {
  int a, b;

  // Usages; assume different scopes, hence no name clashes.
  void foo() {
      // Introduce a new name, apply to expression.
      (a + b) as<var x> + x;
      var x:(a + b) + x;

      // Reuse name of identifier from enclosing scope.
      a as<var> + a;
      var:a + a;

      // Use 'default' name, apply to selector.
      a.bitLength as<var> + bitLength;
      a.var:bitLength + bitLength;

      // Use new name, apply to selector.
      a.bitLength as<var b> + b;
      a.var b:bitLength + b; 
  }
}

@AKushWarrior

AKushWarrior commented Sep 18, 2020

@lrhn wrote, about suffix forms:

1 + foo.bar() as<var x>.baz(x) is very close

It's definitely an interesting variant of the proposal to use a suffix form.

@eernstg Where did he write that? On another issue?

@lrhn
Member

lrhn commented Sep 20, 2020

@tatumizer Fewer parentheses! Linear writing!

When you are writing and have already written 1 + foo.bar() and realize you need to name that (or cast it without naming it), it's much easier to continue writing as <Bar> or as <Bar bar> or as <var bar> instead of needing to go back and add an opening parenthesis before foo. For a long chain like (~(-foo.bar().baz().qux()).whatnot()).something() it gets increasingly hard to see where the prefix operators apply.

That's the reason we've been considering an inline cast to begin with, and are considering an inline await as well. Prefix operations simply do not work very well with long chains of selectors, and Dart otherwise encourages long chains.

@eernstg
Member Author

eernstg commented Sep 21, 2020

@AKushWarrior wrote (about 1 + foo.bar() as<var x>.baz(x) is very close):

Where did he write that?

#1210 (comment).

@lrhn wrote (about why a suffix form would be desirable):

Fewer parentheses! Linear writing!
When you are writing and have already written 1 + foo.bar() and realize
you need to name that (or cast it without naming it), it's much easier to
continue writing

I'd usually give a higher priority to readability than writability, because it's likely that code needs to be understood more frequently than it is modified, and for that it seems useful to announce the variable introduction just before the name of the variable:

  a.bitLength as<var> + bitLength; // Search back to see which `var` that was.
  a.var:bitLength + bitLength; // Aha, we're creating a variable named `bitLength`.

I think the most tricky part is the readability of the verbose cases where it makes a selector much bigger than usual:

  a.bitLength as<var newNameForBitLength> + newNameForBitLength;
  a.var newNameForBitLength:bitLength + newNameForBitLength; 

It's worth considering a rewrite to use a normal <localVariableDeclaration> in such verbose cases.

I think the most important benefit that the suffix form brings is that it easily allows us to include any desired number of non-identifier selectors:

  a.foo(16)<int>(true) as<var x>.bar + x;
  (var x: a.foo(16)<int>(true)).bar + x;

@lrhn
Member

lrhn commented Nov 25, 2021

Unreadable code is unreadable. Having side effects inside a print is already questionable.

Same issue could be argued for:

class C {
  int? i;
  void foo() {
    var veryLongAndNotParticularlyInterestingNameForCompleteness = 42, andMore = 37, i = 1;
    // ...
    // Lots of stuff.
    // ...
    var j = i + 1; // Would everybody know that `i` is not `this.i` here?
  }
}

which is unreadable without using any fancy features.

I guess the point is that in some places, one use of a variable is a logical continuation of another, and we want to extend the scope to also cover the following use, and in other cases it's just incidental, and we don't want to extend the scope too much, unnecessarily.

We want the scope to be predictable.

(The current Dart variable declaration scope is "in scope in the current block, can only use after declaration". That's predictable, even if it's occasionally annoying. Since statements dominate the rest of their block, there is no question about what is "after": syntactically after the declaration, including after its initializer expression, is the same as semantically after.)

In this case, the scope of the variable is the current, nearest enclosing, statement (or declaration, I guess, for var sqry = (var:y)*y;, if we want to allow that), and you can only refer to the variable after its declaration (for some definition of "after").

(For declarations, we don't want to wrap them in a new scope, not if that means var x = 42; becomes {var x = 42;} and thereby useless. So, if it has a scope, it has to be a special scope, like a "binding-expression-scope", which a normal declaration can recognize and say that it introduces its declared name into the parent scope, while allowing binding expressions to introduce variables local to the declaration.

Can you do:

  var y = this.var:y, sqry = y * y;

?)

Can you write (test ? (var:x) : 0) + x. The second use is syntactically after the declaration, but not necessarily dominated by it. Is it simply not definitely assigned (and since it's final, it's also not assignable, so it won't ever be useful)?

@eernstg
Member Author

eernstg commented Nov 25, 2021

Unreadable code is unreadable. ...
We want the scope to be predictable.

+100!

I think the keyword var at the beginning of the line with var veryLong... serves as a clearly visible reminder that we're extending the current namespace, and the fact that the line extends beyond the margin (on this webpage, anyway) indicates that we just have to scroll right in order to know what's going on.

My conclusion is that we need to protect ourselves from variables like the one in the binding expression I mentioned (print(... var:i ...);) because they are easy to miss, but we can live just fine with the current local variable declarations because we know what they do to the namespace.

Inside an expression statement it's more manageable, because we would presumably read all of it in order to understand it, or we'd just skip over it because we know/think that we don't have to understand it right now. In any case, the new variable which is in scope inside the expression statement causes no particular dangers.

When it comes to declarations,

var sqry = (var:y)*y;

we could make it an error to have a binding expression at all. Just use this instead:

var y = this.y, sqry = y*y;

The point is that a local variable declaration already allows us to introduce additional variables as needed.

Alternatively, the binding expression variable could have a scope which is limited to the local variable declaration itself. This could be helpful if the variable is only created because it's needed in that declaration itself, and it's simply namespace pollution to allow it to be in scope after the declaration. In that case we'd specify the effect on the scoping structure directly (rather than giving a syntactic sugar based specification). But that's a known technique already (cf. initializing formal parameters of generative constructors), so that's not a big problem per se.

About reachability and null shorting:

(test ? (var:x) : 0) + x

I'd just stick to the rules that I've already proposed, that is, it is a compile-time error to evaluate a variable which is introduced by a binding expression unless that binding expression is guaranteed to have been executed. So the x at the end is an error.
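Current Dart already applies a comparable rule to final locals that are declared without an initializer, which may help to see why the trailing x is rejected. The following is a sketch of that existing behavior, not of the proposal itself. The function name sumTwice is hypothetical.

```dart
// Hypothetical example of definite assignment in current Dart: `x` may be
// read only where every path leading to the read has assigned it.
int sumTwice(bool test, int y) {
  final int x;
  if (test) {
    x = y;
  } else {
    x = 0; // without this branch, reading `x` below is a compile-time error
  }
  return x + x;
}

void main() {
  print(sumTwice(true, 3)); // prints 6
  print(sumTwice(false, 3)); // prints 0
}
```

In (test ? (var:x) : 0) + x only one branch of the conditional introduces x, so the final read corresponds to the rejected case where the else branch is missing.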

[Edit: Correction, I did mention various null shorting constructs, but I did not mention the conditional expression, &&, and so on; I added that to the proposal now. Also added a rule about local variable declarations, enabling binding expressions with a scope that only includes that declaration.]

@jodinathan

@tatumizer yeah, renaming variables in nested accessors would not be so easy to read. However, I think that would be used by experienced Dart developers in very specific scenarios, so I wouldn't spend energy on it.

@Levi-Lesches

Reminder to the thread that these are supposed to help readability by removing unneeded lines. If the effect ends up being that programmers condense multiple lines of declarations into one print statement, that can be seen as a good argument against this feature. (Before anyone brings up ;, I'll point out that ; has a purpose besides being abused to squeeze lines together. @'s only purpose is to combine lines together, and it's up to every reader to decide what's unreadable.)

@Levi-Lesches

I wouldn't mind seeing @ in a postfix position, assuming there were reasonable rules around parentheses.

@jodinathan

jodinathan commented Dec 6, 2021 via email

@jodinathan

The thing is that those kinds of if-declarations work, but they are very bothersome to work with.
For example:

// I am checking prop
if (obj.prop != null) {
}

// at this moment I think "I have to declare prop to use it"

// with the @ binding expression I have to type one letter and I can already move on with it:
if (obj.@prop != null) {
  use(prop);
}

// I basically didn't have to change anything in my if clause at all.

// now with most other syntax I have to basically remake the if entirely:
if ((final prop = obj.prop) != null) {
  use(prop);
}

IMHO I would go for @ binding expression and maybe as a postfix as you suggested.
Of all the alternatives, it is by far the simplest and most intuitive, and it makes the code pretty.

@jodinathan

your own suggestion is much better to read and type than the alternatives:

if (properties["firstName"]@name != null) use(name);

again:

// I am here
if (properties["firstName"] != null)

// I think "I have to declare a variable pointing to this index so I can use it"

// easily add 5 continuous chars to the if without changing its structure at all: 
if (properties["firstName"]@name != null) use(name);

// rebuild the whole if from the start to declare the same name variable:
if ((final name = properties["firstName"]) != null) use(name);

with most expressions

Really? Are array accessors and other types of promotion more common than property promotion?

@jodinathan

jodinathan commented Dec 6, 2021

Yeah, we have all been collecting the possibilities for quite some time, but I think I am sold right now if no other argument against the postfix @ appears.

I always argued that a postfix solution is better for readability and also makes it easier to build the logic in our brains:
if (obj.prop != null use prop)
Now read my pseudo code a few comments earlier: // at this moment I think "I have to declare prop to use it"
The sentence in English is pretty much equivalent to the code regarding the variable declaration, so a postfix solution fits how we build the logic.

However, then came the @ notation and it is so simple to use that it wins over the postfix logic above.

And now you suggested a postfix @ that combines many advantages over the alternatives:

  • @ is a single character that you type to get a full-blown variable
    declared in the simplest cases: if (obj.prop@ != null) use(prop)
  • The @ stands out, so when a fresh developer reads the code they will at least get curious about the @ there
  • Naming index accessors is easy: if (foo.list[0]@val !=null) use (val);
  • Renaming is easy: if (foo.bar.baz@b != null) use(b);
  • Naming function results is easy: if (foo.func(0,1)@result != null) use(result);
  • It is a postfix solution, which is exactly how the flow of logic happens: "if this prop is greater than 5, use prop"
  • Coloring in the IDE should be easy. I can see the variables declared using @ having a different color than the ones declared with final and var, making them stand out even more. IMO this is very nice.
  • Easier to implement. This is a wild guess but the if-variables and the original binding expression seem harder to implement.
  • It will be unique to Dart. I don't know that many languages, and most of them are based on C, but among the ones I know I've never seen anything close to this ease and succinctness in declaring a variable

@Levi-Lesches

Coloring in the IDE should be easy.

Was going to comment, having @identifier come after an expression may be confused with an annotation (or macro once those come out). But the idea can still stand with another symbol like # (which I particularly like because it represents symbols, saying "this is the identifier I want to use in my code").

@leafpetersen
Member

The obvious answer is this: Declaration expression is a declaration that returns a value (like any expression).
Then we have if ((var b = foo.bar) != null) use (b);

@tatumizer I'm not sure if this was implicit in your comment, but @lrhn has a proposal for that syntax here. I have a fairly long comment in that thread with my objections to both of these proposals (though restricting the scope to a single statement as @eernstg proposes above addresses some but not all of my concerns).

@leafpetersen
Member

Maybe you can discuss it internally, ask more people (e.g. the devs from the flutter team), etc? I don't know how to handle it more objectively.

(The scopes are a separate topic, orthogonal to the syntax of the proposals IMO)

The scopes are the heart of these two proposals. We have discussed them internally, and I think my position is well summarized here and here .

If you remove the expression level scoping, then you're basically left with a potential alternative syntax for the if-variable proposal, which perhaps is worth considering? My main concerns around that proposal are readability (and whether it actually moves the needle enough) so maybe there's something to consider there. I don't find the @ syntax very readable, but at least it signals very loudly that something is happening.

@jodinathan

I don't find the @ syntax very readable, but at least it signals very loudly that something is happening.

I guess nothing will be as readable as a full, regular declaration or at least something very verbose/bothersome.

The balance here is to find a syntax that is easy to write, easy to read (when you know the feature) and not bothersome/verbose when you need to use it.

We are in a major refactoring for NNBD, and when we have to declare a variable for a null check or type promotion, it can break our current focus, so we have to read some of the code again to get back to where we were.
Reading the proposals I am not sure if they will prevent this from happening.

Be it @ or another feature like shadowing, we need something easy and fast enough that does not disrupt our current focus and also makes the code safe/optimized.

@eernstg
Member Author

eernstg commented Dec 7, 2021

@tatumizer wrote:

According to the spec, selector may also have a form .ident, but my
impression of [email protected] is that the new variable "prop" corresponds
to obj.prop rather than obj.prop.b.

The @ would be used to introduce the new name: If we have obj.prop then we can introduce a new variable named prop by using obj.@prop, and we could introduce a variable named x using obj.@x:prop. This is all for selectors of the form '.' <identifier>. In both cases the new variable would be initialized to have the value of obj.prop.

With a selector of the form '[' <expression> ']' we would not have the colon. So with the starting point myList[5], we could have myList@fifthElement[5], introducing the variable fifthElement with the value myList[5]. We would not allow the anonymous form, because there is no identifier to get the name from (so myList@[5] is a syntax error).

My original proposal was more restricted: It did not allow for the bindings of binding expressions to occur at every kind of selector, but with @ we could actually allow that. So:

The selector <typeArguments> would allow us to snapshot a tear-off: e.method<int>(42) can give rise to e.method@function<int>@result(42) which will snapshot a tear-off of e.method instantiated with <int> into the new variable function, and also the result of the entire method call into result.

So the colon is only used with a selector of the form '.' <identifier> or '?.' <identifier>, or in a cascade, and only if we want to use a different name for the new variable than that identifier, and the new variable name occurs before the colon. So the "thing" that introduces the new variable has the shape '@' or '@' <identifier> ':'.

In any case, the '@' or '@' <identifier> part should presumably be highlighted in IDEs, such that it is easy to detect the introduction of a new variable.

Of course, any mechanism can be used to write line noise, and e.method@function<int>@result(42) may be a pretty good example of that, but I think it could be used judiciously as well:

void main() {
  // Basic, straightforward approach. Using a block to avoid polluting
  // the rest of the function body with variables that are only used here.
  {
    var r = expressionThatYieldsTheReceiver;
    var v1 = r.method1(42);
    var v2 = r.method2<int>(v1);
    r.getter3[v2].method3();
  }

  // Same thing using binding expressions, only allowing identifier selectors.
  expressionThatYieldsTheReceiver
    ..@v1:method1(42)
    ..@v2:method2<int>(v1)
    ..getter3[v2].method3();

  // Same thing using binding expressions, allowing all selectors.
  expressionThatYieldsTheReceiver
    ..method1@v1(42)
    ..method2<int>@v2(v1)
    ..getter3[v2].method3();
}

The variant that allows arbitrary selectors does have more expressive power, but I think it is considerably less readable.

So I tend to prefer the (original) variant that binds the new variable to the result of the entire selector chain up to the next '.' <identifier> or '?.' <identifier>.

@eernstg
Member Author

eernstg commented Dec 7, 2021

@Levi-Lesches wrote:

@'s only purpose is to combine lines together

I'm not quite sure about that, I think it plays a significant role that binding expressions allow us to snapshot an intermediate result in a computation. We would read that kind of code by looking at the expression as a whole, and then check out which intermediate results are being snapshotted.

class Link {
  final int value;
  Link? next;
  Link(this.value, [this.next]);
  @override
  String toString() => '$value -> ${next?.toString() ?? 'nil'}';
}

void main() {
  final Link? link = Link(1, Link(2, Link(3)));
  // ... stuff ...
  
  // Swap successors, straightforward approach.
  if (link != null) {
    var next1 = link.next;
    if (next1 != null) {
      var next2 = next1.next;
      if (next2 != null) {
        link.next = next2;
        next1.next = next2.next;
        next2.next = next1;
      }
    }
  }

  // Same thing using binding expressions.
  if (link?.@next1:next?.@next2:next != null) {
    link.next = next2;
    next1.next = next2.next;
    next2.next = next1;
  }
}

We would need to ensure that the flow analysis recognizes that the getter invocation .next occurred both times (such that next1 and next2 are known to have been initialized), and the result was non-null (cf. #1224), but with that in place I think it's about more than just saving lines.

@eernstg
Member Author

eernstg commented Dec 7, 2021

@tatumizer wrote:

What will happen if we move @ from prefix to postfix position

That's a very interesting idea! ... an important consequence, as I see it, is that this allows us to read the code as "compute this expression, then store the result in this new variable".

The anonymous form was also mentioned several times (obj.prop@ would bind the new variable prop to the value of obj.prop, and myList[5]@ would be a compile-time error), and I think it would be relatively easy to get used to that as well.

The example from the previous comment would then be like this:

void main() { ...
  if (link?.next@next1?.next@next2 != null) {
    link.next = next2;
    next1.next = next2.next;
    next2.next = next1;
  }
}

Later, @tatumizer wrote:

The problem with @ is that addresses a very narrow set of expressions

I guess this was aimed at the prefix form. I don't see why that syntax couldn't allow for the basic form @name: expression (with the var form that I proposed originally, this would be var name: expression).

I did restrict the expression to be a <primary> <selector>* in order to allow var:e is T and var name:e is T as well as null equality tests without parentheses. My underlying assumption was that a rather large proportion of the expressions used for snapshotting and promotion would have this form. In any case, with parentheses it allows for an arbitrary expression.

if (var p: properties["firstName"] != null) use(p); // or ..
if (@p: properties["firstName"] != null) use(p);

We could of course very well have the prefix form for expressions in general (that is, it can only occur at the top level of the expression, and the expression would then presumably be a fully general <expression>), and then the postfix form would only apply to expressions of the form <primary> <selector>* (but note that we can use (e)@v where e is an arbitrary expression, because (e) is a <primary>).

But even though the postfix form sort of hides the name of the new variable pretty far to the right, I think it works quite well, visually:

if (properties["firstName"]@p != null) use(p);

@lrhn
Member

lrhn commented Dec 10, 2021

Also compare to #1216 (inline prefix operators). It suggests allowing prefix operators inline in selector lists, binding to the next "selector", so foo.!isEmpty means !(foo.isEmpty).

If this really was an instance of #1216, we would also allow prefix var:expression and var x:expression. The latter would just be an alias for var x = expression, so it's not needed; the former is useful when the scope allows it.

That proposal has (IMO) a problem with delimiting the effect. Take: foo.-bar(baz). Does that mean (-(foo.bar))(baz) or -(foo.bar(baz))? If either, why?
And in the presence of

extension <R extends num, T> on R Function(T) { 
  operator -() => (arg) => -(this(arg)); 
}

either can be correct.

This proposal has the same issue: How far ahead does the name bind?

The example:

s.var:substring(7).toUpperCase();

suggests that substring is bound to the value of s.substring(7). Why is the (7) included? Are method invocations special-cased? Would it make a difference if substring was a function-typed getter? What if you write o.var:foo(1)(2), is the second argument list, (2), included as well? If not, why not?

What about s.var:foo[1](1)!? Where is the limit?

Postfix variable binding would be better, then it would be a selector, and it always refers to everything before. Syntax would be a problem. Maybe s.foo(1).var foo.baz or s.foo(10).var.baz - it's the same thing, the .var. declaration latches onto the most recent identifier selector. Not great, the space breaks up the selector chain, but s.foo().var:x.baz is also easy to misread. So, @tatumizer's s.foo(10)@.baz works. I just don't particularly like the @ character. Doesn't say "binding" to me.

The example if (var:i != null) sees the i from the surrounding scope, even though the i being declared shadows it everywhere else in the current scope. That might be confusing, but it's probably well-defined. The special casing does feel a little icky.
We could just say that local variables do not exist prior to their declaration, so var x = x; works everywhere. Would be a non-breaking change since it's currently a compile-time error to refer to local variables prior to their declaration. If we can do what this proposal suggests, then we can say everywhere that if x refers to a local variable prior to its declaration, instead of making it a compile-time error, we look up x in the outer scope instead.

@jodinathan

I just don't particularly like the @ character. Doesn't say "binding" to me.

We could use # as @Levi-Lesches suggested, ie:

if (obj.prop# != null) use(prop);

if (s.foo(10)#baz is Foo) use(baz);

void main() { ...
  if (link?.next#next1?.next#next2 != null) {
    link.next = next2;
    next1.next = next2.next;
    next2.next = next1;
  }
}

@eernstg
Member Author

eernstg commented Dec 11, 2021

Postfix @ Proposal

Here is an adjustment of the proposal to use postfix @. It turns out to be considerably simpler than the var proposal:

[Edits: Jan 4 2022: '@' <identifier> at the end of an <assignableExpression> is an error.]

Examples and Motivation

Binding expressions introduce a new kind of selector of the form '@' <identifier>. The effect of e@id where e is an expression of the form <primary> <selector>* is that a new final variable id is declared, and it is initialized with the value of e. The variable is in scope in the nearest enclosing statement (either an expression statement, or a composite statement like a loop or an if-statement).

A good way to read e@id is "evaluate e and store the result at id".

The binding expression may omit the name of the variable in the case where it occurs immediately after an identifier id. In this case the variable gets the name id.

This means that new variables can be introduced very concisely, and they can reuse an existing name, e.g., obj.prop@ where the new variable gets the name prop. This name is presumably meaningful in the context, because it is already used as the name of the getter whose value is being cached in this new variable.

A binding expression can be used to "snapshot" the value of a subexpression of a given expression:

void main() {
  int i = 42;
  // '42 has 6 bits':
  print('$i has ${i.bitLength@} bit${bitLength != 1? 's':''}');
  print('$i has ${i.bitLength@len} bit${len != 1? 's':''}');
}

The construct i.bitLength@ works as a sequence of two selectors on i. Evaluation invokes the getter bitLength and stores the value of i.bitLength in a new, final, local variable named bitLength. The construct i.bitLength@len gives the new variable the name len, and otherwise works the same. This works for method invocations as well:

void main() {
  var s = "Hello, world!";
  var s2 = s.substring(7)@sub.toUpperCase() + sub;
  print(s2); // 'WORLD!world!'.
}
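For comparison, the two snapshot examples above can be approximated in today's Dart with explicit final locals. This is only a sketch of the intended meaning, not the specified semantics of the proposal; the helper names describeBits and shoutAndEcho are made up for illustration.

```dart
// Hypothetical helpers: each final local plays the role of the variable
// that a binding expression would introduce.
String describeBits(int i) {
  final len = i.bitLength; // Stands in for `i.bitLength@len`.
  return '$i has $len bit${len != 1 ? 's' : ''}';
}

String shoutAndEcho(String s) {
  final sub = s.substring(7); // Stands in for `s.substring(7)@sub`.
  return sub.toUpperCase() + sub;
}

void main() {
  print(describeBits(42)); // 42 has 6 bits
  print(shoutAndEcho('Hello, world!')); // WORLD!world!
}
```

The difference is that the locals here need their own declaration statements, whereas the binding expression introduces them in the middle of the expression that uses them.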

Binding expressions can bind the value of a complete expression to a new variable. In some cases this works simply by using @... as the first selector (e.g., i@newVar or i@), but it also works with arbitrary expressions, because (e) is a <primary> (so we can do (a + b.c(whatever)[14])@newVar).

class C {
  int? i = 42;

  void f1() {
    if (i@ != null) { // Snapshot instance variable `i`, create local `i`, scoped to the `if`.
      Expect.isTrue(i.isEven); // Local `i` is promoted by test.
      this.i = null; // Assignment to instance variable; `i = null` is an error.
      Expect.isTrue(i.isEven); // The local variable still has the value 42.
    } else {
      // Local `i` is in scope, with declared type `int?` from initialization.
      Expect.isTrue(i?.isEven);
    }
  }
}

The binding expression i@ introduces a new local variable named i into the scope of the if statement (the if statement is considered to be enclosed by a new scope, and the variable goes into that scope). It is an error to refer to that local variable before the binding expression, so if we have foo(e1@v1, e2@v2), e2 can refer to v1, but e1 cannot refer to v2.
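The snapshot-and-promote behavior of i@ can be approximated in current Dart by declaring a final local with the same name as the field. The sketch below assumes that reading; the return value is added only so that the effect can be observed, and the local is scoped to the whole method body rather than to the if statement, which is wider than what the proposal specifies.

```dart
class C {
  int? i = 42;

  // Returns what the promoted local observed (illustrative only).
  bool f1() {
    final i = this.i; // Plays the role of `i@`: snapshot the field.
    if (i != null) {
      this.i = null; // Updates the field; the final local is unaffected.
      return i.isEven; // Local `i` is promoted to `int` and is still 42.
    }
    return false;
  }
}

void main() {
  final c = C();
  print(c.f1()); // true
  print(c.i); // null
}
```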

A binding expression always introduces a final local variable. The main reason for this is that it is highly error prone to have a local variable whose name is the same as an instance variable, which serves as a proxy for the instance variable because it has the same value (at least initially), and then assignment to the local variable is interpreted to be an assignment to the instance variable.

A binding expression is intended to be a "small" syntactic construct. In particular, parentheses must be used whenever the given expression uses operators ((a + b)@x) or other constructs with low precedence. It is basically intended to snapshot the value of an expression of the form <primary> <selector>*, that is, a receiver with a chain of member invocations, or even smaller things like a single identifier.

In return for this high precedence, we get the effect that expressions like i@ is T or i@ as T parse such that the binding expression is i@, which is a useful construct because it introduces a variable i and possibly promotes it to T.

The conflict between two things named i is handled by a special lookup rule: For the construct i@, the fresh variable i is introduced into the current scope, and the initialization of that variable is done using the i which is looked up in the enclosing scope. In particular, if i is an instance, static, or global variable, i@ will snapshot its value and provide access to that value via the newly introduced local variable with the same name.

This implies several things: It is an error to create a new variable using a binding expression if the resulting name clashes with a declaration in the current scope (so 2@x + 2@x is an error). Also, in the presumably typical case where a binding expression introduces a name i for a local variable which is also the name of an instance variable, every access to the instance variable in that scope must use this.i. This may serve as a hint to readers: If a function uses this.i = 42 then it may be because the name i is introduced by a binding expression.

This proposal assumes the proposal that every composite statement should introduce a scope containing just that statement, and it extends that proposal to wrap every <expressionStatement>, <returnStatement>, and a few others, in a new scope as well. This implies that when a binding expression introduces a variable and it is added to the current scope, it will be scoped to the enclosing statement S, which may include nested statements if S is a composite statement like a loop. If a binding expression occurs in a composite statement then it will introduce a variable which is available in that whole composite statement (e.g., also the else branch of an if statement), but only there.

Grammar

The grammar is updated as follows:

<unconditionalAssignableSelector> ::= // Modified rule.
  '[' <expression> ']' |
  '.' <identifier> |
  '@' <identifier>?

Static Analysis

Every form of binding expression e introduces a final local variable into the current scope of e. Below we just specify the name and type of the variable; the scope and finality are implied.

This proposal includes the following change to the scoping rules: Each statement S is immediately enclosed in a new scope if S is derived from one of the following: <forStatement>, <ifStatement>, <whileStatement>, <doStatement>, <switchStatement>, <expressionStatement>, <returnStatement>, <assertStatement>, <yieldStatement>, <yieldEachStatement>. For instance print('Hello!'); is treated as { print('Hello!'); } and if (b) S1 else S2 is treated as { if (b) S1 else S2 }.
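The effect of the implicit scope can be imitated in current Dart by writing the block explicitly. The following sketch assumes a hypothetical helper lookup and uses an ordinary final local where the proposal would use lookup(key)@v in the condition.

```dart
// Hypothetical helper, used only to have something nullable to test.
int? lookup(String key) => key == 'a' ? 1 : null;

String describe(String key) {
  {
    // Stands in for `if (lookup(key)@v != null) ... else ...`.
    final v = lookup(key);
    if (v != null) {
      return 'found ${v + 1}'; // `v` is promoted to `int` here.
    } else {
      return 'missing (v is $v)'; // `v` is also in scope in the else branch.
    }
  }
  // `v` is not in scope after the block.
}

void main() {
  print(describe('a')); // found 2
  print(describe('b')); // missing (v is null)
}
```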

A local variable declaration D is treated such that variables introduced by a binding expression in D are in scope in that local variable declaration, and not outside D. This cannot be specified as a syntactic desugaring step, but it can be specified in a similar manner as the scoping for local variables introduced by an initializing formal parameter.

The main rationale for making the variable final in all cases is that it is highly error prone to save the value of an existing variable x (for instance, an instance variable) in a local variable whose name is also x, and then later assign a new value to x under the assumption that it is an update to that other variable. In this case the assignment must use this.x = e because the local variable x is in scope and is final.

If a variable x is introduced by a binding expression e then it is a compile-time error to refer to x in the current scope of e at any point which is textually before the end of e.

A binding expression of the form e1@x introduces a variable named x with the static type of e1 as its declared type.

A binding expression of the form e1@ is a compile-time error unless e1 is an expression of the form p s1 s2 .. sk where p is a <primary>, s1 .. sk are <selector>s, and sk is of the form '.' <identifier> or '?.' <identifier>. If there is no error then e1@ is treated as e1@id, where id is said identifier.

The x which is used to initialize the new variable in x@ is looked up in the enclosing scope, not the current scope.

It would be a compile-time error to look it up in the current scope (because that's a reference to the variable itself before its defining expression ends), and it is likely to be useful to look it up in the enclosing scope because the intended variable/getter would often be a non-local variable.

A compile-time error occurs if '@' <identifier> occurs as the last selector in an <assignableExpression>.

x.y@v = e; would mean the same thing as (x.y = e)@v, which is more readable.

Null shorting does not cause any particular difficulties. For example, a binding expression of the form e1?.m<types>(arguments)@x introduces a variable named x whose declared type is the static type of e1?.m<types>(arguments). Evaluation will bind x to null if e1 evaluates to null, which may again occur because of null shorting inside e1.
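As a rough illustration in current Dart, a final local initialized with a null-shorting expression behaves the same way: its declared type is the (nullable) static type of the whole expression, and it holds null when the receiver is null. The function name firstWordUpper is hypothetical.

```dart
String? firstWordUpper(String? s) {
  // Stands in for `s?.split(' ')@parts`: the declared type is the static
  // type of the whole null-shorting expression, here `List<String>?`.
  final parts = s?.split(' ');
  return parts == null ? null : parts.first.toUpperCase();
}

void main() {
  print(firstWordUpper('hello there')); // HELLO
  print(firstWordUpper(null)); // null
}
```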

Cascades are desugared into expressions using only let and <primary> <selector>*, and this determines the analysis and semantics of cascades containing binding selectors (that is, selectors of the form '@' <identifier>?).

In all these cases, the expression that provides the declared type of the newly introduced variable is known as the initializing expression for the variable.

In expressions with control flow (conditional expressions, logical operators like && and ||, and so on), it is a compile-time error to evaluate a variable v which has been introduced by a binding expression e unless e is guaranteed to have been evaluated at the location where the evaluation of v occurs.

Dynamic Semantics

A local variable x introduced into a scope by a binding expression e is initialized to the value of the initializing expression of x at the time where e is evaluated. If the initializing expression of x is not evaluated due to null shorting then x is initialized to null. (If the initializing expression is not evaluated due to other kinds of control flow then the value is unspecified, but this does not matter because it is a compile-time error to evaluate the variable).

@jodinathan

@eernstg would it be able to use it like this?

String myFunc() => 'hello there';

myMap[someIndex]@foo = myFunc();

print(foo.length);

This is nice as it avoids this scenario:

final foo = myMap[someIndex] = myFunc();

print(foo.length); // error, foo can be null

@eernstg
Member Author

eernstg commented Jan 4, 2022

With the newest scope rules it would actually not work:

... // Declarations as needed.

String myFunc() => 'hello there';

void main() {
  myMap[someIndex]@foo = myFunc();
  print(foo.length); // Error, unknown name `foo`.
}

The problem is that foo is out of scope as soon as we're out of the expression statement myMap[...]@foo = ...;.

The reason for adopting such rigid scoping rules is that it will easily get too hairy to read the code if we can introduce variables in the middle of an expression statement, and then have them in scope in the following statements, which would potentially be many lines of code. The only case where these variables are in scope in many statements is when they are introduced by a structural expression of a composite statement, that is: The condition of an if or while statement, the <forLoopParts> of a for statement, the scrutinee expression of a switch statement, and a few other places like that.

So you'll have to declare foo using a normal local variable declaration, in order to keep it alive across several statements that aren't the body of a composite statement. However, that seems to be just as concise as the above form, so why wouldn't we just do that?:

... // Declarations as needed.

String myFunc() => 'hello there';

void main() {
  var foo = myMap[someIndex] = myFunc(); // foo gets inferred type String.
  print(foo.length);
}

The binding expression could be useful when we're considering composite statements:

... // Declarations as needed.

String? myFunc() => 'hello there';

void main() {
  if (oneThing && anotherThing && myMap[someIndex]@foo = myFunc() != null) {
    print(foo.length);
  }
}

However, we might prefer the form (myMap[someIndex] = myFunc())@foo != null, because it contains foo != null as a substring, and that's also the semantics of the expression after the evaluation of the parenthesis and the initialization of the new variable foo. So we can read the whole thing as "evaluate the parenthesis, then store the result 'at' the new variable foo, then evaluate the expression where foo occurs".

Actually, I think @<identifier> at the end of an assignable expression should be an error. (Done ;-)

@jodinathan

oh, it was my bad.

Really had the thought that final foo = map[index] = func(); was going to throw that foo can be null. Weird O_o

But testing on Dartpad it worked =]

@eernstg
Member Author

eernstg commented Jan 4, 2022

That's because an assignment has the type of the assigned value, not the type of the assignment target. So with Object? o, o = 1 has type int rather than type Object? (and it has the value 1, so it's sound). Not that weird. ;-)
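A small runnable check of this point, as a sketch: the map type Map<int, String?> and the function name lengthViaAssignment are assumptions standing in for the "declarations as needed" in the earlier example.

```dart
String myFunc() => 'hello there';

int lengthViaAssignment(Map<int, String?> myMap, int someIndex) {
  // The assignment expression has the static type of its right-hand side,
  // `String`, even though the map's value type is `String?`.
  final foo = myMap[someIndex] = myFunc();
  return foo.length; // No null check needed.
}

void main() {
  final myMap = <int, String?>{};
  print(lengthViaAssignment(myMap, 0)); // 11
  print(myMap[0]); // hello there
}
```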

@jodinathan

@eernstg I know we talked about it before, however, being able to declare stuff in the scope would be very helpful in some cases.

take this code example:

someList.map((sub) => '''
    foo(${sub.name.camelCase});

    ${someMethod(sub.name.camelCase)}
''').join('\n')}

I don't want the code to call the getter name or the camelCase extension twice, so I would declare a variable beforehand.
However, in the current situation, I would have to stop using the lambda => '' and use a regular function body (sub) { return ; }

someList.map((sub) {
  final sn = sub.name.camelCase;

  return '''
    foo($sn);

    ${someMethod(sn)}
''';
}).join('\n')}

Rewriting with @:

someList.map((sub) => '''
    foo(${sub.name.camelCase@sn});

    ${someMethod(sn)}
''').join('\n')}

We usually feel this need when source generating and few other places.

@jodinathan

maybe this could be two different features?

if-binding-expressions

allow var declaration within a block of condition, ie:

if (foo.prop@ != null) {
  // prop exists inside if
}

inline-binding-expressions

allow var declaration within a parent of the expression it is being called, ie

someList.map((sub) => '''
    foo(${sub.name.camelCase@sn}); // sn exists within `map` and below this line

    ${someMethod(sn)}
''').join('\n')}

@eernstg
Member Author

eernstg commented Feb 21, 2022

@jodinathan wrote:

being able to declare stuff in the scope would be very helpful in some cases

The given example would actually just work: the body of a => function is an expression rather than a statement, so the innermost enclosing statement is the one that contains the method invocations on someList, and sn is indeed in scope at ${someMethod(sn)}:

someList.map((sub) => '''
    foo(${sub.name.camelCase@sn});

    ${someMethod(sn)}
''').join('\n')}
