explicitly return from blocks instead of last statement being expression value #629

Closed
thejoshwolfe opened this issue Nov 28, 2017 · 9 comments
Labels
accepted This proposal is planned. breaking Implementing this issue could cause existing code to no longer compile or have different behavior. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@thejoshwolfe

First proposed here: #346 (comment)

tldr: No longer allow omitting a semicolon after the last statement of a block to indicate a return expression. Allow labeled blocks and block-like expressions. Allow explicitly returning from a labeled block with a result expression.

Status quo

You can currently omit the final semicolon in a block to indicate that the block returns a value:

var x = {
    foo();
    bar() // the return value from bar() gets assigned into x
};
var y = {
    foo();
    bar();
    // y is {}, which is of type void
};

One problem with this is that there are two ways to return a value from a function:

fn abs(x: i32) -> i32 {
    if (x < 0) -x else x
}
fn abs(x: i32) -> i32 {
    return if (x < 0) -x else x;
}

Another problem is that this rule gets confusing when you have implicit semicolons after certain syntactic constructs: if, while, for, and comptime statements if they end with }, and switch statements and blocks (since they always end with }). So what happens when one of these implicit semicolon constructs is the last statement in a block?

var x = {
    if (foo()) {
        bar()
    } else {
        baz()
    } // no semicolon here
};
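
For comparison only, here is a sketch of the same trailing-expression rule in Rust, which has it too. This is an analogue, not Zig: `foo`, `bar`, and `baz` are hypothetical stand-ins, and the behavior shown is Rust's, which need not match what the Zig compiler did at the time.

```rust
// Hypothetical stand-ins for the functions used in the examples above.
fn foo() -> bool { true }
fn bar() -> i32 { 1 }
fn baz() -> i32 { 2 }

// No semicolon after the last expression: the block evaluates to bar().
fn block_value() -> i32 {
    let x = {
        foo();
        bar()
    };
    x
}

// A trailing semicolon turns the last expression into a statement, so the
// block evaluates to the unit value `()` (Rust's counterpart of void).
fn block_unit() {
    let y: () = {
        foo();
        bar();
    };
    y
}

// An if/else as the last item of a block: Rust takes it as the block's value.
fn if_as_last_item() -> i32 {
    let x = {
        if foo() {
            bar()
        } else {
            baz()
        }
    };
    x
}
```

The only difference between `block_value` and `block_unit` is one `;`, which is the kind of one-character semantic change this proposal wants to remove.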

Proposal

Returning values from blocks is done via labeled blocks and labeled return. For function bodies, this is as simple as using a normal return statement as it exists today. For any other block, the block must be labeled by prefixing the { with identifier:. Then give the block a value via return :identifier value from inside the block.

var x = initialization: {
    if (foo()) {
        return :initialization bar();
    } else {
        return :initialization baz();
    }
};
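
As a runnable analogue (not the proposed Zig syntax), Rust 1.65 and newer can express the same idea with a labeled block and `break 'label value`. The sketch below assumes hypothetical `bar` and `baz` helpers and passes the condition in as a parameter instead of calling `foo()`.

```rust
// Hypothetical stand-ins for the functions used in the proposal.
fn bar() -> i32 { 1 }
fn baz() -> i32 { 2 }

// `'initialization: { ... }` labels the block. `break 'initialization v`
// leaves the block and makes `v` its value, which plays the role of
// `return :initialization v` in the proposal.
fn init(foo: bool) -> i32 {
    let x = 'initialization: {
        if foo {
            break 'initialization bar();
        } else {
            break 'initialization baz();
        }
    };
    x
}
```

Every exit from the block names the label, so it is always explicit which block receives the value.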

The label name is scoped to the block. And since a variable name doesn't come into scope until after its declaration statement, there would be no conflict between these two uses of the name "x":

var x = x: {
    return :x bar();
};

But there would be a conflict here:

var x = bar();
x = x: { // ERROR: redeclaration of "x"
    return :x bar();
};

It's possible to simply "break" out of a block with return :label and no expression:

maybe_connect: {
    if (!is_configured) return :maybe_connect;
    if (!has_peer) return :maybe_connect;
    tryConnect();
}
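
The early-exit pattern above can also be sketched in Rust (again an analogue, not Zig), where `break 'label;` with no value leaves a labeled block. The `attempted` flag is a hypothetical stand-in for calling `tryConnect()`, and the two conditions are passed in as parameters.

```rust
// Returns true if the (hypothetical) connection attempt was made.
fn maybe_connect(is_configured: bool, has_peer: bool) -> bool {
    let mut attempted = false;
    'maybe_connect: {
        // Each early exit skips the rest of the block without a value.
        if !is_configured { break 'maybe_connect; }
        if !has_peer { break 'maybe_connect; }
        attempted = true; // stands in for tryConnect()
    }
    attempted
}
```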

If control flow reaches the end of a block, there is an implicit return :block with no expression. No expression is the same as a void expression.

var x = x: {
    if (foo()) return :x true; // return type is bool
    bar();
    // implicit return :x {}
    // ERROR: conflicting types bool and void
};
var x = x: {
    if (foo()) return :x {}; // return type is void
    // implicit return :x {}
    // Technically fine, although this is strange enough that it might be an error someday.
};

Only {blocks} can be labeled for this purpose. for and while loops can be labeled for #346, and if and switch cannot be labeled.

var x = x: if (foo()) { // ERROR: after label, expected `for`, `while`, or `{`
    return :x bar();
} else {
    return :x baz();
};

// this is all ok:
var x = if (foo()) x: {
    return :x bar();
} else x: {
    return :x baz();
};

You can't use labeled return on labeled loops, and you can't use labeled break or labeled continue on labeled blocks.

my_loop: while (true) {
    my_block: {
        return :my_loop; // ERROR: my_loop is not a block
        break :my_block; // ERROR: my_block is not a loop
        continue :my_block; // ERROR: my_block is not a loop
    }
}

thejoshwolfe added the breaking and proposal labels on Nov 28, 2017
andrewrk added the accepted label on Nov 28, 2017
andrewrk added this to the 0.2.0 milestone on Nov 28, 2017
@PavelVozenilek

Is it really necessary to name every such block? Inventing a new name requires a lot of mental effort, and it creates unnecessary entities.

The example above:

var x = initialization: {
    if (foo()) {
        return :initialization bar();
    } else {
        return :initialization baz();
    }
};

could be written as:

var x = {
    if (foo()) {
        return bar();
    } else {
        return baz();
    }
};

and there would be no ambiguity. The block is used as an expression, so the return would always mean "return from the block" (and not from the parent function).

@kyle-github

It seems like there are a couple of things here. There is a conflict between the desire to be DRY and the desire to follow the Zen of Zig and be explicit. I like the named returns because it is explicit.

If we want to follow the "be explicit" rule, then the implicit return of the last thing in the block seems like it violates that.

@thejoshwolfe, it seems like, for consistency, the colon should either always lead or always trail, in both declaration and use. You know from context whether you are declaring or using a label.

@PavelVozenilek, in your example above, I would have a hard time seeing the fact that the return bar(); was not returning from the function in which all that code was running. How can I tell the difference?

In GCC you can use statement-expressions which have a different syntax to denote them from regular blocks.

It seems like having blocks have values like this is sort of a half-way point to having local functions. That would also solve this. Or perhaps a similar notation for thunks.

@PavelVozenilek

@kyle-github

In your example above, I would have a hard time seeing the fact that the return bar(); was not returning from the function in which all that code was running. How can I tell the difference?

Return inside an expression block would always mean return the value from that block.

One would need to know that one is looking at an expression block, which is always indicated by a preceding unfinished assignment. (I cannot think of another use case right now.) An expression block can be perceived as an unnamed local function without parameters; then it fits naturally.

If there is a need to exit from the parent function inside the initialization code, then one would need to give up expression blocks and use "regularly structured" code. I dare to guess this would help readability.

The main advantage is that there is no need to invent a name for the block. For small blocks the name is redundant; in large blocks it would be lost among the code anyway.

@PavelVozenilek

PavelVozenilek commented Nov 29, 2017

Un-named expression block could be transformed to local named function with access to parent locals:

fn foo()
{
  var x1 : i32 = 1;
  var x2 = {  return 2 * x1; } //  "one shot" expression block

  const x3 = 10;
  const x4 = { return 20 * x3; } // expression block at compile time

  const x5  = fn { return 2 * x1; } // named local function, can be called repeatedly
  var x6 = x5();

  const x7 = fn { 20 * x3; } // named local function invoked in compile time
  const x8 = x7();

  // this would be either error or perhaps assignment of function pointer
  var x9 = fn { return 2 * x1; } 
  x9 = fn { return 10; } // reassigned
  var x10 = x9();
}

Local functions could be then used to keep error handling code in one place:

fn foo() 
{
  var f : FILE;
  f = fopen(...);
  if (!f) { error(); return; }
  ...
  if (some error) { error(); return; }

  // here I assume that definition of error() is visible from the beginning
  const error = fn { fclose(f); }  
}

@kyle-github

@PavelVozenilek, your comments about local functions are where I was pointing in my comment.

Right now the problem is that you need to look at the entire context around a { } block. It might not be an assignment. It could be a function argument. That is why I like how explicit the proposal is.

@PavelVozenilek

@kyle-github:

Perhaps the solution (to the need to quickly identify whether what I am looking at is expression block/local function or something else) is slightly different syntax. E.g.

// stands out visually, really fast to type
var x = {{{ return 10; }}}; 
// ala GCC  statement expressions, slower to type, doesn't stand out that much
var y = ({ return 10; }); 

(This new syntax could be optional. The traditional {...} could be for shorter blocks with no chance for confusion, {{{...}}} would be for complex blocks. Compiler may enforce this or may not.)

The names, IMHO, would only bring much more confusion. Some people will use x or _ everywhere, pedants would invent long descriptive names and conventions. All arbitrary decisions with very low usefulness but high potential for confusion.

Exaggerated example with named parameters:

fn foo(named x : i32) { ...}

foo(x = x: { return :x 10; });

vs

foo (x = {{{ return 10; }}});

@jido

jido commented Dec 1, 2017

For Pavel: it would be more consistent to have a default label for the current block rather than a different bracketing syntax. E.g.

foo(x = { return :block 10; });

For Josh: This is technically OK but not good style:

var x = if (foo()) x: {
    return :x bar();
} else x: {
    return :x baz();
};

In the documentation, you should use the style at the top of the proposal:

var x = initialization: {
    if (foo()) {
        ... // some processing
        return :initialization bar();
    } else {
        return :initialization baz();
    }
};

With a default block label there is no need for the enclosing "initialization:" block.

andrewrk added a commit that referenced this issue Dec 21, 2017
closes #346
closes #630

regression: translate-c can no longer translate switch statements.
after #629 we can resurrect and modify the code to utilize arbitrarily
returning from blocks.
@c-cube

c-cube commented Apr 15, 2020

@thejoshwolfe the window for syntax changes is closing, so I'm looking at this and I don't see much of a justification for the change. It'd be nice to see why blocks don't directly return expressions as they do in other languages, as it impacts a lot of other issues (#4294 #4412 #5042 etc.)

@ghost

ghost commented Apr 15, 2020

As @c-cube already stated, the second problem @thejoshwolfe listed should not actually be a problem:
I think the approach of "statements are just expressions returning void" would simplify the language dramatically. Seen this way, it is obvious that x should be set to the return value of bar() if foo() is true and to that of baz() if it isn't. If bar() and baz() return different types, it should be a compile error.

Given that, I think the pros to reverting this change and accepting #4294 would outweigh the cons:

Pros

  • Better readability for blocks as expressions
  • No need for the () for if/while/for/switch
  • { a } is semantically identical to a (easy to understand)
  • {} is forced with all its advantages listed in without sacrificing readability
  • switching between an expression and a block when e.g. adding a debug message is much easier

Cons

  • Semantics are changed by one character (;)

What makes the ; less of a problem is that type safety catches both a forgotten and a stray ; (e.g. an (expected int, got void) error when forgetting to remove the ;).
