RFC: label-break-value #2046

ciphergoth · 2017-06-26T20:26:41Z

Rendered

Tracking issue: rust-lang/rust#48594

Allow a break not only out of a loop, but also out of a labelled block with no loop. As with loop, this break can carry a value.
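A minimal sketch of what the proposal allows, assuming Rust 1.65 or later, where labelled blocks are stable. The function kind_of and its classification rules are made up for illustration and do not come from the RFC text.

```rust
// Hypothetical example: classify a string using a labelled block.
// `break 'kind value` leaves the block early and makes `value` its result.
fn kind_of(input: &str) -> &'static str {
    'kind: {
        if input.is_empty() {
            break 'kind "empty";
        }
        if input.chars().all(|c| c.is_ascii_digit()) {
            break 'kind "number";
        }
        // Falling off the end yields the tail expression, as with any block.
        "text"
    }
}
```

Calling kind_of("42") takes the second early exit, while kind_of("4x") falls through to the tail expression.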

See also Pre-RFC discussion.

Status as of 2018-01-18.

mark-i-m · 2017-06-27T01:29:41Z

This feels too much like a goto. I don't really want Rust == Fortran 😛

scottmcm · 2017-06-27T01:58:02Z

@mark-i-m Is there something particular about this proposal that makes it feel too goto-like?

This seems to have the same restrictions as the labeled break we already have, as well as normal break and return: cannot go backwards, cannot skip variable definitions, can only exit a syntactically-visible {} block. (Or, more abstractly, that it doesn't violate the "progress of the process remains characterized by a single textual index" property from Go To Considered Harmful.)

eddyb · 2017-06-27T01:59:28Z

The problems with goto are more or less those of safety. Rust doesn't have the same issues.
Besides you can already do this with loop { break {...} }, the difference is purely in ergonomics.
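To make the ergonomics point concrete, here is a sketch of the two spellings side by side, assuming Rust 1.65 or later for the labelled-block form. The function names and the classification logic are hypothetical.

```rust
// The "loop hack": a loop whose body never runs a second time,
// used only so that `break value` is available.
fn classify_with_loop(n: i32) -> &'static str {
    loop {
        if n < 0 {
            break "negative";
        }
        if n == 0 {
            break "zero";
        }
        // The final value also has to be delivered through `break`.
        break "positive";
    }
}

// The same control flow with a labelled block: no fake loop,
// and the final value is an ordinary tail expression.
fn classify_with_block(n: i32) -> &'static str {
    'done: {
        if n < 0 {
            break 'done "negative";
        }
        if n == 0 {
            break 'done "zero";
        }
        "positive"
    }
}
```

Both functions return the same result for every input; the difference is only in how the intent reads.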

mark-i-m · 2017-06-27T02:10:46Z

that it doesn't violate the "progress of the process remains characterized by a single textual index" property from Go To Considered Harmful

This is a fair point, but I feel more strongly, I guess.

I personally don't like the labeled breaks we already have. IMHO, they usually just make control flow harder to follow/write/debug without really providing any benefit, which is the heart of Dijkstra's point. In fact, I would go so far as to say that I generally dislike normal break/continue, too, but I accept them for lack of a better alternative. The way I see it, the more entry/exit points you have for a loop, the worse code quality is -- it just becomes convoluted to reason about loop invariants.

mark-i-m · 2017-06-27T02:11:56Z

I should add that by breaking out of the middle of a block, you have effectively made it into 2 blocks, one of which doesn't always happen. And that is not always syntactically obvious, IMHO...

ciphergoth · 2017-06-27T03:30:18Z

Full text of Go To Statement Considered Harmful - very interesting!

ciphergoth · 2017-06-27T04:03:23Z

I filed a Pre-RFC about this, see also discussion there.

link to discussions in discussions instead.

ciphergoth · 2017-06-27T04:05:22Z

Other discussions: I proposed this here. An identical proposal was part of the explanation for trait based exception handling.

golddranks · 2017-06-27T04:53:16Z

I think that making control flow more flexible is generally a good thing. As already argued by others Dijkstra's critique doesn't apply here; the critique is against obfuscating the program state using surprising control paths. In the context of this feature it's only allowed to break outwards from a scope, which doesn't allow witnessing uninitialised variables or skipping into the middle of some code that expects there to be a state set up by the earlier code. It does allow skipping some code in a possibly "surprising" way, in the sense that one can jump from an inner scope to the grandparent scope, but unlike exceptions, this is still a local feature that is well visible in the local context – so in the end, it is hardly surprising, and when used with discretion, it can lead to cleaner code.

I'd argue that without flexible and safe control flow constructs, people tend to store the "which path" information in local flags or use inner functions to be able to return early. Both feel like hacks to me. The problem with manually juggling control flow flags is that the compiler can hardly reason about their state, and the problem with functions is that they are the wrong abstraction – they come with a new stack frame and aren't easily able to access the parent frame unless the state is explicitly passed to them. They are too heavyweight. Having labelled breaks is a nice way to retain the "state machine-y" feeling of function-local control flow but still allow a more flexible flow, which comes in handy from time to time.
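A small sketch of the flag pattern described above next to a labelled-block version, assuming Rust 1.65 or later (for both labelled blocks and let-else). The "port=" line format, the default of 8080, and the function names are invented for illustration.

```rust
// Flag-based version: a `valid` flag records which path was taken,
// and the compiler cannot tell that `port` is only meaningful when it is set.
fn port_with_flag(line: &str) -> u16 {
    let mut port = 0;
    let mut valid = false;
    if let Some(rest) = line.strip_prefix("port=") {
        if let Ok(p) = rest.trim().parse::<u16>() {
            if p >= 1024 {
                port = p;
                valid = true;
            }
        }
    }
    if valid { port } else { 8080 }
}

// Labelled-block version: each failed check leaves the block with the default,
// so no flag and no placeholder value are needed.
fn port_with_block(line: &str) -> u16 {
    'parse: {
        let Some(rest) = line.strip_prefix("port=") else {
            break 'parse 8080;
        };
        let Ok(port) = rest.trim().parse::<u16>() else {
            break 'parse 8080;
        };
        if port < 1024 {
            break 'parse 8080;
        }
        port
    }
}
```

The block version keeps everything in one stack frame, which is the property the comment contrasts with inner functions.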

eddyb · 2017-06-27T08:46:59Z

@ciphergoth In fact my first instinct for desugaring catch is that there's a label on it (or at least the compiler has a way to refer to it) and ? uses break 'innermost_catch err instead of return err.
I haven't looked much into how it ended up being implemented but AFAIK it's close enough.
The advantage of a proposal like this is that you can go to one of many labels instead of just one.

ciphergoth · 2017-06-27T13:54:12Z

@eddyb The RFC that proposes catch invents pretty much exactly what I propose here in order to describe what it does. It also describes return in terms of a break to a special 'fn scope for the whole function.

In the pre-RFC discussion, nikomatsakis says:

The compiler already internally supports [labelled blocks] for use with catch { } (that is how the HIR represents catch).

mark-i-m · 2017-06-28T01:42:27Z

@golddranks

As already argued by others Dijkstra's critique doesn't apply here; the critique is against obfuscating the program state using surprising control paths.

Perhaps I will always just disagree on this... I suspect I am probably more extreme than most on this point. It looks like I am pretty outnumbered here, so I won't spam everyone more beyond this post, unless asked for more 😛

Basically, I can't imagine many useful situations where this is easier to follow

'block: {
    do_thing();
    if condition_not_met() {
        break 'block;
    }
    do_next_thing();
    if condition_not_met() {
        break 'block;
    }
    do_last_thing();
}

than this

do_thing();

if condition_met() {
    do_next_thing();
    if condition_met() {
        do_last_thing();
    }
}

In the first example it's not clear that the precondition for do_last_thing is that both condition checks pass. It is also a bit annoying that syntactically non-obvious prefixes of a block might execute.

But in the second one, the curly braces (and formatting conventions) make it clear, which is what we expect because curly braces are the primary way Rust indicates a block of code. Of course someone will argue that if you have 50 of them, you will have too much indenting. I think it's worth it, but that's really a matter of taste.

est31 · 2017-06-28T13:15:40Z

@mark-i-m In my use case, I need to conditionally break from inside a deeply nested structure: the eno! invocation is 4 layers inside the structure, with loops and ifs outside. This can't simply be refactored to use if.

Also, I especially like the pattern to break early if some condition is not met, so that there is no big rightward drift.

mark-i-m · 2017-06-29T00:57:02Z

@est31 I'll take your word for it that it is hard to refactor (I haven't tried). I guess I can see the use case, but I still don't really like the break as a pattern... although, I don't have an alternative, other than major refactoring...

scottmcm · 2017-06-29T06:19:05Z

@mark-i-m I actually agree with you for that example. Overall, though, I suspect that people won't reach for this except in cases where if and match are awkward too, so it doesn't scare me to have it.

The example I found quite compelling was this one:

let result = 'block: {
    for &v in first_container.iter() {
        if v > 0 { break 'block v; }
    }
    for &v in second_container.iter() {
        if v < 0 { break 'block v; }
    }
    0
};

Because a simple translation to normal constructs ends up something like this:

let mut result = None;
for &v in first_container.iter() {
    if v > 0 {
        result = Some(v);
        break;
    }
}
if result.is_none() {
    for &v in second_container.iter() {
        if v < 0 {
            result = Some(v);
            break;
        }
    }
}
let result = result.unwrap_or(0);

Which I find more awkward, as it obscures the symmetry and has the compiler less able to help with the initialization logic. (You could also do this one with the {||{ … }}() "operator" and return, but I'm not a fan of immediately-called-closure as a pattern either.)
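For comparison, a sketch of the labelled-block form and the immediately-called-closure form mentioned above, wrapped in functions so they can be run. It assumes Rust 1.65 or later; the function names and the slice parameters are illustrative stand-ins for first_container and second_container.

```rust
// Labelled-block form: `break 'block v` exits both the `for` loop and the block.
fn pick_with_block(first: &[i32], second: &[i32]) -> i32 {
    'block: {
        for &v in first {
            if v > 0 {
                break 'block v;
            }
        }
        for &v in second {
            if v < 0 {
                break 'block v;
            }
        }
        0
    }
}

// Immediately-called-closure form: `return` plays the role of the labelled break.
fn pick_with_closure(first: &[i32], second: &[i32]) -> i32 {
    (|| {
        for &v in first {
            if v > 0 {
                return v;
            }
        }
        for &v in second {
            if v < 0 {
                return v;
            }
        }
        0
    })()
}
```

One practical difference is that inside the closure, return no longer returns from the enclosing function and ? no longer propagates to it, whereas the labelled block leaves both untouched.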

ciphergoth · 2017-06-29T19:56:18Z

Should I be changing the examples in the RFC to reflect discussion here?

mark-i-m · 2017-06-30T00:54:39Z

I think it would be useful to include some of the motivating examples. And I would also like to see the disadvantages section updated, with some of the objections, even if the language is not strongly worded...

Various other small improvements

ciphergoth · 2017-07-01T18:05:38Z

For the specific example given you could also do this:

first_container.iter().filter(|&&v| v > 0).chain(
        second_container.iter().filter(|&&v| v < 0)
    ).map(|&v| v).next().unwrap_or(0);
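The same chain wrapped in a function so it can be exercised, as a sketch. The function name and slice parameters are assumptions, and .copied() is used here in place of .map(|&v| v), which does the same thing for i32.

```rust
// Iterator version: the first positive value of `first`, otherwise the first
// negative value of `second`, otherwise 0.
fn pick_with_iterators(first: &[i32], second: &[i32]) -> i32 {
    first
        .iter()
        .filter(|&&v| v > 0)
        .chain(second.iter().filter(|&&v| v < 0))
        .copied()
        .next()
        .unwrap_or(0)
}
```

Because the adaptors are lazy, second is only scanned if first yields no positive value, which matches the early-exit behaviour of the loop versions.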

mark-i-m · 2017-07-01T18:35:00Z

I won't lie. I'm pretty sure most of these examples can be expressed more elegantly with some refactoring, but it might take some effort...

Ericson2314 · 2017-07-03T11:56:19Z

Big fan of this; thanks for writing!

One nit: while the desugaring is correct, I don't think that in the long term we should implement/teach/think about the feature that way. I'd rather have:

Break on block is fundamental
Breaking out of loop is no different than breaking out of underlying block
Break with no label is desugared to innermost loop instead of innermost block for historical reasons.

RFC language can be incremental I suppose, but I hope something like that makes the books.

ciphergoth · 2017-07-03T15:51:16Z

Ericson2314 - how do we integrate continue into that teaching plan?

Maybe we could describe loops as being implicitly this:

'outer: {
    for i in container.iter() {
        'inner: {
            LOOP BODY
        }
    }
}

Then break means break 'outer and continue means break 'inner. continue 'label really means something like continue 'label_inner.
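A runnable sketch of this teaching model, assuming Rust 1.65 or later. The two functions are hypothetical and compute the same thing: one with ordinary break and continue, the other with only labelled blocks around and inside the loop.

```rust
// Ordinary loop control: sum the odd values, stopping at the first negative one.
fn sum_odds_plain(values: &[i32]) -> i32 {
    let mut total = 0;
    for &v in values {
        if v < 0 {
            break;
        }
        if v % 2 == 0 {
            continue;
        }
        total += v;
    }
    total
}

// The same logic with the implicit blocks written out explicitly.
fn sum_odds_with_blocks(values: &[i32]) -> i32 {
    let mut total = 0;
    'outer: {
        for &v in values {
            'inner: {
                if v < 0 {
                    break 'outer; // plays the role of `break`
                }
                if v % 2 == 0 {
                    break 'inner; // plays the role of `continue`
                }
                total += v;
            }
        }
    }
    total
}
```

Leaving 'inner simply reaches the end of the loop body, so the next iteration starts, which is exactly what continue does.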

mark-i-m · 2018-03-02T18:18:56Z

I think the question is inherently a bit subjective... Does this encourage less-clear-than-it-could-be control flow? (as opposed to an objective question: does this fundamentally introduce new control flow?)

Anyway, I think that the best thing to do now would be to focus on making the feature as good as it can be 🥇

SoniEx2 · 2018-03-02T22:56:40Z

Does this encourage less-clear-than-it-could-be control flow?

I mean, it allows replacing:

loop {
/* ... */
if cond { break val; }
/* ... */
break otherval;
}

with:

'block: {
/* ... */
if cond { break /*'block?*/ val; }
/* ... */
otherval
}

So I mean, if you find the former clearer... ;)

derekdreery · 2019-06-15T14:57:39Z

I'm a big fan of this idea for a number of reasons

People are already doing this in hacky ways (loops that only ever run once etc.). I think this proves the need exists
It's not goto. You can't jump arbitrarily around. goto is sometimes used like this in C, but IMO that's what it should be used for.
It simplifies the language rather than complicating it. This is because it unifies concepts of early return and early ending of loops, so conceptually you can think of them as the same. I always like features that make the language simpler. (I accept this is partially subjective, hopefully I've put forward a convincing argument)

Kimundi · 2019-06-16T09:56:15Z

I also recently had to use the loop hack, so I'd really like to see this feature in Rust.

ijackson · 2021-03-23T18:54:32Z

Because the target of the break is ambiguous, code like the following will produce an error at compile time:
loop {
    'labelled_block: {
        if condition() {
            break;
        }
    }
}

I would strongly prefer this to be permitted, and resolved by only binding labelled breaks to labelled blocks.

Something like this is needed for macros which want to adjust the control flow. I came across one in if_chain. The loop hack can't be used there, because it's not possible to rethrow an inner break. The break can be caught, giving a value (which is possibly ()). But it can't be rethrown because the macro doesn't know whether it's in a for (where break () is forbidden) or a loop returning a non-unit (where break without a value is forbidden).

ciphergoth · 2021-03-23T19:03:43Z

What you want can still be written as

'outer: loop {
    'labelled_block: {
        if condition() {
            break 'outer;
        }
    }
}

Would that meet your needs?
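A self-contained sketch of this workaround, assuming Rust 1.65 or later. The label is spelled 'outer here because a keyword such as loop cannot be used as a label name; the counting logic and the function name are made up for illustration.

```rust
// Counts even indices below `limit`, stopping entirely once `stop_at` is reached.
fn count_until(limit: u32, stop_at: u32) -> u32 {
    let mut seen = 0;
    let mut i = 0;
    'outer: loop {
        'labelled_block: {
            if i == stop_at {
                // Inside a labelled block the break must name its target.
                break 'outer;
            }
            if i % 2 == 1 {
                // Skip the rest of the block, but keep looping.
                break 'labelled_block;
            }
            seen += 1;
        }
        i += 1;
        if i >= limit {
            // Outside the labelled block an unlabelled break is fine.
            break;
        }
    }
    seen
}
```

This works when the person writing the labelled block also controls the loop label, which is the assumption the following reply questions for macro-generated code.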

ijackson · 2021-03-23T19:52:19Z

Would that meet your needs?

No, sadly not. The if_chain macro is a good example here, but there are other similar situations.

Currently

  if_chain! {
    if let Some(a) = get_a();
    if let Some(b) = get_b();
    then { Left((a, b)) }
    else { Right(42) }
  }

expands to (roughly)

  if let Some(a) = get_a() {
    if let Some(b) = get_b() {
      Left((a, b))
    } else {
      Right(42)
    }
  } else {
    Right(42)
  }

This is bad because it repeats the else clause and, worse, the different copies of it can pick up different variable bindings (imagine if there were an a and b in the environment and we said Right((a, b))). I want to expand to something like this:

  'if_chain_outer: {
    'if_chain_else: {
      if let Some(a) = get_a() {
        if let Some(b) = get_b() {
          break 'if_chain_outer Left((a, b));
        } else {
          break 'if_chain_else;
        }
      } else {
        break 'if_chain_else;
      }
    }
    Right(42)
  }
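A runnable sketch of this kind of expansion, assuming Rust 1.65 or later. The Either enum and the function pair_or_default are hypothetical stand-ins (the thread does not say where Left and Right come from), and the two Option parameters replace get_a() and get_b(). The else body is placed after the inner block closes, so that break 'if_chain_else actually reaches it.

```rust
// Hypothetical two-variant type standing in for the Left/Right used in the thread.
#[derive(Debug, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

fn pair_or_default(a: Option<i32>, b: Option<i32>) -> Either<(i32, i32), i32> {
    'if_chain_outer: {
        'if_chain_else: {
            if let Some(a) = a {
                if let Some(b) = b {
                    // Success: skip the else body entirely.
                    break 'if_chain_outer Either::Left((a, b));
                } else {
                    break 'if_chain_else;
                }
            } else {
                break 'if_chain_else;
            }
        }
        // The else body appears exactly once, outside the scope of the
        // pattern bindings introduced by the `if let` chain.
        Either::Right(42)
    }
}
```

Because the else body sits outside the if let scopes, it sees the same bindings no matter which check failed, which is the property the duplicated expansion lacks.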

But this does not work correctly if the macro user writes break. Eg,

  for x in stuff {
    if_chain!{
      let spong = {
        if complicated { break }

According to the current RFC, this break is "ambiguous" and must be rejected. Of course it's not ambiguous in the user's code. The macro just needs that break to bypass all the named blocks.

I don't think having it bypass the named blocks is at all strange or confusing. Perl's named blocks work the same way.

This limitation is a real shame because there are a lot of cool things that can easily be done with macros that are able to define their own control flow like this, but they're lacking this primitive. You can't use the loop trick either for similar reasons to do with break.

SoniEx2 · 2021-03-23T20:00:20Z

This limitation would force user crate to break explicitly. How is that not a good thing? If anything this limitation can be used by macros to prevent user break from conflicting with macro break!

ijackson · 2021-03-23T20:00:42Z

(You might think that the macro could "catch" the user's break by surrounding the user's code with a loop, and then rethrow it. But the macro doesn't know if it is inside a loop, and if so whether that's a for loop where break () is forbidden or a loop loop with a value return where value-less break is forbidden, so it doesn't know whether to emit the rethrowing code.)

SoniEx2 · 2021-03-23T20:02:03Z

Do you want hygienic break?

ijackson · 2021-03-23T20:04:58Z

This limitation would force user crate to break explicitly. How is that not a good thing?

I'm not sure what you mean. The if_chain macro doesn't have anything to do with loops and the user doesn't expect it to "steal" break. In the user's code break is quite clearly referencing the loop, just as it would be if the if were an if rather than an if_chain!.

If anything this limitation can be used by macros to prevent user break from conflicting with macro break!

Err, if the macro is supposed by its function to embed the user's code in a loop, then the user's break ought to refer to the macro's loop. Presumably the macro would provide the user with a way to name the macro-defined loop.

Do you want hygienic break?

Essentially, yes. In fact the named break is already hygienic, I think. The labels are identifiers and I assume they are already hygienic so that if the macro and the user use the same name, it won't be treated as the same identifier.
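A sketch that probes the label-hygiene assumption, written for Rust 1.65 or later. The macro run_in_block and the function which_block are hypothetical. If labels are resolved hygienically, as assumed above, the caller's break 'x refers to the caller's block even though the macro introduces its own 'x.

```rust
// Hypothetical macro: wraps the caller's expression in a block labelled 'x.
macro_rules! run_in_block {
    ($body:expr) => {
        // This 'x belongs to the macro definition; code passed in by the
        // caller is not expected to be able to name it.
        'x: {
            if false {
                break 'x "stopped at the macro's block";
            }
            $body
        }
    };
}

#[allow(unreachable_code)]
fn which_block() -> &'static str {
    'x: {
        // The caller's `break 'x` is written at the call site, so it should
        // resolve to the caller's own 'x block, not the macro's.
        let _caught: &str = run_in_block!(break 'x "reached the caller's block");
        // Only reached if the macro's label had captured the break.
        "captured by the macro's block"
    }
}
```

An unlabelled break has no identifier to carry hygiene information, which is the gap the next paragraph points at.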

The only difficulty is that the "unnamed label" is not hygienic because it inserts itself unconditionally.

SoniEx2 · 2021-03-23T20:10:28Z

Well, luckily wrapping the user code in a label-break-value causes it to error instead of miscompiling, thus alerting the user of how to fix it. That's a win, yeah?

You don't make the unnamed label hygienic. You introduce a mechanism that can invalidate the unnamed label. That mechanism is called label-break-value.

alercah · 2021-03-23T22:01:09Z

I think it might be worth arguing that break () ought to be permitted in for and while.

There's definitely also an argument for hygiene of unlabeled break.

This RFC is not really the right place to argue either, though, especially not in the now-closed RFC thread. This conversation should be on IRLO or Zulip.

scottmcm · 2023-01-18T17:37:31Z

Looks like this predates the "Prior Art" section, but nitroll mentioned that emacs lisp has this, even using 'foo syntax: https://www.gnu.org/software/emacs/manual/html_node/elisp/Catch-and-Throw.html

label-break-value RFC

420b72b

ciphergoth changed the title ~~label-break-value RFC~~ RFC: label-break-value Jun 26, 2017

Fill in start date and link to PR

d4adfa6

Delete superfluous paragraph

9281a8f

link to discussions in discussions instead.

withoutboats added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jun 27, 2017

Fix broken Rust code with proper references

670c1e0

Various other small improvements

nikomatsakis assigned nikomatsakis and scottmcm and unassigned nikomatsakis Jul 6, 2017

rpjohnst mentioned this pull request Apr 14, 2018

RFC: Reserve try for try { .. } block expressions #2388

Merged

Centril mentioned this pull request Apr 21, 2018

if let && bool #2411

Closed

Centril mentioned this pull request May 3, 2018

block-break-value #2430

Closed

SoniEx2 mentioned this pull request May 4, 2018

RFC: throw expressions #2426

Closed

Centril mentioned this pull request May 27, 2018

RFC: Reserve throw and fail as keywords in edition 2018 #2441

Closed

dtolnay mentioned this pull request Sep 2, 2018

Release 0.15 dtolnay/syn#476

Closed

Centril added A-syntax Syntax related proposals & ideas A-control-flow Proposals relating to control flow. A-expressions Term language related proposals & ideas labels Nov 23, 2018

ciphergoth deleted the label-break-value branch February 28, 2019 07:00

RalfJung mentioned this pull request Jan 16, 2020

Inline assembly #2850

Closed

ijackson mentioned this pull request Mar 23, 2021

Fix scope of else clause, and avoid duplication lambda-fairy/if_chain#14

Closed

camsteffen mentioned this pull request Jun 15, 2021

RFC: let-else statements #3137

Merged

Calvin304 mentioned this pull request Feb 7, 2023

clippy::never_loop has false positive with break from labeled blocks rust-lang/rust-clippy#10304

Closed

azam mentioned this pull request Apr 18, 2023

Remove break labels in macros azam/azamcodec-rs#10

Merged

sollyucko mentioned this pull request Sep 16, 2023

[Language] for-else and while-else #3361

Open
