Introduce expression-scoped variables #8019

Closed
ikskuh opened this issue Feb 15, 2021 · 28 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@ikskuh
Contributor

ikskuh commented Feb 15, 2021

Introduce expression-scoped variables

Original idea by @marler8997 in IRC.

Addresses #5070 and kinda addresses #358 as well.

Idea

The core idea of this proposal is to introduce a new expression type with:

with(var i: usize = 0) while(i < 10) : (i += 1) {
    std.log.info("counter = {d}", .{ i });
}

The with expression allows an in-expression declaration of a scoped variable. The code would be equivalent to

{
    var i: usize = 0;
    while(i < 10) : (i += 1) {
        std.log.info("counter = {d}", .{ i });
    }
}

The idea of this is to create a way to declare variables that are explicitly scoped just to be used in a single expression or statement like most counter variables in a while or for loop.

Syntax

with follows the same syntax rules as while or if, allowing top-level use as well as expression use:

with(var i: usize = 0) while(i < 10) : (i += 1) {
    std.log.info("counter = {d}", .{ i });
}

var manhattan_distance = with(var dist = a.sub(b)) dist.x + dist.y;

Semantics

with accepts a single variable declaration and publishes the declared variable into the expression block.
The declared variable will not be visible outside the scope of with, rendering this code impossible:

<source>:8:28: error: use of undeclared identifier 'a'
    var x = (with(var a = 0) a) + a;
                                  ^

The result of with will be the result of the enclosed expression and will be passed through. with will not have any effect on the returned value.
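
For comparison, here is a minimal sketch of how the expression form can be written in status-quo Zig with a labeled block. `Vec2`, `sub` and `coordinateSum` are hypothetical names chosen for illustration; they are not part of the proposal or of the standard library.

```zig
const std = @import("std");

// Hypothetical 2D vector type, used only for this illustration.
const Vec2 = struct {
    x: i32,
    y: i32,

    fn sub(a: Vec2, b: Vec2) Vec2 {
        return .{ .x = a.x - b.x, .y = a.y - b.y };
    }
};

// Status-quo counterpart of `with(const dist = a.sub(b)) dist.x + dist.y`:
// the labeled block keeps `dist` out of the enclosing scope and passes
// the value of the inner expression through via `break :blk`.
fn coordinateSum(a: Vec2, b: Vec2) i32 {
    const sum = blk: {
        const dist = a.sub(b);
        break :blk dist.x + dist.y;
    };
    // `dist` is not visible here; only `sum` is.
    return sum;
}
```

The label and the `break` are the extra ceremony that with would remove; the scoping behaviour is the same.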

Usage Examples

Encapsulating a counter variable:

with(var i: usize = 0) while(i < 10) : (i += 1) {
    std.log.info("counter = {d}", .{ i });
}

Encapsulating an iterator state:

with(var iter = hash_map.iterator()) while(iter.next()) |item| {
    std.log.info("key = {s},\tvalue = {d}", .{ item.key, item.value });
}

Encapsulating iterator state inside of an expression:

const contains_foo = with(var iter = std.mem.split(u8, string, " ")) while(iter.next()) |item| {
    if(std.mem.eql(u8, item, "foo"))
        break true;
} else false;
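
As a point of reference, the same check can be sketched in status-quo Zig with a labeled block. This assumes a recent Zig release in which `std.mem.splitScalar` is available (older releases spelled it `std.mem.split`); `containsFoo` is a hypothetical helper name.

```zig
const std = @import("std");

// Returns true if any space-separated word in `string` equals "foo".
// The iterator lives only inside the labeled block, which is what the
// proposed `with` form would express without the label and `break`.
fn containsFoo(string: []const u8) bool {
    return blk: {
        var iter = std.mem.splitScalar(u8, string, ' ');
        break :blk while (iter.next()) |item| {
            if (std.mem.eql(u8, item, "foo")) break true;
        } else false;
    };
}
```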

Minimal scope for dummy variables:

const Window = opaque{};

extern fn getSize(window: *Window, width: *c_int, height: *c_int) void;

fn getWidth(window: *Window) c_int {
  var width: c_int = undefined;
  with(var dummy: c_int = undefined)
    getSize(window, &width, &dummy);
  return width;
}

I know this example isn't a good showcase, but there is enough real-world C code that requires pointers to be passed in, even when they only point to dummy variables, and with reduces the scope those dummies spill into to the absolute minimum.

Inline-reuse of return values:

takesManyArgs(
  ...,
  with(const x = computeX()) x * x, // this prevents spilling `x` into the outer scope
  ...,
);

Explicit variable scope:

with(var offset: i32) {
  // offset is *meant* to be only visible to this scope
  // and is not just an accidental temp variable in this scope.
}

Why?

with is meant to be more ergonomic than introducing a block that just encapsulates a single variable that might even need a label, break statement and such.

Spilling variables into the outer scope introduces bugs (missed variable re-initialization, usage of wrong loop variables) which are hard to detect.

Most people won't introduce a new scope, as it both looks weird (personal taste) and is hard to type, so the scope is mostly left out and the variables spill.

It also communicates the idea that the encapsulated variable is meant to be scoped to the expression or block.
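
The "missed variable re-initialization" bug mentioned above can be sketched in status-quo Zig as follows. Both functions are hypothetical and only meant to illustrate the failure mode, not taken from real code.

```zig
const std = @import("std");

// Counters declared at function scope: `col` spills out of the inner loop.
fn countCellsBuggy(rows: usize, cols: usize) usize {
    var visited: usize = 0;
    var row: usize = 0;
    var col: usize = 0;
    while (row < rows) : (row += 1) {
        // BUG: `col` is never reset, so only the first row is walked.
        while (col < cols) : (col += 1) visited += 1;
    }
    return visited;
}

// Each counter is declared in the tightest scope that needs it.
fn countCellsScoped(rows: usize, cols: usize) usize {
    var visited: usize = 0;
    {
        var row: usize = 0;
        while (row < rows) : (row += 1) {
            var col: usize = 0; // fresh counter for every row
            while (col < cols) : (col += 1) visited += 1;
        }
    }
    return visited;
}
```

The scoped version needs the extra block around `row` purely for scoping, which is the pattern with is meant to replace.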

Variants

Using capture syntax instead of declaration syntax

This would affect the syntax and would use elements from other parts of the language

with(@as(usize, 0)) |*i| while(i.* < 10) : (i.* += 1) {
  …
}

Thanks to @fengb for proposing this

Abusecases

This section lists ways in which this feature can be abused.

Replacement for @as

This one is kinda absurd, but it would work with the proposed version:

fn printExprType(val: anytype) void {
  @compileLog(@TypeOf(val));
}

comptime {
  printExprType(@as(u32, 0));
  printExprType(with(var t: u32 = 0) t);
}

Nested withs

This is something that could happen:

var z = with(var x: usize = 10)
          with(var y: usize = 20)
            foo(x, y);
@Vexu Vexu added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Feb 15, 2021
@Vexu Vexu added this to the 0.8.0 milestone Feb 15, 2021
@cryptocode
Contributor

cryptocode commented Feb 15, 2021

I think this is pretty verbose compared to the C equivalent for loop, but I would certainly accept it to solve the leaking-into-parent-scope problem. It's also more general, which is good.

with accepts a single variable declaration and publishes the declared variable into the expression block.

Is there a technical reason why it's limited to only a single declaration? It's not uncommon to do things like for(int x=0, y=0; ...) in C. The nested with-variation works, but that's a lot of code in comparison.

@ikskuh
Contributor Author

ikskuh commented Feb 15, 2021

Is there a technical reason why it's limited to only a single declaration?

No, that is just a limitation to restrict the use of this feature. You can always fall back to normal named blocks

I think this is pretty verbose compared to the C equivalent for loop, but I would certainly accept it to solve the leaking-into-parent-scope problem. It's also more general, which is good.

Yeah, that's why i made a proper proposal out of this. I think it could work out and fit the language, but i'm not 100% sure

@Guigui220D

It's the best fix i've seen for the problem yet and I like that it's general too, it's pretty elegant in my opinion

@cryptocode
Contributor

No, that is just a limitation to restrict the use of this feature.

I think allowing only declarations is the correct restriction, allowing multiple declarations makes it more useful without encouraging misuse.

@cryptocode
Contributor

cryptocode commented Feb 15, 2021

I just had a prophetic dream that this will be rejected for being syntactic sugar for { var x = 0; ... }

That's the most likely outcome :) But try doing that for nested loops - nobody does it, because it's too ugly even when formatted properly (and these bugs tend to happen after changes where scoping wasn't originally required)

@uael

uael commented Feb 15, 2021

What about this one? Pretty straightforward:

let x: usize = 0, y: usize = 0 in <...>

Or reuse var and just add in:

var i: usize = 0 in while(i < 10) : (i += 1) {
    std.log.info("counter = {d}", .{ i });
}

@Mouvedia

Mouvedia commented Feb 15, 2021

I am with @uael on this, most programmers are more familiar with let foo: type = bar in.


I suppose with … in while() {} is a shorthand for with … in { while() {} }?

@czrptr

czrptr commented Feb 16, 2021

I really like this idea and I think

with(<expr>) |<name>| { ... }

looks more in line with current syntax than

with(var <name> = <expr>) { ... }

@marler8997
Contributor

This with construct could also be extended to work on generic functions. It could be used to alias complex types built from one or more generic types that need to appear in multiple places, like this:

with (const T = []const [*:0]makeACoolType(a, doSomething(y)))
fn foo(a: anytype, b: anytype, c: T, z: T) T {
     ...
}

@marler8997
Contributor

I think allowing only declarations is the correct restriction, allowing multiple declarations makes it more useful without encouraging misuse.

@cryptocode You could always do this:

with (var a = 2) with (var b = 3) ...

With @czrptr's suggestion it would look like this:

with (2) |a| with (3) |b| ...

@cryptocode
Contributor

@marler8997 yeah that's more than good enough if there are reasons not to allow the (to me) more natural:

with (var a=2, var b=3) ... 

(still only declarations)

@ghost

ghost commented Feb 16, 2021

@czrptr,
with(<expr>) |<name>| { ... } doesn't have a place for specifying the type, if you have to, which makes it inferior to the originally proposed syntax, IMO.

Edit: The alternative syntax also doesn't allow a distinction between var and const.

@czrptr

czrptr commented Feb 17, 2021

@zzyxyzz
The type wouldn't be needed in most cases I think but the constness problem is something I didn't think about. I do agree now that the first proposed syntax is better.

@mattnite
Contributor

I'm for the fengb variant with (...) |...|. We could handle mutability with *, similar to other uses of captures. @as() can set the type, and multiple values could be done:

with (.{ .a = @as(u32, 5), .b = "str"}) |*data| while (data.a > 0) : (data.a -= 1) {
    std.log.err("{s}: {d}", .{data.b, data.a});
}

After writing that out I'm a bit meh about it, just trying to keep the number of syntax rules in my brain down by trying to make it similar to other patterns we already have.

@ghost

ghost commented Feb 17, 2021

@mattnite,
This syntax is indeed more consistent with the status quo, but the cast and the pointer dereference add visual noise, and reduce the ergonomic benefits of this proposal to the point where it's no longer clear whether there is a benefit. For example, I'm not sure I would prefer

with (@as(i32, 0)) |*i| while (i.* < 10) : (i.* += 1)
    std.log.info("counter = {d}", .{ i.* });

to

{var i: i32 = 0; while (i < 10) : (i += 1)
    std.log.info("counter = {d}", .{ i });
}

or even

{
    var i: i32 = 0;
    while (i < 10) : (i += 1)
        std.log.info("counter = {d}", .{ i });
}

let alone C's

for (int i = 0; i < 10; ++i)
    std.log.info("counter = {d}", .{ i }); // taking liberties here

or D's

foreach (i; 0 .. 10)
    std.log.info("counter = {d}", .{ i }); // ... and here

My 2¢.

@marler8997
Contributor

marler8997 commented Feb 17, 2021

If we squint our eyes and twist our brains we can kind of massage an if statement to do something similar to the proposed with statement:

// this is just sugar to make any value optional so we can use it inside an `if` statement
fn with(x: anytype) ?@TypeOf(x) {
    return x;
}

pub fn main() !void {
    if (with(@as(usize, 0))) |*i| while (i.* < 10) : (i.* += 1) {
        @import("std").log.info("counter = {d}", .{ i.* });
    };
}

One big difference with this is that if always creates a const value; so if we want our resulting symbol to be mutable we have to declare our result as a pointer type rather than a value type (i.e. |*i| instead of just |i|).

This brings up another question: maybe our if/while/for constructs should support mutable values, like this:

var optional_value : ?usize = 0;
if (optional_value) |var value| {
    // value is a mutable copy of the payload of `optional_value`
    value += 1; // OK, doesn't affect 'optional_value'
}

Maybe this is an area that should see some exploration. I haven't seen much on it, has anyone else? Taking it a step further would allow an explicit type to be declared:

if (opt_val) |var val : usize| { 
}

@ghost

ghost commented Feb 17, 2021

Interesting idea, but I'm afraid that

if (with(@as(usize, 0))) |*i| while (i.* <= 10) : (i.* += 1) {
    std.debug.print("{}\n", .{ i.* });
}

is only the second-coolest way ever of printing the numbers from 0 to 10. The gold medal still belongs to brainfuck, even though the solution is considerably shorter:

-[>+<-----]>---<++++++++++<++++++++++[>>.+<.<-]>>---------.-.

@mrakh
Contributor

mrakh commented Feb 17, 2021

This construct would require a new keyword and context-sensitive declaration parsing, for what is essentially syntax sugar that might save three lines, at best. I'm not a big fan of it.

@gonzus
Contributor

gonzus commented Feb 18, 2021

If you could add several names to a declaration, with something like var x: usize = 1, y: isize = 2;, you could use the original proposal to declare more than one variable in the with statement.

On the other hand, this would not solve the issue of declaring both var and const elements inside with.

@ghost

ghost commented Feb 18, 2021

Another small note concerning notation: The "let/var/const ... in" variation has the problem that you could do things like this:

var a = const b = var c = const d = foo() in d in c in b;

which is equivalent to var a = foo(); This is a reasonably natural style in ML-derived languages, especially when formatted like this:

let a =
    let b =
        let c =
            let d = foo()
            in d
        in c
    in b;

, but allowing this in Zig would be problematic. I really think that the original with(var name: type = value) expr is the only reasonable syntax that covers all intended use cases.

@uael

@raulgrell
Contributor

raulgrell commented Feb 18, 2021

I love this feature, thanks for the comprehensive proposal!

I've wanted to propose something like the with(expr) |name| { ... } form for a long time, but it either requires an awkward @as(T, 0) or declaring the type in the capture if the expression is a literal.

An alternative is with (Type) |name| : (expression) {...} where name is initialized to expression where the : (expression) is optional, otherwise initializing name to undefined. Forgetting to include the value is probably a footgun, but it should be caught in debug builds at least.

with (usize) |i| : (0) while (i < 10) : (i+= 1) {
    std.log.info("counter = {d}", .{ i });
}

My original motivation was from before return location semantics so I wanted an explicit way to initialize a struct field to point to another field. It also enabled the kind of "anonymous block" that was proposed before and rejected because named return blocks are sufficient.

const scanner = with (Scanner) |s| {
    s.buffer = std.mem.zeroes([256]u8);
    s.current= &s.buffer[0];
    while (s.current.* < ' ') s.current += 1;
}

as an alternative to

const scanner = blk: {
    var s: Scanner = undefined;
    s.buffer = std.mem.zeroes([256]u8);
    s.current= &s.buffer[0];
    while (s.current.* < ' ') s.current += 1;
    break :blk s;
};

This would be the use-case for the optional init expression, but we could require : (undefined) for the sake of explicitness.

@raulgrell
Contributor

I am also a fan of short variable names for complicated or repeated expressions. Often my capture variables are just single letters because it's obvious what it refers to. It's a double edged sword - less noise is easier to read, but you lose semantic hints from the names. This could be a nice way of creating scopes for temporary aliases. It can also be an interesting way to create "contexts" if we want the expression-statement ability without breaking from a block by adding a : (result_expression)

const image = with (var r = self.renderer.begin()) : (r.end())  {
    r.clear();
    for (objects) |o| r.submit(o);
}

Which I guess would be the equivalent of something like

const image = blk: {
    var r = self.renderer.begin();
    defer break :blk r.end();  
    r.clear();
    for (objects) |o| r.submit(o);
}

@matu3ba
Contributor

matu3ba commented Mar 19, 2021

I don't like enabling people to squeeze too much stuff onto one line, because it hurts readability (to me) a lot.

It also reduces editing/refactoring simplicity, because you can't just move or copy the var x: usize = 0;.

One nice thing about zig is for example that loops require an explicit annotation of the loop precondition, so there is less clutter on one line to search.

@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 23, 2021
@iamfraggle

iamfraggle commented Dec 31, 2021

An additional reason for the 'why?' of this is that indentation levels are a limited resource, and the current idiom for while iterators consumes one simply for variable initialisation.

Given existing code, I'm surprised that no one has any examples that keep the with clause on a separate line like:

with (var x: u8 = 0)
while (x < 10) : (x += 1) {
    // code...
}

The last example usage "Explicit variable scope" is, I think, quite useful for explicit resource management if with can accept multiple statements (mostly because I have a deep prejudice against context-less blocks):

with ({ var a = Resource.init(); defer a.deinit(); }) {
    // code...
}

This might(?) also apply to some while usages.

@ghost

ghost commented Feb 6, 2022

While this might be less cumbersome than another scope, it is still strictly more typing than leaking the index. It’s the right direction, but it doesn’t move the point of minimum effort, so it’s not a solution.

Also, the example puts one line strictly within the scope of the previous line, with no indentation. There was at one point a compile error proposed for that kind of thing, because it will be missed easily when skimming. The solution applied in the rest of the language for such cases is indentation, which is a key thing that this proposal aims to prevent.

I really don’t think this is worth the effort.

@jibal

jibal commented Feb 27, 2022

I suggest

with var x: u8 = 0: if the parser can handle it. (I don't think there's any ambiguity or problem with diagnostics because a ':' in a declaration must always be followed by a type.) I think this is more natural and it makes it clear that the declaration applies to whatever follows it.

Another possibility, of course, is to do what C++ and other languages have done and allow putting declarations inside of constructs that start a scope, e.g., while (var i: usize = 0; i <= max) and ditto for if, for, even switch.

@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022
@deflock

deflock commented Dec 29, 2022

In Pascal and Nix the with keyword is used for a similar purpose, to "inject" something into a scope, though there it is usually applied to constructs that are already declared and initialized. A lot of people have mixed feelings about it, and it is considered bad practice in both languages.

I believe the main problem with it is that at the beginning you think this construction will be used only near your "small" block where you control everything, but after that someone starts doing some really crazy things by using this with with @import(), complex structs, or somewhere 5 screens above, etc. and now you no longer understand what your current scope is :)

Otherwise if this is going to be accepted I'd propose to rename it to something less confusing like bind(), scoped(), declare().

PS. Sorry if this comment makes no sense in the context of Zig, I don't have much experience and am just digging through different proposals

@Nairou

Nairou commented Jan 5, 2023

I would like to add that the C# language does something similar via the using keyword. Theirs does a little more than just defining a variable, but close enough for recognition of intent.

@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023
@andrewrk andrewrk modified the milestones: 0.13.0, 0.12.0 Jul 9, 2023
@andrewrk andrewrk closed this as not planned Oct 30, 2023
@andrewrk andrewrk modified the milestones: 0.13.0, 0.12.0 Oct 30, 2023