syntax flaw: return type #760
Comments
I'm mentioning it for the sake of completeness, not as a serious suggestion, but the ambiguity can be resolved with arbitrary look-ahead / back-tracking. For tooling reasons though you probably want zig's grammar to be LL(1) or LR(1). |
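A minimal sketch of the back-tracking idea mentioned above, written in Python against a toy token stream rather than Zig's real grammar. The names `tokenize` and `parse_fn_backtracking` are hypothetical and only handle declarations of the shape `fn NAME() TYPE {}` or `fn NAME() TYPE{} {}`. The parser first tries to read the braces as part of the return type; if no body follows, it rewinds and reads them as the body.

```python
import re


def tokenize(src):
    # Identifiers/keywords, or any single punctuation character.
    return re.findall(r"[A-Za-z_]\w*|[^\s\w]", src)


def parse_fn_backtracking(src):
    """Return (name, return_type) for a toy `fn` declaration."""
    tokens = tokenize(src)
    pos = 0

    def accept(*expected):
        # Consume the expected tokens if they are next; otherwise leave pos alone.
        nonlocal pos
        if tokens[pos:pos + len(expected)] == list(expected):
            pos += len(expected)
            return True
        return False

    if not accept("fn") or pos >= len(tokens):
        raise SyntaxError("expected `fn NAME`")
    name = tokens[pos]
    pos += 1
    if not accept("(", ")") or pos >= len(tokens):
        raise SyntaxError("expected `()` followed by a return type")
    ret = tokens[pos]
    pos += 1

    saved = pos  # remember where the bare return type ended

    # Attempt 1: `{}` belongs to the return type (struct literal), body follows.
    if accept("{", "}") and accept("{", "}") and pos == len(tokens):
        return name, ret + "{}"

    pos = saved  # back-track

    # Attempt 2: `{}` is the function body.
    if accept("{", "}") and pos == len(tokens):
        return name, ret

    raise SyntaxError("could not parse function declaration")
```

The cost is that the parser may have to re-read an arbitrarily long span before it knows which reading was right, which is the tooling concern behind preferring an LL(1) or LR(1) grammar.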
As an aside, I do like having the That said, for this issue it seems the problem is figuring out where the type declaration ends and the body begins. The main flaw is that Possible solutions I see:
I don't have other ideas at the moment. |
@hasenj Your first suggestion is pretty ok if we can declutter it a bit. With number 3, readability can be hindered by forcing everything to have a name/alias. Number 2 seems like a good compromise. As an iteration on the first, using
|
@raulgrell taking your
if x + y > 10 => {
} else {
}
if foo() => |value| {
} else |err| {
}
while it : |node| it = node.next => {
}
while x < 10 : x += 1 => {
}
switch expressions already match this pattern. I'm not so sure about this. I'd like to explore changing container initialization syntax and container declaration syntax to disambiguate. |
To clarify:
|
I'd love to rid The curly brackets for container initialization are common in C, C++, Rust, C#. On the other hand, of those, only Rust allows the omission of the parentheses in if statements. Alternative container initialization methods include:
|
This is the cleanest iteration of the above ideas I have found, using the new keyword and changing declaration syntax. Declaration syntax changes to use parentheses instead of curly brackets, and square brackets instead of parentheses. Initialization requires
Actually, with these changes to declaration, I think initialization syntax can stay the same. I'm just not sure about the "struct literal A()" bit.
|
Here's my proposal:
const Foo = struct.{
    a: i32,
    b: i32,
};
const Id = enum.{
    One,
    Two,
};
const Blah = union(enum).{
    Derp,
    Bar: struct.{
        aoeu: f64,
    },
};
const FileError = error.{
    OutOfMemory,
    AccessDenied,
};
test "aoeu" {
    var array = []u8.{'a', 'b', 'c', 'd'};
    var x = Foo.{
        .a = 1234,
        .b = 5678,
    };
    if x.a == 1234 {
        while true {
            eatStinkyCheese();
        }
    }
}
This fixes all the parsing ambiguity. Now |
It looks a little awkward but I guess that's no reason not to do it. That said, I didn't see discussion on reintroducing |
Not really. I removed
|
Ah right. And mandating parens around the return type makes for bad error messages if you forget them. I see the issue now. |
I was wondering whether it would be beneficial to change the built in field/reflection syntax from using the regular scope accessor
Then keep your proposal and make container instantiation requiring a
|
Ok with the proposal, but:
Do we still need dot a, dot b here? We already have a dot after Foo. |
Only because this was mentioned in a separate issue, I just wanted to voice my support for your most recent proposal - disregard my Regarding @jido's point about the dot (ha, pun), I think they are consistent with the rest of the language.
If we consider the enum type inference in switches proposed in #683, they mention issues with shadowing without the
|
Thanks for doing that work, @Hejsil . To those who prefer the previous syntax, I empathize because I also prefer the previous syntax. However parsing ambiguities are simply not acceptable. And so we're stuck with the least worst solution. Proposals for different syntax/solutions are welcome, provided that they are better than status quo and do not introduce any syntax ambiguities. |
@andrewrk as I mentioned in my previous comment in #1628, using the dot here harms readability: it is usually used as a member access operator, and here it is overloaded with two extra meanings. If I'm getting this right, the problem is to resolve the ambiguity of expressions like this:
fn foo() A {}
fn foo() error{}!void {}
All we need is to surround the return type with symbols that are not used anywhere else. How about
fn foo() `A {}`
fn foo() `error{}!void` {}
I much prefer to require |
@allochi I'm going to amend your proposal and change it from surrounding the return type with backticks to surrounding it with parentheses. I think that's actually a reasonable proposal. If we end up doing named return values to solve #287, then parentheses around return values is especially reasonable. |
How about in case of ambiguity use a dot?
fn foo() A {} // function returns A
Now everything stays the same for the most part, assuming this works |
Here's how I'm thinking about it: {} is always the body of what's to the left of it.
const A = struct {} // works fine
The function decl parsing code is special cased to expect the {} to always refer to its body, and it must be escaped if you want to refer to something else.
fn foo() A.{} {} |
@UniqueID1 It is encouraged/good style to use error sets more than |
If the standalone |
@Hejsil Wanting to have (subjectively) uglier syntax for encouragement reasons is a valid viewpoint, if I'm not simplifying your viewpoint too much. I think it's in a way risky. I think it'll encourage people to switch over from the global error set at the wrong time for the wrong reason. When I say "at the wrong time" I'm channeling my inner Jonathan Blow who has a way of writing really imperfect code at the right time which works for him to stay hyper productive. So if you're not handling errors yet, and you might not for years, then you'll use the global error set to keep yourself from going off on a tangent such as having to define the errors correctly before you'll ever need to handle them. |
@UniqueID1 I see your concern, but I'm pretty sure inferred error sets cover 99% of this use case (fast prototyping), because in 99% of cases you (or the compiler) know which errors can be returned from a function. If you don't care, then just add |
Ahh okay, also anyerror is very descriptive so that's a plus for it. It's actually not bad at all and it tells the newbies exactly what it does. |
Alright, seems like we might have found a better solution than what was merged in #1628. Any objections should be voiced as soon as possible, as I wanna revert #1628, and start implementing the proposed solution. One downside to this is that we have to reject #1659. |
This doesn't seem right. Everything else about the proposed TypeExpr seems good, but I think this example should fail to parse. The formal grammar specification for Zig isn't always up to date, but it might be able to shed light on this if it's updated with this proposal. This is also not very important and shouldn't block any forward progress. This is just something to iron out before 1.0.0. |
I don't think we can have that example fail as simple cases like My plan was to work on #208, but it seems we still haven't reached a conclusion on this (and this issue seemed to be related, so I switched my focus to this). I'll probably work towards having the grammar up to date, as that will help with the stage2 parser rewrite when that is gonna happen. Then add some kind of test that ensures that it stays up to date (have the grammar as a bison parser, and run this parser on all Zig src code in this repo on every commit). This will also help when prototyping new syntax. |
@Hejsil, I really like this proposal (#760 (comment)) a lot better than the one using dot to disambiguate. 😁 I don't love the Is there a place for discussing the error naming or semantics? Please feel free to disregard, just my two cents as an outside observer. But maybe there is an even better solution here. |
@binary132 docs and error set issue :) |
@Hejsil thanks a lot for being so communicative and agile toward this issue. I'm glad that we are getting to agree on a solution for this. I don't mind parentheses in if, for, ... I admit I got used to not using them in Go, for example, but I don't mind them in other languages, and since these are expressions in Zig it kind of works well. Take for example this from the standard library:
const n = if (self.pos + bytes.len <= self.slice.len)
bytes.len
else
self.slice.len - self.pos;
// -- vs --
const n = if self.pos + bytes.len <= self.slice.len
bytes.len
else
self.slice.len - self.pos;
I don't know about the others, but with parentheses it seems to be easier to read. Now regarding the global error set, why name it |
@allochi There are a few reasons for
|
@Hejsil Thanks, no strong feeling against |
const A = struct {}; // a struct |
@UniqueID1 Sorry, that was a confusing statement. I'm talking about this rename:
We could even rename both examples:
But that's not the goal of this issue. |
Or maybe best:
// Leave `error` as is. This allows error.A to work without parser hacks
fn a() error { return error.A; }
// Rename `error` for error sets
const E = errorset {A, B}; |
@Hejsil Yeah, this is what I was thinking of too, it makes sense this way. |
That was my proposal.
Edit: ..maybe, I don't know. I'm confused haha. |
Let me get this straight. The new keyword If that is correct, then I think there is no need for
|
I believe that there are 3 options
I prefer error + errorset over anyerror + error. However, with functions accepting TypeExpr instead of Expr, I believe we could also just have the error keyword and use parentheses in some cases as shown above. Edit: I see that @andrewrk finds the parentheses unfortunate (first post, 4th example). My bad Hejsil, forgot about that. In that case having two keywords might be better, and it might be clearer overall anyways. Personally I love error + errorset. Whatever the case, Hejsil's change with functions accepting TypeExpr is absolutely excellent. Super cool change! |
I agree with @UniqueID1. If there's an ambiguity in parsing inline return types in fn signatures, the right way to resolve the ambiguity is by explicit grouping. ISTM this would only be necessary when declaring a type inline in a fn signature's return type, and it makes it more readable, so the increased write-time effort is minimal compared to the benefits. a) It does not reduce readability overall, and in fact enhances it. Otherwise, fn signatures can read like a run-on sentence. Easy for humans to parse from left to right. |
The |
Fundamentally we have this ambiguity:
Is this a function that returns A with an empty body? Or is this a function that returns the struct literal A{} and we're about to parse the body?
Another example
Function that returns an error, or empty error set, and we're about to find the function body?
The above is not allowed because zig thinks the return value is error and the function has an empty body, and then is surprised by the !. So we have to do this:
Unfortunate. And the compile errors for forgetting to put a return type on a function point at the wrong place.
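The ambiguity can be reproduced with a toy parser. This is a hedged sketch in Python, not Zig's actual parser: `tokenize`, `parse_fn`, and the `braces_extend_type` flag are hypothetical names introduced for illustration. With the flag set, the return type is read greedily like a general expression, so `A {}` becomes a struct literal and the parser then fails to find a body. With the flag cleared, the return type stops at the first `{`.

```python
import re


def tokenize(src):
    # Identifiers/keywords, or any single punctuation character.
    return re.findall(r"[A-Za-z_]\w*|[^\s\w]", src)


def parse_fn(tokens, braces_extend_type):
    """Parse `fn NAME ( ) RETURN_TYPE { }` and return (name, return_type).

    braces_extend_type=True  : `A {}` after `()` is read as a struct literal.
    braces_extend_type=False : the return type ends before the first `{`.
    """
    pos = 0

    def expect(tok):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != tok:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {tok!r}, found {found!r}")
        pos += 1

    expect("fn")
    name = tokens[pos]
    pos += 1
    expect("(")
    expect(")")
    ret = tokens[pos]
    pos += 1

    # Greedy mode: swallow a following `{}` into the return type.
    if braces_extend_type and tokens[pos:pos + 2] == ["{", "}"]:
        ret += "{}"
        pos += 2

    # Whatever is left must be the (empty) function body.
    expect("{")
    expect("}")
    return name, ret
```

Calling `parse_fn(tokenize("fn foo() A {}"), braces_extend_type=True)` raises a `SyntaxError` because the only braces were consumed as part of the type, while the same input with `braces_extend_type=False` yields `("foo", "A")`. The error surfaces at the end of the input rather than at the return type, which mirrors the complaint about compile errors pointing at the wrong place.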
Let's come up with a better syntax for return types. All of these issues are potentially on the table:
-> or => if it helps.
void return types.
fn foo() !T {} - is special syntax that belongs to a function definition that marks the function as having an inferred error set. It doesn't necessarily have to be specified as it currently is by leaving off the error set of the ! binary operator.
The syntax change should be satisfying, let's make this the last time everybody has to update all their return type code.