[SUGGESTION] [BUG] The syntax of assignments and initializations should be the same for multiple arguments. #449

Closed
msadeqhe opened this issue May 11, 2023 · 20 comments · Fixed by #487

Comments

@msadeqhe

msadeqhe commented May 11, 2023

Describe it

I don't know whether I'm reporting a bug or a feature request. This is an example:

point: type = {
    operator=: (out this, x: int, y: int) = {}
}

check: (p: point) = {}

main: () = {
    p0: point = (0, 0);
    p0 = (0, 0); // BUG!
    // Use this instead:
    p0 = : point = (0, 0);

    p1: point;
    p1 = (0, 0);
    p1 = (0, 0); // BUG!
    // Use this instead:
    p1 = : point = (0, 0);

    check((0, 0)); // BUG!
    // Use this instead:
    check(: point = (0, 0));
}

I've highlighted the important lines with a BUG! comment.

What is the result?

cppfront generates the following wrong Cpp1 code for main:

auto main() -> int{
    point p0 {0, 0}; 
    p0 = 0, 0;   // BUG!
    // Use this instead:
    p0 = point{0, 0};

    cpp2::deferred_init<point> p1; 
    p1.construct(0, 0);
    p1.value() = 0, 0;// BUG!
    // Use this instead:
    p1.value() = point{0, 0};

    check((0, 0)); // BUG!
    // Use this instead:
    check(point{0, 0});
}

So the second assignments to p0 and p1 don't work. The first call to check doesn't work either, and the type of its argument depends on which check overloads are declared.
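To make the grouping problem concrete, here is a minimal Cpp1 sketch. The `point` struct and both function names are hypothetical stand-ins, not cppfront output. It assumes only standard C++ rules: the comma operator binds more loosely than assignment, so a statement shaped like `p0 = 0, 0;` is an assignment followed by a separate operand.

```cpp
#include <cassert>

// Hypothetical stand-in for the issue's `point`.
struct point {
    int x;
    int y;
    point(int x_, int y_) : x(x_), y(y_) {}
};

// `first = 1, second = 2;` groups as `(first = 1), (second = 2);`.
// The generated `p0 = 0, 0;` has the same shape: an assignment, then a
// separate operand that the comma operator evaluates on its own.
int comma_splits_statement() {
    int first = 0;
    int second = 0;
    first = 1, second = 2;
    return first * 10 + second;
}

// Naming the type sends both arguments to the two-parameter constructor.
point assign_whole_point() {
    point p(9, 9);
    p = point{3, 4};
    assert(p.x == 3 && p.y == 4);
    return p;
}
```

Calling both functions from a `main` is enough to try it; the second one mirrors the `p0 = point{0, 0};` workaround shown above.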

What did I expect to happen?

Let's examine the example. The following lines look the same, so I expected both of them to work:

// ...

p1: point;
/* Many statements are here... */
p1 = (0, 0);
/* Many statements are here too... */
p1 = (0, 0); // BUG!

// ...

The first assignment initializes the object and works, but the second assignment doesn't. That is surprising, at least for novice programmers like me, given that initialization and assignment are unified under operator= in Cpp2.

Solutions

Three options are available.

1. Make it a syntax error. Do not support it at all.

As simple as that:

p0: point = (0, 0); // OK. It calls the constructor.
p0 = (0, 0); // ERROR!

p1: point;
p1 = (0, 0); // OK. It calls the constructor.
p1 = (0, 0); // ERROR!

2. Support constructor call on assignment.

In assignments the type is completely known (unlike function calls, in which positional argument types depend on the declared overloads), so (...) for object construction could be supported without any ambiguity.

I have to mention that we have two kinds of constructors: implicit and explicit (the default). The difference between initializations and assignments is that initializations (first assignments) always work with both implicit and explicit constructors, but assignments (other than the first one) won't work with explicit constructors without extra syntax.

So the previous example for implicit constructor will be like this:

p0: point = (0, 0); // OK. It calls the implicit constructor.
p0 = (0, 0); // OK. It calls the implicit constructor to assign to the variable.

p1: point;
p1 = (0, 0); // OK. It's the first assignment. It calls the implicit constructor.
p1 = (0, 0); // OK. It calls the implicit constructor to assign to the variable.

The above example should work for assignments with implicit constructors, but assignments with explicit constructors require unnamed objects. For an explicit constructor that takes a single argument of type int, it looks like this:

number: type = {
    operator=: (out this, n: int) = {}
}

n0: number = 2;
n0 = 2; // ERROR! It doesn't have implicit constructor.
n0 =: number = 2; // OK. It calls explicit constructor.

Now, that example would be like this for explicit constructors:

p0: point = (0, 0); // OK. It calls the explicit constructor.
p0 =: point = (0, 0); // OK. It calls the explicit constructor to assign to the variable.

p1: point;
p1 = (0, 0); // OK. It's the first assignment. It calls the explicit constructor.
p1 =: point = (0, 0); // OK. It calls the explicit constructor to assign to the variable.

The point of this alternative is that it's consistent with unnamed objects, while the syntax of initialization and assignment will be distinct for explicit constructors (but the same for implicit constructors).

This syntax complements arrays and dictionaries in Cpp2 as described in this comment. Similarly this feature can be supported for compound assignment operators to directly call a constructor.
In this way, we can easily work with arrays, dictionaries and other objects:

// (key, value) are the arguments for object constructor of `std::pair<std::string, int>`.
c0: my::dictionary<int> = [("a", 1), ("b", 2)];

// Now the dictionary has a different value.
// This is an implicit constructor.
c0 = [("a", 10), ("b", 20)];

// Directly call the constructor and add objects to the dictionary.
c0 += [("c", 30)];

Similar rule applies to -=, *=, /= and other compound assignment operators.

3. Make all assignments to implicitly call constructors

The goal of Cpp2 is to unify assignments and initializations (e.g. operator=). For assignment operators, including compound assignment operators (e.g. +=), and for return statements, it wouldn't matter whether a constructor is implicit or explicit:

p0: point = (0, 0); // OK. It calls the explicit/implicit constructor.
p0 = (0, 0); // OK. It calls the explicit/implicit constructor to assign to the variable.

p1: point;
p1 = (0, 0); // OK. It's the first assignment. It calls the explicit/implicit constructor.
p1 = (0, 0); // OK. It calls the explicit/implicit constructor to assign to the variable.

run: () -> point = {
    return (0, 0); // OK. It calls the explicit/implicit constructor and returns.
}

But implicit and explicit constructors behave differently in other expressions and operators, such as passing arguments in function calls. For example:

number: type = {
    operator=: (out this, n: int) = {}
}

point: type = {
    operator=: (out this, x: int, y: int) = {}
}

a: = (: number = 2) + 2; // ERROR! It would work if it had `implicit` constructor.
a: = (: number = 2) + (: number = 2); // OK.

a: = (: point = (0, 0)) + (0, 0); // ERROR! It would work if it had `implicit` constructor.
a: = (: point = (0, 0)) + (: point = (0, 0)); // OK.

call(2); // ERROR! It would work if it had `implicit` constructor.
call(: number = 2); // OK.

call((0, 0)); // ERROR! It would work if it had `implicit` constructor.
call(: point = (0, 0)); // OK.

Conclusion

I think:

  • Option 1 doesn't solve anything. Simply banning a feature doesn't make sense.
  • Option 2 is good enough.
  • Option 3 fits the goal of Cpp2 very well. IMO I like this one.

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?

No.

Will your feature suggestion automate or eliminate X% of current C++ guidance literature?

Yes. Initialization (first assignments) and assignments (except first assignments) are unified with operator= in Cpp2. It leads to surprises at least for novice programmers if visually that's not true.

Edits

  1. I've replaced option 3.
@msadeqhe changed the title from "[SUGGESTION][BUG] Object Initialization and Assignment" to "[SUGGESTION] [BUG] Assignments and Initializations" on May 11, 2023
@JohelEGP
Contributor

    p0: point = (0, 0);
    p0 = (0, 0); // BUG!
    // Use this instead:
    p0 = : point = (0, 0);

We went over this at #401 (comment).
"See also #321".

    check((0, 0)); // BUG!
    // Use this instead:
    check(: point = (0, 0));

AFAIK, you could always use parentheses to initialize a variable, and assign to one (modulo bugs).
ab29f19 added support for return.
I don't remember check((0, 0)); // BUG! ever being supported (or it was always bugged).
So the bug is either that it lowers

  • to a comma operator, rather than being rejected, or
  • to a parenthesized expression list, rather than braces for initialization.

Solutions

Something that's always been on my mind is how implicit (and its lack thereof)
has no effect at all on how you can call the assignment operator.
Seeing your examples makes me think it's OK that it only affects construction.
Cpp2 has operator= for unified construction and assignment, so the = is explicit.
At the end of 3. you point out where implicitness does have an effect.

  1. Make it a syntax error. Do not support it at all.
  2. Support constructor call on assignment.
  3. Make all assignments to implicitly call constructors

Conclusion

I think:

  • Option 1 doesn't solve anything. Simply banning a feature doesn't make sense.
  • Option 2 is good enough.
  • Option 3 fits the goal of Cpp2 very well. IMO I like this one.

I don't think an assignment should implicitly call a constructor.
What if there's no available operator=: (out this, data) for construction?
What if the user explicitly wrote an operator=: (inout this, data);? It won't get called.

@JohelEGP
Contributor

    check((0, 0)); // BUG!
    // Use this instead:
    check(: point = (0, 0));

AFAIK, you could always use parentheses to initialize a variable, and assign to one (modulo bugs).
ab29f19 added support for return.
I don't remember check((0, 0)); // BUG! ever being supported (or it was always bugged).
So the bug is either that it lowers

  • to a comma operator, rather than being rejected, or
  • to a parenthesized expression list, rather than braces for initialization.

Reading #448 (comment) made me realize that the reason these (happen to) work is thanks to the primary-expression '(' expression-list ')'.
https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unambiguous-parsing mentions that

  • Cpp2 has no comma operator, so that removes the possibility of b,c visually being a comma-expression.

In its context, b,c are template arguments, so it's not a comma expression.
But in this example and the linked one, they are lowered to Cpp1 comma expressions.
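As a rough Cpp1 illustration of the two lowerings listed above, the sketch below contrasts a parenthesized comma expression with a braced initializer list as a call argument. The `point` struct is a hypothetical stand-in, and the extra `check(int)` overload is an assumption added only so the comma expression compiles and its effect is observable; the issue's `check` accepts only a `point`.

```cpp
#include <cassert>

// Hypothetical stand-ins for the issue's `point` and `check`.
struct point {
    int x;
    int y;
    point(int x_, int y_) : x(x_), y(y_) {}
};

int check(int value) { return value; }                 // receives one int
int check(const point& p) { return p.x * 10 + p.y; }   // receives a point

// Cpp1 reads the inner parentheses as a comma expression, which evaluates
// the left operand, discards its value, and yields the right operand.
int call_with_parentheses() {
    int evaluated = 0;
    int result = check((evaluated += 1, 7));
    assert(evaluated == 1);   // the left operand did run
    return result;            // the value that reached check(int)
}

// Braces form an initializer list, so the point constructor is selected.
int call_with_braces() {
    return check({1, 2});
}
```

Without the `int` overload, the parenthesized call would simply fail to compile, which matches the `check((0, 0)); // BUG!` line in the generated code.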

@msadeqhe
Author

Thanks for your informative comment, and sorry for my duplicated bug report.

I don't think an assignment should implicitly call a constructor.

operator= is the only way to write either a constructor or an assignment operator in Cpp2. Its goal is to unify constructors and assignments, under the title "Unification of { copy, move, convert } x { construction, assignment }".

So semantically assignment operators are essentially the same as calling constructors. In this way, syntactically (visually) they should be the same too.

What if there's no available operator=: (out this, data) for construction?

It would have only the default constructor. So we can use assignment to set its value with it:

// Initialize it with default constructor
x: something = ();

// Call default constructor to set its value again
x = ();

a: int = 2;

// Set `a` to the default value of `int` which is `0`
a = ();

What if the user explicitly wrote an operator=: (inout this, data);? It won't get called.

This would happen:

// Initialize it with the constructor
x: something = data;

// Call the constructor to set its value again
x = data;

Briefly, anything that we can do in declarations after =, we can do in assignments.

@JohelEGP
Contributor

So semantically assignment operators are essentially the same as calling constructors.

They aren't necessarily. That's why we can still overload them.

What if the user explicitly wrote an operator=: (inout this, data);? It won't get called.

This would happen:

// Initialize it with the constructor
x: something = data;

// Call the constructor to set its value again
x = data;

A reason to overload operator=: (out this, data) with operator=: (inout this, data)
would be to use the existing object for performance reasons.
Your suggestion amounts to #312 (comment).
This way, equivalent Cpp1 code would perform better because it'd actually call the assignment operator.
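The difference between the two paths can be sketched in Cpp1 with a type that counts which member runs. Everything here is hypothetical: the `number` struct, the global counters, and the assumption that `operator=: (inout this, n: int)` corresponds to a Cpp1 `operator=(int)`.

```cpp
#include <cassert>

// Hypothetical counters recording which member function ran.
int constructions = 0;
int direct_assignments = 0;
int move_assignments = 0;

struct number {
    int value;

    explicit number(int n) : value(n) { ++constructions; }

    // Assumed Cpp1 counterpart of `operator=: (inout this, n: int)`.
    number& operator=(int n) {
        value = n;
        ++direct_assignments;
        return *this;
    }

    number& operator=(number&& other) noexcept {
        value = other.value;
        ++move_assignments;
        return *this;
    }
};

void compare_assignment_paths() {
    number n(1);                       // one construction
    n = 2;                             // reuses n through operator=(int)
    assert(constructions == 1 && direct_assignments == 1);

    n = number(3);                     // temporary, then move assignment
    assert(constructions == 2 && move_assignments == 1);
    assert(n.value == 3);
}
```

The first assignment touches only the existing object, while the second pays for an extra construction before the move assignment.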

@JohelEGP
Contributor

I don't know whether I'm reporting a bug or a feature request. This is an example:

Consistent with #451, I believe this should be a bug report.

    check((0, 0)); // BUG!

I think this is the only problem not covered by other issues.

    p0: point = (0, 0);
    p0 = (0, 0); // BUG!
    // Use this instead:
    p0 = : point = (0, 0);

    p1: point;
    p1 = (0, 0);
    p1 = (0, 0); // BUG!
    // Use this instead:
    p1 = : point = (0, 0);

I'd hope the latter // BUG! would be fixed along the first one (by fixing #321).
They each lower to p0 = 0, 0; // BUG! and p1.value() = 0, 0;// BUG!, respectively.
But these could be handled in separate places in cppfront's code.

@JohelEGP
Contributor

A reason to overload operator=: (out this, data) with operator=: (inout this, data)
would be to use the existing object for performance reasons.
Your suggestion amounts to #312 (comment).
This way, equivalent Cpp1 code would perform better because it'd actually call the assignment operator.

An example is std::vector.
With the current semantics, v = ... reuses v's capacity.
With your suggestion, v = ... first constructs a vector, then move assigns to v, losing v's previous capacity.
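A small Cpp1 sketch of the capacity point, using two hypothetical helper functions. It relies on how mainstream standard library implementations behave rather than on a guarantee: the C++ standard does not pin down the exact capacity after either operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assigns a short list into a vector that already owns a large buffer.
std::size_t capacity_after_assignment() {
    std::vector<int> v;
    v.reserve(1000);
    assert(v.capacity() >= 1000);
    v = {1, 2, 3};                   // initializer-list assignment into v
    return v.capacity();
}

// Builds a temporary vector first, then move-assigns it into v.
std::size_t capacity_after_construct_and_move() {
    std::vector<int> v;
    v.reserve(1000);
    assert(v.capacity() >= 1000);
    v = std::vector<int>{1, 2, 3};   // construct, then move assign
    return v.capacity();
}
```

On common implementations the first function still reports the reserved capacity, while the second reports the much smaller capacity of the temporary whose buffer v adopted.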

@msadeqhe
Author

Yes, you're right. I didn't mean to go into implementation details. If operator= with an inout this parameter is defined, it will be used for assignment. If it's not defined, operator= with an out this parameter will be used instead. As you've explained, vector has an overloaded operator=: one with an inout this parameter and another with an out this parameter.

@realgdman

I want to add that I haven't found notes anywhere on how the syntax could look for these major Cpp1 features:

  1. Structured binding [a, b] = foo()
  2. Variadic/parameter pack f(Ts... args) { g(&args...) }

Leaving a note here so it isn't forgotten until much later, when the design will be harder to change.

@JohelEGP
Contributor

Those are not supported yet.

@realgdman

I don't know if this should be a separate issue, but I have found some problems with initialization inside (parameters). I'll put them here; feel free to split if needed.

Some excerpts

  1. Initialization in parameter before block fails to emit proper {} initializer
(copy  te : T =  () ) {}    //error, T te = ;
(           tf : T = (())) {}    //error, cpp2::in<T> tf = ();;
(inout sc : S = (2) ) {}    //error, S& sc = 2;
  3. assignment inside function call doesn't compile
foo(sf2 = S(-2));    //error, foo(tf2.construct(T()));
  5. assignment of return parameter doesn't deduce type
retout2: () -> (n := 2) = { return; } //error - return param must have a type

https://cpp2.godbolt.org/z/W5qThxWfG

@JohelEGP
Contributor

3. assignment inside function call doesn't compile

foo(sf2 = S(-2));		//error, foo(tf2.construct(T()));

That's #368.

5. assignment of return parameter doesn't deduce type

retout2: () -> (n := 2) = { return; } //error - return param must have a type

I don't think that's supposed to work.
You are supposed to assign the anonymous return type's members inside the function's initializer.

@msadeqhe
Author

msadeqhe commented May 12, 2023

I want to add that I haven't found notes anywhere on how the syntax could look for these major Cpp1 features:

  1. Structured binding [a, b] = foo()
  2. Variadic/parameter pack f(Ts... args) { g(&args...) }

Leaving a note here so it isn't forgotten until much later, when the design will be harder to change.

I think structured binding would be like this in Cpp2:

// For declaration:
(a, b): tuple<int, float> = foo();

// For declaration with type deduction:
(a, b): = foo();

// For assignment:
(a, b) = foo();

It works with aggregates too.
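For comparison, this is roughly how the three cases look in today's Cpp1 (C++17 or later). The `foo` function and the `reading` aggregate are hypothetical stand-ins. Note that Cpp1 has no structured-binding syntax for assigning to existing variables, so `std::tie` fills that role here.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-in for the `foo` used in the comment above.
std::tuple<int, float> foo() { return std::make_tuple(1, 2.5f); }

// Hypothetical aggregate, to show that bindings also work without std::tuple.
struct reading {
    int id;
    float value;
};

void bind_and_assign() {
    // Declaration with deduction (C++17 structured binding).
    auto [a, b] = foo();
    assert(a == 1 && b == 2.5f);

    // Assignment to existing variables goes through std::tie instead.
    int c = 0;
    float d = 0.0f;
    std::tie(c, d) = foo();
    assert(c == 1 && d == 2.5f);

    // Aggregates decompose member by member.
    auto [id, value] = reading{7, 0.5f};
    assert(id == 7 && value == 0.5f);
}
```

The proposed `(a, b) = foo();` would cover the `std::tie` case with the same spelling as the declaration forms.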

@AbhinavK00

A reason to overload operator=: (out this, data) with operator=: (inout this, data)
would be to use the existing object for performance reasons.

There should be a way to write a generalized out and inout operator=, no? Because that's how you fully unify {construction×assignment}; otherwise we're writing an inout operator= and an out operator=, which is still writing two functions, like an assignment and a constructor function. Is there any way we could proceed on this?

@JohelEGP
Contributor

This isn't about generation.
You can still write one out this operator= and have it generate inout this operator=.
But the suggestion is to not use assignment, and instead construct and move assign.
That's what I'm pointing out.

@msadeqhe reopened this on May 19, 2023
@msadeqhe
Author

msadeqhe commented May 19, 2023

But the suggestion is to not use assignment, and instead construct and move assign.

In fact, I didn't mean to suggest an implementation detail. Sorry if I couldn't explain my intention well enough.

My suggestion is what @AbhinavK00 pointed out (to have similar syntax for construction and assignment with operator=, option 3 in the suggestion).

@JohelEGP
Contributor

  3. Make all assignments to implicitly call constructors

It seems like option 3 is the same.
To not use operator=: (inout this, ...),
but instead implement non-definite first assignment =
in terms of operator=: (out this, ...) and operator=: (inout this, that).
In Cpp1 terms, to not use an existing non-copy/move assignment,
but instead construct and copy/move assign.

run: () -> point = {
    return (0, 0); // OK. It calls the explicit/implicit constructor and returns.
}

This seems like a departure from Cpp1,
where returning an initializer list can't invoke an explicit constructor.
I can't remember the implications.
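The Cpp1 rule being referred to can be shown with two hypothetical types that differ only in the `explicit` keyword. A braced return is copy-list-initialization, which is ill-formed when it would select an explicit constructor.

```cpp
#include <cassert>

// Hypothetical types; only the `explicit` keyword differs.
struct implicit_point {
    int x;
    int y;
    implicit_point(int x_, int y_) : x(x_), y(y_) {}
};

struct explicit_point {
    int x;
    int y;
    explicit explicit_point(int x_, int y_) : x(x_), y(y_) {}
};

implicit_point make_implicit() {
    return {1, 2};                 // copy-list-initialization: allowed
}

explicit_point make_explicit() {
    // return {1, 2};              // ill-formed: would select an explicit constructor
    return explicit_point{1, 2};   // the type has to be named
}

void compare_returns() {
    assert(make_implicit().x == 1);
    assert(make_explicit().y == 2);
}
```

Option 3's `return (0, 0);` would therefore allow in Cpp2 what the commented-out line forbids in Cpp1.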

@msadeqhe
Author

By constructor I mean operator=, because constructor and assignment are unified with operator= in Cpp2. That's why currently in Cpp2, if we don't declare operator= with inout this, the operator= with out this will be used for assignment.

The whole point of my suggestion (option 3) is about syntax: if constructors and assignments are unified via operator=, the syntax to use them should be the same too. I mean, if I construct an object with this:

x: Type = ("text", 0);

The same syntax should be allowed for assignment:

x = ("text", 0);

My suggestion is that the syntax should be the same, given that internally both constructors and assignments are unified via operator=.

@msadeqhe changed the title from "[SUGGESTION] [BUG] Assignments and Initializations" to "[SUGGESTION] [BUG] The syntax of assignments and initializations should be the same for multiple arguments." on May 19, 2023
@JohelEGP
Contributor

JohelEGP commented May 19, 2023

OK, that makes sense.
I unfortunately misinterpreted the words "assignment" and "constructor" as Cpp1 terms.
As I understand it, Cpp2 only has = or operator=.
And there's no official documentation to say
that an operator=: (out this, ...) is to be called a "constructor", and
that an operator=: (inout this, ...) (possibly generated from the above) is to be called an "assignment".

@JohelEGP
Contributor

JohelEGP commented Jun 2, 2023

    check((0, 0)); // BUG!

I think this is the only problem not covered by other issues.

You might want to open a new suggestion for that
because I think that #487 "fixes" this,
except that as explained,
it actually performs a construction + move assignment.

@JohelEGP
Contributor

I opened #542 for the above.
