Implement first-class functions #468

Merged
merged 58 commits into from
Jan 17, 2023

Contributor

@jfecher jfecher commented Nov 14, 2022

Related issue(s)

In progress of #467, but does not resolve it as this PR contains only internal/refactoring changes required to add lambdas and function types.

Summary of changes

Implements first-class functions in noir on top of PR #462. The majority of the changes revolve around changing call expressions to accept any expression in the function position, and allowing arbitrary variables to refer to functions.

Test additions / changes

None, waiting for #467 to be completely resolved to test lambdas and first class functions.

Checklist

  • I have tested the changes locally.
  • I have formatted the changes with Prettier and/or cargo fmt with default settings.
  • I have linked this PR to the issue(s) that it resolves.
  • I have reviewed the changes on GitHub, line by line.
  • I have ensured all changes are covered in the description.

Additional context

This PR is mostly internal changes; we can have another PR to implement lambdas and function types.

Contributor

@guipublic guipublic left a comment

My feedback regarding #462 still applies here: the frontend should not modify the control flow.

@jfecher jfecher mentioned this pull request Nov 28, 2022
Contributor

@guipublic guipublic left a comment

Overall it's a good start but it needs some rework.
I would start by merging builtin and standard functions so that they are mostly treated the same by the SSA pass: a builtin function would get a FuncId and a FuncIndex.
For instance, the FuncIndex could match the OPCODE so we can easily know when a function is a builtin.
Then I would remove FunctionObj/BuiltinObj and directly use a NodeId of type Field, which would represent a function pointer, i.e. its value is equal to the FuncIndex it references. Being a NodeObj, it will be properly handled by the SSA throughout all the passes.
Finally, we would need to implement "recursive inlining" in the inlining process of main, i.e. when we reach a call instruction, we evaluate the function pointer and inline the function call before processing the other instructions (and return an unimplemented!() error if we cannot evaluate it).

This mechanism could easily be extended later on to support any function pointers (i.e. not only the ones known at compile time).
n.b. If you want I can take it from here.
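
The "recursive inlining" step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the noirc_evaluator code: `Instr`, `FuncPtr`, `inline_all`, and the `depth` limit are all hypothetical names introduced here, and a string error stands in for the `unimplemented!()` case.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified instruction set.
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    /// Any non-call instruction, identified only by a label.
    Op(String),
    /// A call through a function pointer.
    Call(FuncPtr),
}

/// A function pointer whose value is the index of the function it references.
#[derive(Debug, Clone, PartialEq)]
pub enum FuncPtr {
    /// The index is known at compile time.
    Known(u32),
    /// The pointer cannot be evaluated at compile time.
    Unknown,
}

/// Inlines every call in `body`, recursing into callees before moving on to
/// the next instruction. `depth` is an assumed guard against unbounded recursion.
pub fn inline_all(
    body: &[Instr],
    functions: &HashMap<u32, Vec<Instr>>,
    depth: usize,
) -> Result<Vec<Instr>, String> {
    if depth == 0 {
        return Err("inlining depth exceeded".to_string());
    }
    let mut out = Vec::new();
    for instr in body {
        match instr {
            Instr::Op(_) => out.push(instr.clone()),
            Instr::Call(FuncPtr::Known(index)) => {
                let callee = functions
                    .get(index)
                    .ok_or_else(|| format!("unknown function index {}", index))?;
                // Inline the callee (and its own calls) in place of the call.
                out.extend(inline_all(callee, functions, depth - 1)?);
            }
            Instr::Call(FuncPtr::Unknown) => {
                return Err("cannot evaluate function pointer".to_string());
            }
        }
    }
    Ok(out)
}
```

With functions `1 = [a, call 2]` and `2 = [b]`, inlining `[call 1, c]` yields `[a, b, c]`, while a call through `FuncPtr::Unknown` is reported as an error.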

@jfecher
Contributor Author

jfecher commented Dec 2, 2022

Thank you for the review. This PR isn't nearly done though: there are still bugs when calling builtin functions, and first-class functions aren't implemented at all yet (there is no recursive inlining at all, as you mentioned).

I agree that the Function and Builtin NodeObj variants can be merged, and I was considering doing so myself. I think both could be put inside a single Function node, though it would have to contain an enum of either a FuncId or an OPCODE internally, since FuncIds always correspond to a function in the Ast. I do not think representing functions as Fields is a good idea just because Fields are already supported: functions are not fields, and it would be meaningless to support Field operations like addition on functions. Instead we should keep Functions as separate objects to reduce bugs in the future and make the code easier to follow.
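
A minimal sketch of the single Function node described above, assuming simplified stand-in types. `FuncId`, `Opcode`, `FunctionKind`, and this reduced `NodeObj` are hypothetical here and do not mirror the real definitions in crates/noirc_evaluator.

```rust
/// Hypothetical id of a user-defined function from the Ast.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FuncId(pub u32);

/// Hypothetical builtin opcodes (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode {
    Sha256,
    ToBits,
}

/// What a function value refers to: an Ast function or a builtin.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FunctionKind {
    Normal(FuncId),
    Builtin(Opcode),
}

/// Reduced node object: functions are kept distinct from field values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeObj {
    Field(u64),
    Function(FunctionKind),
}

impl NodeObj {
    /// Addition is defined for field values only; adding a function is an error.
    pub fn add(self, rhs: NodeObj) -> Result<NodeObj, String> {
        match (self, rhs) {
            (NodeObj::Field(a), NodeObj::Field(b)) => Ok(NodeObj::Field(a.wrapping_add(b))),
            _ => Err("cannot add function values".to_string()),
        }
    }

    /// True when this node is a builtin function.
    pub fn is_builtin(self) -> bool {
        matches!(self, NodeObj::Function(FunctionKind::Builtin(_)))
    }
}
```

Because a function is its own variant rather than a Field, an accidental `add` on a function value is rejected explicitly instead of producing a meaningless number.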

(Edit: I would like to have this PR as a draft while it is unfinished but I cannot seem to revert it to one without closing the PR and reopening it)

@vezenovm
Contributor

vezenovm commented Jan 5, 2023

Looks like a couple tests are failing now. Once these are passing I think this looks good.

Contributor

@guipublic guipublic left a comment

It is globally ok for me, my main concerns are these 3 points:

  • The frontend is not checking function signatures, so you can call f(x) with f = foo when foo is declared as fn foo() or fn foo(x, y).

  • There is a confusion about the returned arrays, you should remove all the modifications you did about them.

  • You did not implement the recursive inlining, but since it is not needed for basic use cases and because this PR has already been open for too long, we should do it in a separate PR.

@jfecher
Contributor Author

jfecher commented Jan 11, 2023

Addressing the first point:

> The frontend is not checking function signatures so you can call f(x) with f = foo when foo is declared as fn foo(), or fn foo(x,y)

This is incorrect: function types are still checked, and you can verify this by changing argument types, counts, etc. The way they are checked is now slightly different. At a function call we no longer have direct access to the function id; instead we have the function type from the function expression. Consider the call expression (if c { add1 } else { return_some_function() })(5). The exact function being called is unknown, but the type checker still knows it has the type Field -> Field (presumably), and it can use this to typecheck the argument 5 and get the expected return type of Field.
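
As a rough illustration of checking a call through the callee's type rather than its function id, here is a small sketch. The `Type` and `TypeError` enums and the `check_call` function are hypothetical and much simpler than the real noirc_frontend type checker; the argument-count comparison is included as part of what a complete check needs.

```rust
/// Hypothetical, reduced type language.
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Field,
    Bool,
    /// Parameter types and return type, e.g. Field -> Field.
    Function(Vec<Type>, Box<Type>),
}

#[derive(Debug, PartialEq)]
pub enum TypeError {
    NotAFunction(Type),
    ArityMismatch { expected: usize, found: usize },
    ArgMismatch { index: usize, expected: Type, found: Type },
}

/// Checks a call using only the type of the callee expression and returns
/// the type of the call expression on success.
pub fn check_call(callee: &Type, args: &[Type]) -> Result<Type, TypeError> {
    let (params, ret) = match callee {
        Type::Function(params, ret) => (params, ret),
        other => return Err(TypeError::NotAFunction(other.clone())),
    };
    // The number of arguments must match the number of parameters.
    if params.len() != args.len() {
        return Err(TypeError::ArityMismatch { expected: params.len(), found: args.len() });
    }
    // Each argument must match its parameter type.
    for (index, (param, arg)) in params.iter().zip(args).enumerate() {
        if param != arg {
            return Err(TypeError::ArgMismatch {
                index,
                expected: param.clone(),
                found: arg.clone(),
            });
        }
    }
    Ok((**ret).clone())
}
```

For a callee of type Field -> Field, a single Field argument type-checks to Field, whichever concrete function the expression evaluates to at runtime.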

@guipublic
Contributor

> The exact function being called is unknown but the type checker still knows it has the type Field -> Field (presumably), and it can use this to typecheck the argument 5 and get the expected return type of Field.

Yes I agree, so shouldn't we do this?

@jfecher
Contributor Author

jfecher commented Jan 11, 2023

> Yes I agree, so shouldn't we do this?

I am confused what you mean, this PR already does this.

Edit: I believe I found what you're referring to, it looks to be a bug but doesn't seem to be happening in all cases. I'll investigate further. The intention was always to check these so allowing this bug through is not an option

@jfecher
Contributor Author

jfecher commented Jan 11, 2023

Indeed, the if param_len != arg_len check from type_check_function_call went missing when that function was removed. I've re-added it in bind_function.

Contributor Author

@jfecher jfecher left a comment

Most of the comments seem to be based on the returned_arrays handling. I'd like to remove this as well, just ran into difficulty originally. One easy way to remove them would be if we could just create a fresh, empty array at each callsite. Assuming these ids are truly temporary then this could be fine.

Note: I assume these ids are temporary now based on previous talks and code such as:

let mut a1 = get_array();
let a2 = get_array();

which should not both refer to the same array.
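
The idea of creating a fresh array at each call site can be sketched as below. This is a toy model under assumptions: `ArrayId`, `ArrayContext`, and `call_returning_array` are hypothetical names, and the context only records array lengths rather than real SSA state.

```rust
/// Hypothetical array id; the compiler's real id type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ArrayId(pub u32);

/// Minimal stand-in for the SSA context: it only records array lengths.
#[derive(Debug, Default)]
pub struct ArrayContext {
    lengths: Vec<u32>,
}

impl ArrayContext {
    /// Allocates a new, empty array and returns its unique id.
    pub fn new_array(&mut self, len: u32) -> ArrayId {
        let id = ArrayId(self.lengths.len() as u32);
        self.lengths.push(len);
        id
    }

    /// Models a call site whose callee returns an array of `len` elements:
    /// every call site receives its own fresh array.
    pub fn call_returning_array(&mut self, len: u32) -> ArrayId {
        self.new_array(len)
    }

    /// Length recorded for `id`, if it exists.
    pub fn len_of(&self, id: ArrayId) -> Option<u32> {
        self.lengths.get(id.0 as usize).copied()
    }
}
```

Two successive calls then produce distinct ids, so a1 and a2 in the snippet above would never alias.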

@jfecher
Contributor Author

jfecher commented Jan 13, 2023

Reverted the "removal of ArraySetIds" commit, since the large default array size led to a large performance regression, likely due to copying these large but empty arrays later on.

@guipublic
Contributor

> Reverted the "removal of ArraySetIds" commit, since the large default array size led to a large performance regression, likely due to copying these large but empty arrays later on.

I don't understand why you use these big arrays. Starting from your "removal of ArraySetIds" commit, I tweaked the function.rs/call() method and it seems to be fine: I keep the previous behaviour when we know the function; otherwise I create fresh returned_arrays like you did, but with the correct length. (n.b. we could avoid creating fresh arrays by caching the ones we create here and re-using them when the type and length match.)

```rust
// Known function: keep the previous behaviour and emit one Result
// instruction per declared result type.
if let Some(func_id) = self.context.try_get_funcid(func) {
    let rtt = self.context.functions[&func_id].result_types.clone();
    let mut result = Vec::new();
    for (i, typ) in rtt.iter().enumerate() {
        result.push(self.context.new_instruction(
            node::Operation::Result { call_instruction, index: i as u32 },
            *typ,
        )?);
    }
    return Ok(result);
}

// Unknown function: create fresh returned arrays, but with the correct length.
let result_ids = try_vecmap(return_types, |(i, typ)| {
    let result = Operation::Result { call_instruction, index: i as u32 };
    let typ = match typ {
        Type::Array(len, elem_type) => {
            let elem_type = self.context.convert_type(&elem_type);
            let array_id =
                self.context.new_array("", elem_type, len as u32, None).1;
            returned_arrays.push((array_id, i as u32));
            ObjectType::Pointer(array_id)
        }
        other => self.context.convert_type(&other),
    };

    self.context.new_instruction(result, typ)
});
```

@jfecher
Contributor Author

jfecher commented Jan 17, 2023

This PR should be finished now. I did have to re-add returned_arrays, as higher-order functions that returned arrays would be incorrectly tracked otherwise, leading to the final ssa storing to a different array than the one it later loaded from.

Contributor

@guipublic guipublic left a comment

It's fine for me

@jfecher jfecher merged commit 3c3dffb into master Jan 17, 2023
@jfecher jfecher deleted the jf/hof branch January 17, 2023 19:17