Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Yul ASTs output #14177

Merged
merged 1 commit into from
May 26, 2023
Merged

Conversation

GiokaMarkella
Copy link
Contributor

@GiokaMarkella GiokaMarkella commented May 2, 2023

Adds:

  • Command line output option--ast-compact-json with Yul input. This outputs the Yul AST in JSON compact format.
  • Command line output options --ir-ast-json and --ir-optimized-ast-json with Solidity input. Those options will produce the AST of the Yul-IR and the optimized Yul-IR, respectively, in JSON compact format.

Resolves #9590. Although #9590 was grouped with #13720 which appears to have other dependencies, #9590 itself is pretty straightforward and can be implemented similarly to the Solidity ASTs that are already present.

@github-actions
Copy link

github-actions bot commented May 2, 2023

Thank you for your contribution to the Solidity compiler! A team member will follow up shortly.

If you haven't read our contributing guidelines and our review checklist before, please do it now, this makes the reviewing process and accepting your contribution smoother.

If you have any questions or need our help, feel free to post them in the PR or talk to us directly on the #solidity-dev channel on Matrix.

@ekpyron
Copy link
Member

ekpyron commented May 3, 2023

For reference, since we talked about this in our call earlier today:
By "standard json" I mean this: https://docs.soliditylang.org/en/v0.8.19/using-the-compiler.html#compiler-input-and-output-json-description, the entry point in the code is https://github.com/ethereum/solidity/blob/develop/libsolidity/interface/StandardCompiler.cpp

Via command line this input/output mode is enabled via --standard-json.

We can, for example, add ast.ir as a possible field to the file-level output selections when compiling Solidity - and can just output the Yul AST when requesting ast on Yul input.

@ekpyron
Copy link
Member

ekpyron commented May 4, 2023

One more thing I forgot to consider yesterday:
When compiling Solidity to IR, we equip the resulting Yul code with special comments like

/// @src 1:0:14  "contract C {..."

which encode the source locations of the original Solidity file.

Furthermore, we will eventually also want to import the Yul ASTs back with the goal to have identical compiler outputs after exporting the Yul AST and then importing it back compared to a single-step compilation. Similarly, compiling to Yul first and then continuing compilation from Yul is supposed to yield identical compiler output as one-step compilation.

This PR doesn't need to address any of this, we can take care of that in subsequent steps, but it has implications for the source locations in the Yul AST:

If you start from Yul code that has comments like the ones the compiler produces when compiling solidity to IR code, the compiler should already pick up on those comments and parse the source locations - in that case those locations should also show up in the src fields of the Yul AST, just as if we had compiled from Solidity. nativeSrc could contain the Yul source locations in this case.

Now in the absence of such comments, i.e. when compiling standalone Yul, it's probably most consistent to still have both fields, src and nativeSrc, but just store identical values in then, i.e. the source locations in Yul.

In this model, we can actually even make it originSrc (instead of src) and nativeSrc as you had suggested - and just have pure Yul sources produce identical locations for both, while compiler-generated IR-Yul sources should get the contents of the @src comments (as should already be parsed by the compiler) as originSrc.

@GiokaMarkella
Copy link
Contributor Author

One more thing I forgot to consider yesterday: When compiling Solidity to IR, we equip the resulting Yul code with special comments like

/// @src 1:0:14  "contract C {..."

which encode the source locations of the original Solidity file.

Furthermore, we will eventually also want to import the Yul ASTs back with the goal to have identical compiler outputs after exporting the Yul AST and then importing it back compared to a single-step compilation. Similarly, compiling to Yul first and then continuing compilation from Yul is supposed to yield identical compiler output as one-step compilation.

This PR doesn't need to address any of this, we can take care of that in subsequent steps, but it has implications for the source locations in the Yul AST:

If you start from Yul code that has comments like the ones the compiler produces when compiling solidity to IR code, the compiler should already pick up on those comments and parse the source locations - in that case those locations should also show up in the src fields of the Yul AST, just as if we had compiled from Solidity. nativeSrc could contain the Yul source locations in this case.

Now in the absence of such comments, i.e. when compiling standalone Yul, it's probably most consistent to still have both fields, src and nativeSrc, but just store identical values in then, i.e. the source locations in Yul.

In this model, we can actually even make it originSrc (instead of src) and nativeSrc as you had suggested - and just have pure Yul sources produce identical locations for both, while compiler-generated IR-Yul sources should get the contents of the @src comments (as should already be parsed by the compiler) as originSrc.

From what I understand, origin location and native location already behave as you describe (meaning e.g. that native and origin are the same in the case of standalone Yul input). So, would it suffice if there are two fields, namely src and nativeSrc that would store originSource and nativeSource, respectively?

@ekpyron
Copy link
Member

ekpyron commented May 4, 2023

One more thing I forgot to consider yesterday: When compiling Solidity to IR, we equip the resulting Yul code with special comments like

/// @src 1:0:14  "contract C {..."

which encode the source locations of the original Solidity file.
Furthermore, we will eventually also want to import the Yul ASTs back with the goal to have identical compiler outputs after exporting the Yul AST and then importing it back compared to a single-step compilation. Similarly, compiling to Yul first and then continuing compilation from Yul is supposed to yield identical compiler output as one-step compilation.
This PR doesn't need to address any of this, we can take care of that in subsequent steps, but it has implications for the source locations in the Yul AST:
If you start from Yul code that has comments like the ones the compiler produces when compiling solidity to IR code, the compiler should already pick up on those comments and parse the source locations - in that case those locations should also show up in the src fields of the Yul AST, just as if we had compiled from Solidity. nativeSrc could contain the Yul source locations in this case.
Now in the absence of such comments, i.e. when compiling standalone Yul, it's probably most consistent to still have both fields, src and nativeSrc, but just store identical values in then, i.e. the source locations in Yul.
In this model, we can actually even make it originSrc (instead of src) and nativeSrc as you had suggested - and just have pure Yul sources produce identical locations for both, while compiler-generated IR-Yul sources should get the contents of the @src comments (as should already be parsed by the compiler) as originSrc.

From what I understand, origin location and native location already behave as you describe (meaning e.g. that native and origin are the same in the case of standalone Yul input). So, would it suffice if there are two fields, namely src and nativeSrc that would store originSource and nativeSource, respectively?

Yes, that should work!

@ekpyron
Copy link
Member

ekpyron commented May 4, 2023

FYI you can ignore the b_archlinux CI failures here on github - that's unrelated to the PR and we're working on fixing it separately.

Copy link
Member

@r0qs r0qs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @GiokaMarkella for your contribution. I left some comments ;)

Changelog.md Outdated Show resolved Hide resolved
solc/CommandLineInterface.h Outdated Show resolved Hide resolved
libyul/AsmJsonConverter.cpp Outdated Show resolved Hide resolved
solc/CommandLineParser.cpp Outdated Show resolved Hide resolved
@GiokaMarkella GiokaMarkella requested a review from r0qs May 8, 2023 10:52
Changelog.md Outdated Show resolved Hide resolved
libsolidity/interface/StandardCompiler.cpp Outdated Show resolved Hide resolved
libyul/AsmJsonConverter.cpp Outdated Show resolved Hide resolved
libyul/AsmJsonConverter.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@nikola-matic nikola-matic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the changelog, looks good, and is ready to be merged soon. We're gonna decide on the actual elements names exposed in the Standard JSON interface, and once we've got that sorted, we'll get back to you. The commits will also need to be squashed by the way.

Changelog.md Outdated Show resolved Hide resolved
Changelog.md Outdated Show resolved Hide resolved
Changelog.md Outdated Show resolved Hide resolved
Changelog.md Outdated Show resolved Hide resolved
libsolidity/interface/CompilerStack.cpp Outdated Show resolved Hide resolved
r0qs
r0qs previously requested changes May 15, 2023
Changelog.md Outdated Show resolved Hide resolved
@ekpyron
Copy link
Member

ekpyron commented May 15, 2023

We just discussed this internally a bit further: we want the output artifacts in standard json to be named irAst and irOptimizedAst without the Json suffix - apart from that the design in the PR works out!

Do you know when you'll be available for rebasing the PR, making this change and fixing the remaining style comments?

@cameel
Copy link
Member

cameel commented May 15, 2023

@ekpyron Actually, we could use the ast.ir and ast.irOptimized structure you proposed initially on the call after all. You said that the blocker was that already have ast in there with the Solidity AST, but looks like that's not the case. In current output structure that sits under sources so we're free to make ast a dict under contracts.

BTW, just noticed that the output docs do not show the irOptimized key. We should fix that.

@ekpyron
Copy link
Member

ekpyron commented May 15, 2023

@ekpyron Actually, we could use the ast.ir and ast.irOptimized structure you proposed initially on the call after all. You said that the blocker was that already have ast in there with the Solidity AST, but looks like that's not the case. In current output structure that sits under sources so we're free to make ast a dict under contracts.

BTW, just noticed that the output docs do not show the irOptimized key. We should fix that.

I actually meant ir.ast and hadn't considered ast.ir :-). But reusing and repurposing this on different levels is more confusing than nice, I'd argue, so let's just keep it at irAst and irOptimizedAst.

@GiokaMarkella
Copy link
Contributor Author

We just discussed this internally a bit further: we want the output artifacts in standard json to be named irAst and irOptimizedAst without the Json suffix - apart from that the design in the PR works out!

Do you know when you'll be available for rebasing the PR, making this change and fixing the remaining style comments?

I'll try to keep track of all the comments and make the changes today! :)

libsolidity/ast/ASTJsonExporter.cpp Outdated Show resolved Hide resolved
libsolidity/interface/StandardCompiler.cpp Outdated Show resolved Hide resolved
libyul/Object.cpp Outdated Show resolved Hide resolved
libyul/Object.h Show resolved Hide resolved
solc/CommandLineInterface.cpp Outdated Show resolved Hide resolved
solc/CommandLineParser.cpp Outdated Show resolved Hide resolved
solc/CommandLineInterface.cpp Outdated Show resolved Hide resolved
test/solc/CommandLineParser.cpp Outdated Show resolved Hide resolved
Changelog.md Outdated Show resolved Hide resolved
@cameel
Copy link
Member

cameel commented May 15, 2023

I actually meant ir.ast and hadn't considered ast.ir :-). But reusing and repurposing this on different levels is more confusing than nice, I'd argue, so let's just keep it at irAst and irOptimizedAst.

Fair enough, though in that case I think we might still be better off with something like astIr and astIrOptimized just to emphasize that the output is the AST (not IR).

docs/using-the-compiler.rst Outdated Show resolved Hide resolved
docs/using-the-compiler.rst Outdated Show resolved Hide resolved
docs/using-the-compiler.rst Outdated Show resolved Hide resolved
libsolidity/interface/StandardCompiler.cpp Outdated Show resolved Hide resolved
Changelog.md Outdated Show resolved Hide resolved
Changelog.md Outdated Show resolved Hide resolved
@ekpyron
Copy link
Member

ekpyron commented May 15, 2023

I actually meant ir.ast and hadn't considered ast.ir :-). But reusing and repurposing this on different levels is more confusing than nice, I'd argue, so let's just keep it at irAst and irOptimizedAst.

Fair enough, though in that case I think we might still be better off with something like astIr and astIrOptimized just to emphasize that the output is the AST (not IR).

Let's keep it at irAst and irOptimizedAst.

Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good now. Great work!

I did find some small things that could be improved but I don't want to drop another wall of comments so I just corrected them as I went through the review - see my fixups. Let me know if there's anything there that you don't like or is wrong. In any case, there was nothing that would require any bigger changes from your side.

So now we're almost done here. The last thing remaining is to squash the commits to keep history clean. Then I can approve. Actually, if you don't mind me squashing it all into a single commit, I can just do it on my end while merging, so just let me know.

@cameel
Copy link
Member

cameel commented May 25, 2023

By the way, don't worry too much about the failing ext tests, that's unrelated to your PR and we have a fix in #14263. It should be merged soon and then rebasing your PR on develop should make these tests pass.

@GiokaMarkella
Copy link
Contributor Author

GiokaMarkella commented May 25, 2023

The PR looks good now. Great work!

I did find some small things that could be improved but I don't want to drop another wall of comments so I just corrected them as I went through the review - see my fixups. Let me know if there's anything there that you don't like or is wrong. In any case, there was nothing that would require any bigger changes from your side.

So now we're almost done here. The last thing remaining is to squash the commits to keep history clean. Then I can approve. Actually, if you don't mind me squashing it all into a single commit, I can just do it on my end while merging, so just let me know.

Awesome, thanks! Sure, if you could do that, that'd be great!
Edit: Also had a quick look over the fixups and looks fine to me

Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok done. #14263 is also in develop now so all tests should now pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Output Yul ASTs
5 participants