Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(forge fmt): sorting of imports adds newlines in groups #7944

Open
2 tasks done
beeb opened this issue May 18, 2024 · 7 comments
Open
2 tasks done

feat(forge fmt): sorting of imports adds newlines in groups #7944

beeb opened this issue May 18, 2024 · 7 comments
Labels
C-forge Command: forge Cmd-forge-fmt Command: forge fmt T-bug Type: bug

Comments

@beeb
Copy link
Contributor

beeb commented May 18, 2024

Component

Forge

Have you ensured that all of these are up to date?

  • Foundry
  • Foundryup

What version of Foundry are you on?

master

What command(s) is the bug in?

forge fmt

Operating System

Linux

Describe the bug

Imports sorting sometimes adds empty lines in the middle of a group. Repro:

import { ERC20 } from "@openzeppelin/contracts/token/ERC20/ERC20.sol";
import { ERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Permit.sol";
import { ERC20Burnable } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Burnable.sol";
import { IERC20 } from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import { IERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/IERC20Permit.sol";
import { AccessControl } from "@openzeppelin/contracts/access/AccessControl.sol"`;

Output:

import { AccessControl } from "@openzeppelin/contracts/access/AccessControl.sol";
import { ERC20 } from "@openzeppelin/contracts/token/ERC20/ERC20.sol";
import { IERC20 } from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import { ERC20Burnable } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Burnable.sol";
import { ERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Permit.sol";

import { IERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/IERC20Permit.sol";

As you can see, there is an empty line before the last element.

I dug a bit into the formatter code and I noticed that the Loc::File offsets are not updated when re-ordering the SourceUnitParts in:

import_directives.sort_by_cached_key(|item| match item {
SourceUnitPart::ImportDirective(import) => match import {
Import::Plain(path, _) => path.to_string(),
Import::GlobalSymbol(path, _, _) => path.to_string(),
Import::Rename(path, _, _) => path.to_string(),
},
_ => {
unreachable!("import group contains non-import statement")
}
});

Since the rest of the formatter makes heavy use of those locations, I assumed they are important.

I attempted a dirty fix by re-writing the Loc start and end offsets according to the new ordering and that seems to fix the bug described above, but my dirty fix does not take into account the spacing between the various SourceUnitParts (due to semicolon and line break, or a potential comment on the same line).

// Before sorting
let group_start_loc = import_directives[0].loc();
// After sorting
let mut offset = group_start_loc.start();
for source_unit_part in import_directives.iter_mut() {
    let loc = source_unit_part.loc();
    let len = loc.end() - loc.start();
    match source_unit_part {
        SourceUnitPart::ImportDirective(import) => match import {
            Import::Plain(_, loc) => {
                *loc = loc.with_start(offset).with_end(offset + len);
            }
            Import::GlobalSymbol(_, _, loc) => {
                *loc = loc.with_start(offset).with_end(offset + len);
            }
            Import::Rename(_, _, loc) => {
                *loc = loc.with_start(offset).with_end(offset + len);
            }
        },
        _ => {
            unreachable!("import group contains non-import statement")
        }
    }
    offset += len;
}

Maybe someone smarter can create a better fix? Also, I haven't even attempted to update the location of the renames in the Import::Rename variant, but that should also probably be done?

@beeb beeb added the T-bug Type: bug label May 18, 2024
@beeb
Copy link
Contributor Author

beeb commented May 18, 2024

Pinging @meetmangukiya who authored #5442 and @mattsse who contributed too

@beeb
Copy link
Contributor Author

beeb commented May 18, 2024

Another consequence of this is that any comment at the end of an import statement line (not on a separate line) is moved to a random position after formatting. (e.g. in my case on the pragma line a few lines above).

Original:

// SPDX-License-Identifier: MIT
pragma solidity 0.8.20;

import { ERC20 } from "@openzeppelin/contracts/token/ERC20/ERC20.sol";
import { ERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Permit.sol";
import { ERC20Burnable } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Burnable.sol"; // test
import { IERC20 } from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import { IERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/IERC20Permit.sol";
import { AccessControl } from "@openzeppelin/contracts/access/AccessControl.sol";

Formatted:

// SPDX-License-Identifier: MIT
pragma solidity 0.8.20; // test

import { AccessControl } from "@openzeppelin/contracts/access/AccessControl.sol";
import { ERC20 } from "@openzeppelin/contracts/token/ERC20/ERC20.sol";
import { IERC20 } from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import { ERC20Burnable } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Burnable.sol";
import { ERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/ERC20Permit.sol";

import { IERC20Permit } from "@openzeppelin/contracts/token/ERC20/extensions/IERC20Permit.sol";

@beeb
Copy link
Contributor Author

beeb commented Jun 21, 2024

I took a bit of time to look further into this, and I think it's really not the best idea to do the sorting at the CST level. The formatter relies on offsets into the original source to function, and reordering the nodes, even if we adjust the offsets (which is a pain to do), would still leave the CST divergent from the source.

What about performing the sorting at the string level before it gets parsed by solang_parser? It would require some refactoring of the interface to pass it an owned string, or maybe a Cow, in the hope of avoiding the clone if possible.

Also, since the formatter preferences are not currently passed to the parse function, this would also require some refactoring.

I am thinking about making a very simple parser with winnow or chumsky that would identify import groups, create a very simple AST for them, sort the imports and then crudely format them back into a string (without worrying about the formatting rules) to be replaced into the original source before it gets passed to the solang parser.

To avoid reallocating maybe we could even try to get a sorted output that is smaller or equal to the length of the original imports group, padding it to its original length with whitespace if necessary, so that the rest of the string doesn't need to move. Alternatively a string rope (with crop) could be used but that feels like a bit much.

@mattsse
Copy link
Member

mattsse commented Jun 21, 2024

What about performing the sorting at the string level before it gets parsed by solang_parser?

yeah perhaps we'd need two passes here, first sort, then parse again, maybe

@beeb
Copy link
Contributor Author

beeb commented Jun 21, 2024

Brilliant if you agree @mattsse ! I could work on this over the next few days, maybe it will take a few days to a couple of weeks. I was thinking to use chumsky because it has built-in Spans which would be useful to know the length of the imports group.

I refactored the inputs to the parser to be a impl Into<Cow<'_, str>> already and that was easy enough, so we can skip cloning the source if sorting is disabled. I noticed there is already some cloning happening anyway, for diagnostics and errors and such, but if we can avoid having one more copy that'd be great.

@mattsse
Copy link
Member

mattsse commented Jun 21, 2024

please don't waste your time writing yet another solidity parser :)

cc @DaniPopes

I assume we could fix this particular issue with a "hacky" preprocessing step that does sorting first or smarter loc updates

@beeb
Copy link
Contributor Author

beeb commented Jun 21, 2024

please don't waste your time writing yet another solidity parser :)

cc @DaniPopes

I assume we could fix this particular issue with a "hacky" preprocessing step that does sorting first or smarter loc updates

Right, I was only saying this because for reordering imports it feels wasteful to parse the full language, and also it's much easier to work with an AST.

@foundry-rs foundry-rs deleted a comment from ArkFunds Jul 15, 2024
@zerosnacks zerosnacks added C-forge Command: forge Cmd-forge-fmt Command: forge fmt labels Jul 15, 2024
@zerosnacks zerosnacks changed the title Imports sorting adds newlines in groups feat(forge fmt): sorting of imports adds newlines in groups Jul 15, 2024
@zerosnacks zerosnacks modified the milestone: v1.0.0 Jul 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-forge Command: forge Cmd-forge-fmt Command: forge fmt T-bug Type: bug
Projects
None yet
Development

No branches or pull requests

3 participants