Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add formatting infrastructure to the project #2505

Merged
merged 3 commits into from
Aug 20, 2024
Merged

Conversation

danlapid
Copy link
Contributor

@danlapid danlapid commented Aug 9, 2024

Fixes #2507

Add infrastructure to format c++ files with clang-format.
Added:

  1. clang-format file matching internal one
  2. formatting tool based on python for cross-platform
  3. pre-commit and pre-push hooks for formatting validation
  4. Lint CI action.

In terms of CR workflow here's what I was thinking:
Reviewing the entire CR is going to be difficult which is why it's split into 3 distinct commits:

  1. Establish the framework for reformatting.
  2. Plain format.py run
  3. add .git-blame-ignore-revs file with format commit's hash.

Seeing as how this CR will probably take a while I will not try to resolve conflicts during the CR. Instead whenever we finish the CR I will revert to

  1. commit 1
  2. rebase from main
  3. apply format
  4. add .git-blame-ignore-revs
  5. merge without further CR

At the very least should have @jp4a50 approval before merge.

@danlapid danlapid force-pushed the dlapid/add-cpp-formatting branch 2 times, most recently from 313d830 to 932daad Compare August 9, 2024 16:49
@danlapid danlapid marked this pull request as ready for review August 9, 2024 16:57
@danlapid danlapid requested review from a team as code owners August 9, 2024 16:57
githooks/pre-commit Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
@fhanau
Copy link
Collaborator

fhanau commented Aug 9, 2024

Thank you for this!

On updating LLVM: Leaving it at 15 for now has been a conscious choice: My philosophy has been that it should be possible to build workerd using the LLVM version available in the default APT repo in the latest stable release of Ubuntu and Debian, so that developers can build workerd without having to build LLVM themselves or install it from an alternative repository like https://apt.llvm.org/ which is used in the CI build. For Debian (bookworm) that's currently 16 (also available on the earlier Debian Bullseye), for Ubuntu 22.04 it is LLVM 15, the recently introduced Ubuntu 24.04 supports LLVM 18. If we upgrade the version we use to build on CI to 18, we can easily introduce changes that break the build with lower versions.
With the V8 12.8 update (#2441, should land pretty soon), we will have to update this to LLVM 16 anyway, which may introduce some headaches for internal developers that are on Ubuntu 22.04-based systems.

What we can do is to require LLVM 18 and document that installing it may require additional developer actions, or we can update the version we use on CI to 18, but leave the documented minimum version as is and just state that LLVM18 is recommended and add a CI job that uses LLVM16 so that it doesn't break. I would prefer to address compiler upgrades separately from this PR TBH, for now we can just say that clang-format-18 is required for linting.

@danlapid
Copy link
Contributor Author

danlapid commented Aug 9, 2024

Thank you for this!

On updating LLVM: Leaving it at 15 for now has been a conscious choice: My philosophy has been that it should be possible to build workerd using the LLVM version available in the default APT repo in the latest stable release of Ubuntu and Debian, so that developers can build workerd without having to build LLVM themselves or install it from an alternative repository like apt.llvm.org which is used in the CI build. For Debian (bookworm) that's currently 16 (also available on the earlier Debian Bullseye), for Ubuntu 22.04 it is LLVM 15, the recently introduced Ubuntu 24.04 supports LLVM 18. If we upgrade the version we use to build on CI to 18, we can easily introduce changes that break the build with lower versions. With the V8 12.8 update (#2441, should land pretty soon), we will have to update this to LLVM 16 anyway, which may introduce some headaches for internal developers that are on Ubuntu 22.04-based systems.

What we can do is to require LLVM 18 and document that installing it may require additional developer actions, or we can update the version we use on CI to 18, but leave the documented minimum version as is and just state that LLVM18 is recommended and add a CI job that uses LLVM16 so that it doesn't break. I would prefer to address compiler upgrades separately from this PR TBH, for now we can just say that clang-format-18 is required for linting.

Yeah sure, I thought it'd be nice to upgrade our llvm version but I don't mind reverting this change and just requiring clang-format-18 for format

@danlapid danlapid changed the title Add formatting infrastructure to the project and upgrade LLVM. Add formatting infrastructure to the project Aug 9, 2024
@danlapid danlapid force-pushed the dlapid/add-cpp-formatting branch 2 times, most recently from 5cd2027 to 4569c12 Compare August 9, 2024 18:51
@danlapid danlapid requested a review from jp4a50 August 9, 2024 18:56
src/workerd/jsg/function.h Outdated Show resolved Hide resolved
.github/workflows/lint.yml Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
@fhanau
Copy link
Collaborator

fhanau commented Aug 11, 2024

Unfortunately format checking fails for me with Configuration file(s) do(es) not support Objective-C: [...]/workerd/.clang-format (macOS, clang-format 18.1.8 installed via homebrew). Looks like clang-format is mistaking some files to be Objective-C instead of C++?

@danlapid
Copy link
Contributor Author

danlapid commented Aug 11, 2024

Unfortunately format checking fails for me with Configuration file(s) do(es) not support Objective-C: [...]/workerd/.clang-format (macOS, clang-format 18.1.8 installed via homebrew). Looks like clang-format is mistaking some files to be Objective-C instead of C++?

Yea that warning is printed for about 5 files where we have embedded typescript in macros but macro expansion happens after syntax parsing and language recognition in clang-format so it gets confused.
It's not an error but a warning.
We can make it go away by marking objective c files as "don't format" but there isn't any easy fix to make clang format be able to format these files.
It would require quite a massive change in the way it works.
We can revisit it later but I don't think it should block formatting the other hundreds of files we have.

@danlapid danlapid force-pushed the dlapid/add-cpp-formatting branch 5 times, most recently from 2e4abf4 to 19de564 Compare August 12, 2024 10:23
@fhanau
Copy link
Collaborator

fhanau commented Aug 12, 2024

Sadly it seems that clang-format-18 spews out different results in my updated linux environment on macOS and on github actions.

That's unfortunate – does that only affect a few files or is it a broad issue? Perhaps we can manage this by manually excluding a few files until 19 comes out, or add

// clang-format off
// clang-format on

sections where appropriate.

@danlapid
Copy link
Contributor Author

Sadly it seems that clang-format-18 spews out different results in my updated linux environment on macOS and on github actions.

That's unfortunate – does that only affect a few files or is it a broad issue? Perhaps we can manage this by manually excluding a few files until 19 comes out, or add

// clang-format off
// clang-format on

sections where appropriate.

Got it sorted with clang-format-18, I edited the comment above.

@fhanau
Copy link
Collaborator

fhanau commented Aug 12, 2024

Got it sorted with clang-format-18, I edited the comment above.

Running the script works on macOS now. Unfortunately, it takes 16s to finish on M1 after freezing on src/workerd/jsg/iterator.h. This would slow down the commit process – when excluding that file this goes down to 4s which is still quite some overhead but perhaps acceptable. Maybe we can manually exclude that file?

@danlapid
Copy link
Contributor Author

danlapid commented Aug 12, 2024

overhead but perhaps acceptable. Maybe

The pre-commit hook only runs on staged files, it shouldn't slow down commits at all unless you editted that specific file.
If that were a problem I would probably multithread the format.py script but on a regular commit with just a few c++ files the format is almost instant.

tools/cross/format.py Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
tools/cross/format.py Outdated Show resolved Hide resolved
Copy link
Contributor

@hoodmane hoodmane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @danlapid, I made some minor comments.

tools/unix/apply-big-move.sh Show resolved Hide resolved
tools/unix/find-python3.sh Outdated Show resolved Hide resolved
@jp4a50
Copy link
Collaborator

jp4a50 commented Aug 20, 2024

One thing that I did when I did the internal formatting was to go through the diff and find cases where clang-format made the code much less readable so I could selectively turn off clang-format for certain lines. I won't block on this because it was incredibly painstaking but if you fancy killing about half a day with some monotonous work then it would be an added bonus 😅

Copy link
Collaborator

@jp4a50 jp4a50 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GG

.clang-format Show resolved Hide resolved

source "$(dirname -- $BASH_SOURCE)/find-python3.sh"
PYTHON_PATH=$(get_python3)
$PYTHON_PATH "$(dirname -- $BASH_SOURCE)/../cross/format.py"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW this is a slightly different approach taken than the internal big move script in that you are running against only those files changed wrt. the HEAD commit rather than all files which should be formatted.

The intention of the original approach was that you can use this script to actually apply the big move in the first place by running apply-big-move.sh --apply so you can be sure you are doing the same (idempotent) thing when applying the big move in the first place and then applying it to each branch's commit.

That being said, assuming you run clang-format on (at least) the set of files that the format.py script might format, this should still work fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the format.py script is the only way I used to clang-format so it should match :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, I just realized that you actually changed the semantics of the original script when porting it - you added the --git option to the script when porting it. This makes a lot more sense now.

#
# This ensures that if further changes are made to this file before the Big Move actually happens,
# then the rebase command that rebases to the commit before the Big Move won't confuse bash.
main "$@"; exit 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed over this, assuming that everything below apply_big_move() is unchanged. LMK if not.

Add infrastructure to format c++ files with clang-format.
We require clang-format-18 because some of the clang-format options
we're using have only been released with clang-format 18.
General project llvm version required remains unchanged.
Added:
1. clang-format file matching internal one
2. formatting tool based on python for cross-platform
3. pre-commit and pre-push hooks for formatting validation
4. Lint CI action.
@danlapid danlapid merged commit 1c54402 into main Aug 20, 2024
10 checks passed
@danlapid danlapid deleted the dlapid/add-cpp-formatting branch August 20, 2024 22:51
@hoodmane
Copy link
Contributor

Thanks @danlapid!

@hoodmane
Copy link
Contributor

hoodmane commented Aug 21, 2024

I'm having some trouble with apply-big-move:

  1. I called it via ./tools/unix/apply-big-move.sh and it fails to find itself on the line exec ./apply-big-move.sh. I tried cding into tools/unix which got me a bit further.
  2. have to install clang-format-18 and ensure that there is a symlink from clang-format to clang-format-18 on your path. It checks whether clang-format --version is 18 after starting a rebase and doesn't clean it up when it quits, so you have to git reset --hard my-branch first.
  3. Then I see:
    Fetching tags...
    Rewriting commit 63b43725: Add formatting infrastructure to the project.
    Updated 1 path from 45fe9bd9
    Applying format...
    Changes applied successfully.
    [detached HEAD ee5ebcdf5] Add formatting infrastructure to the project.
     Author: Dan Lapid <[email protected]>
     Date: Fri Aug 9 15:27:49 2024 +0100
     1 file changed, 0 insertions(+), 0 deletions(-)
     mode change 100644 => 100755 tools/unix/apply-big-move.sh
    Rewriting commit 37bed02d: Add Python development compatibility flag for tests
    Updated 0 paths from e34e2818
    Applying format...
    Changes applied successfully.
    fatal: empty string is not a valid pathspec. please use . instead if you meant to match all paths
    
    And the latest commit is d916b2af Add formatting infrastructure to the project. Maybe invoking from within the tools/unix directory isn't good?
  4. I switched to bash tools/unix/apply-big-move.sh which gets further and actually formats a bunch of stuff. This fails on FileNotFoundError: [Errno 2] No such file or directory: 'node_modules/.bin/prettier'
  5. Run npm i prettier and try again.
  6. Now it does a lot more stuff but still fails with
    fatal: empty string is not a valid pathspec. please use . instead if you meant to match all paths
    
    But now HEAD is detached on a commit that makes my changes on top of d916b2af! So all I have to do is manually update my original branch with e.g., git switch -C orig-branch and I'm good to go.

@danlapid
Copy link
Contributor Author

danlapid commented Aug 21, 2024

I'm having some trouble with apply-big-move:

  1. I called it via ./tools/unix/apply-big-move.sh and it fails to find itself on the line exec ./apply-big-move.sh. I tried cding into tools/unix which got me a bit further.

  2. have to install clang-format-18 and ensure that there is a symlink from clang-format to clang-format-18 on your path. It checks whether clang-format --version is 18 after starting a rebase and doesn't clean it up when it quits, so you have to git reset --hard my-branch first.

  3. Then I see:

    Fetching tags...
    Rewriting commit 63b43725: Add formatting infrastructure to the project.
    Updated 1 path from 45fe9bd9
    Applying format...
    Changes applied successfully.
    [detached HEAD ee5ebcdf5] Add formatting infrastructure to the project.
     Author: Dan Lapid <[email protected]>
     Date: Fri Aug 9 15:27:49 2024 +0100
     1 file changed, 0 insertions(+), 0 deletions(-)
     mode change 100644 => 100755 tools/unix/apply-big-move.sh
    Rewriting commit 37bed02d: Add Python development compatibility flag for tests
    Updated 0 paths from e34e2818
    Applying format...
    Changes applied successfully.
    fatal: empty string is not a valid pathspec. please use . instead if you meant to match all paths
    

    And the latest commit is d916b2af Add formatting infrastructure to the project. Maybe invoking from within the tools/unix directory isn't good?

  4. I switched to bash tools/unix/apply-big-move.sh which gets further and actually formats a bunch of stuff. This fails on FileNotFoundError: [Errno 2] No such file or directory: 'node_modules/.bin/prettier'

  5. Run npm i prettier and try again.

  6. Now it does a lot more stuff but still fails with

    fatal: empty string is not a valid pathspec. please use . instead if you meant to match all paths
    

    But now HEAD is detached on a commit that makes my changes on top of d916b2af! So all I have to do is manually update my original branch with e.g., git switch -C orig-branch and I'm good to go.

For anyone else coming into this try:

  1. install clang-format-18
  2. sudo npm install -g pnpm@latest
  3. pnpm install
  4. git rebase --onto before-big-move-1
  5. ./tools/unix/apply-big-move.sh

if anything goes wrong you can
git reset --hard origin/<original_branch_name>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Code formatting: C++
7 participants