Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rules_go improvement to externalize the nogo fix #4102

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

peng3141
Copy link

@peng3141 peng3141 commented Sep 12, 2024

What type of PR is this?

Feature

What does this PR do? Why is it needed?

  • Problem
    nogo linters may produce fixes as analysis.Diagnostic.
    However, the fixes are not externalized properly.

  • Solution
    In this PR, we externalize the fixes if any by (1) declaring a fix path for each build target, (2) propagates the fix to the output group accessible externally.

  • Note
    Also note that, during the externalization, we need to get rid of fileset since it is transient, instead, we use per-file offset.

  • Test
    Tested below (some info hidden for privacy)
    bazel build --output_groups=nogo_fix --norun_validations //... --override_repository=io_bazel_rules_go=~/Uber/rules_go
    Target //src/code.uber.internal/infra/progsys/renovate/change-exporter/subjecttargets:uselessloggerwith_example up-to-date:
    bazel-bin/src/.../uselessloggerwith_example.nogo.fix

Other notes for review

Copy link

google-cla bot commented Sep 12, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@peng3141 peng3141 marked this pull request as draft September 12, 2024 02:07
@peng3141 peng3141 changed the title [draft][do not review] hack to get nogo fix out of bazel sandbox rules_go improvement to externalize the nogo fix Sep 24, 2024
@peng3141 peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 9 times, most recently from f506b28 to c079a7f Compare September 25, 2024 15:09
@peng3141 peng3141 marked this pull request as ready for review September 25, 2024 15:33
@peng3141 peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 2 times, most recently from 3b49ecb to b57424d Compare September 25, 2024 16:54
@linzhp linzhp requested a review from fmeum September 29, 2024 04:51
go/private/actions/compilepkg.bzl Show resolved Hide resolved
go/private/actions/compilepkg.bzl Outdated Show resolved Hide resolved
go/tools/builders/nogo.go Outdated Show resolved Hide resolved
go/tools/builders/nogo.go Outdated Show resolved Hide resolved
go/tools/builders/nogo.go Outdated Show resolved Hide resolved
go/tools/builders/nogo.go Outdated Show resolved Hide resolved
go/private/rules/test.bzl Outdated Show resolved Hide resolved
go/tools/builders/nogo_change.go Show resolved Hide resolved
@peng3141
Copy link
Author

peng3141 commented Oct 1, 2024

thanks for the comments, i am in the process of addressing them.

@peng3141 peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 2 times, most recently from 4f41cce to f6bab4d Compare October 1, 2024 21:54
@peng3141
Copy link
Author

peng3141 commented Oct 1, 2024

this shows the log of the latest version:
https://gist.github.com/peng3141/bdaefac333434cf2ecbef4edfd8d0200

@peng3141 peng3141 force-pushed the rules_go_hack_for_dumping_fix branch 3 times, most recently from 47d7bf7 to ee5bf11 Compare October 2, 2024 15:23
@peng3141
Copy link
Author

peng3141 commented Oct 3, 2024

@fmeum @linzhp the PR is ready for review. Could you take a look? thanks!

could you focus on the high-level design, once it is agreed upon, we can have one pass to address the readability nits.

Copy link
Contributor

@linzhp linzhp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only half way through, posting some comments so far. I will continue later this week

go/tools/builders/nogo_validation.go Show resolved Hide resolved
// Otherwise, bazel will complain "not all outputs were created or valid"
change, err := NewChangeFromDiagnostics(diagnostics, pkg.fset)
if err != nil {
errs = append(errs, fmt.Errorf("errors in dumping nogo fix, specifically in converting diagnostics to change %v", err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are errors here, does it make sense to call ToPatches and SavePatchesToFile below?

Copy link
Author

@peng3141 peng3141 Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that was by design, although we may need to discuss about design choices here.

Btw, we need to write to empty string to nogoFixPath when there are errors. See nogo.go (line 100) for similar logic. Otherwise, bazel complains that the declared file is not defined.

Coming back to the design choices, here is another design in comparison:

if nogoFixPath != "" {
	// If nogo fixes are requested, save the fixes to the file even if they are empty.
	// This prevents Bazel from complaining about missing or invalid outputs.
	change, err := NewChangeFromDiagnostics(diagnostics, pkg.fset)
	if err != nil {
		// Ensure an empty patch file is saved when there's an error in generating the change.
		errs = append(errs, fmt.Errorf("error converting diagnostics to change: %v", err))
		if saveErr := SavePatchesToFile(nogoFixPath, nil); saveErr != nil {
			errs = append(errs, fmt.Errorf("error saving empty patches file: %v", saveErr))
		}
	} else {
		fileToPatch, err := ToPatches(Flatten(*change))
		if err != nil {
			errs = append(errs, fmt.Errorf("error generating patches: %v", err))
			if saveErr := SavePatchesToFile(nogoFixPath, nil); saveErr != nil {
				errs = append(errs, fmt.Errorf("error saving empty patches file: %v", saveErr))
			}
		} else {
			if err := SavePatchesToFile(nogoFixPath, fileToPatch); err != nil {
				errs = append(errs, fmt.Errorf("error saving patches to file: %v", err))
			}
		}
	}
}

In my current design, when NewChangeFromDiagnostics returns error, the change has partial result of fixes which can be still applied. Let us consider this case:
There are high-quality analyzers that produce great fixes, and there are some poorly written analyzers that produce wrong fixes, e.g., they produced corrupted offsets. Current design still allows the fixes from the well-written analyzers to be applied.

There is one caveat case: the change may already include some edits from the bad analyzer, which have valid offsets but are of poor quality.

In my opinion, the nogo framework should faithfully apply the fixes that are applicable (i.e., those with valid offsets). It should not ban fixes from all analyzers in the case that one analyzer is bad. Also it is the responsibility of the monorepo owners to remove/fix bad analyzers.

Let me know your thoughts,

Copy link
Author

@peng3141 peng3141 Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we adopt my current design, I also updated NewChangeFromDiagnostics to more permissive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the ability to let user choose analyzers to trust

panic("wrong size")
}

return string(out), nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we return bytes instead? The out is converted to string here and immediately converted back to []byte by its only caller

// The following is about the `Change`, a high-level abstraction of edits.
// Change represents a set of edits to be applied to a set of files.
type Change struct {
AnalyzerToFileToEdits map[string]map[string][]Edit `json:"analyzer_file_to_edits"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need 3 levels of nesting, only to be flattened later?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we need this.

the two levels file:edits is required, since we track and apply patch per file.

besides, see the Flatten() function, it considers the cases that different analyzers produce conflicting edits, i.e., edits that overlap with each other. In this case, it is impossible to apply both edits.

we will ignore the latter analyzer (sorted already for determinism) but still allow the former analyzer to proceed.
This is why we add the extra indirection of indexing by analyzer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid overusing maps and create more informative data structure: https://abhinav.github.io/future-proof-packages-2023/#/%EF%B8%8F-map-overuse

@peng3141 peng3141 force-pushed the rules_go_hack_for_dumping_fix branch from ee5bf11 to 23277ea Compare October 4, 2024 14:57

// Trim left
for i := 0; i < len(lines); i++ {
if hasNonWhitespaceCharacter(lines[i]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if hasNonWhitespaceCharacter(lines[i]) {
if strings.TrimSpace(lines[i]) == "" {

this is the same, right?

}

// LoadPatchesFromFile loads the map[string]string (file paths to patch content) from a JSON file.
// Note LoadPatchesFromFile is used for testing only.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test utilities should be in _test.go. Putting here and export it means it's part of public API

}

diff := UnifiedDiff{
// difflib.SplitLines does not handle well the whitespace at the beginning or the end.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if difflib.SplitLines doesn't work well, can we not use it? It's also inefficient: you first read the whole file into the memory and then split it, which doubles the memory usage. You could just read the file line by line with bufio.Scanner

// Otherwise, bazel will complain "not all outputs were created or valid"
change, err := NewChangeFromDiagnostics(diagnostics, pkg.fset)
if err != nil {
errs = append(errs, fmt.Errorf("errors in dumping nogo fix, specifically in converting diagnostics to change %v", err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the ability to let user choose analyzers to trust

}

// Flatten takes a Change and returns a map of FileToEdits, merging edits from all analyzers.
func Flatten(change Change) map[string][]Edit {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we produce one patch file per analyzer? That way, we don't need to merge edits to the same file from different analyzers. Like you said, some analyzers may produce bad edits. When users apply patches one by one, they can ignore those patched produced by bad analyzers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants