cmd/go: cache not invalidated if testdata files are changed while the test is running #26790

bcmills · 2018-08-03T15:55:24Z

I was editing cmd/go/testdata/mod_test.txt for #26722, and saw an unexpected (cached) test result.

I run tests pretty aggressively while editing, so I suspect that I saved the file while a previous test run was in progress. We appear to have cached the result based on the contents of the testdata directory after the run completed, not the contents at the start of the run.

~/go/src$ go test cmd/go -run=TestScript
ok      cmd/go  (cached)

~/go/src$ GODEBUG=gocacheverify=1 go test cmd/go -run=TestScript
go test proxy starting
go test proxy running at GOPROXY=http://127.0.0.1:38657/mod
go proxy: no archive w.1 v1.2.0
go proxy: no archive x.1 v1.0.0
go proxy: no archive z.1 v1.2.0
go proxy: no archive golang.org/x/text/language 14c0d48
go proxy: no archive golang.org/x/text/language 14c0d48
go proxy: no archive golang.org/x/text/language 14c0d48
go proxy: no archive golang.org/x/text/foo 14c0d48
go proxy: no archive sub.1 v1.0.0
go proxy: no archive badsub.1 v1.0.0
go proxy: no archive versioned.1 v1.0.0
go proxy: no archive versioned.1 v1.1.0
--- FAIL: TestScript (0.00s)
    --- FAIL: TestScript/mod_test (1.28s)
        script_test.go:150:
            # A test in the module's root package should work. (1.107s)
            # A test with the "_test" suffix in the module root should also work. (0.171s)
            > cd ../b/
            $WORK/gopath/src/b
            > go test
            [stderr]
            build github.com/user/b_test: cannot find module for path github.com/user/b_test
            [exit status 1]
            FAIL: testdata/script/mod_test.txt:10: unexpected command failure

FAIL
FAIL    cmd/go  9.231s

~/go/src$ go test cmd/go -run=TestScript
ok      cmd/go  (cached)

~/go/src$ go version
go version devel +fc1a18c452 Fri Aug 3 11:36:32 2018 -0400 linux/amd64


ianlancetaylor · 2018-08-03T17:40:37Z

That is indeed how it works. We log the files that the test opens as the test runs. Then when the test is complete we go back and read those files while building the cache. See runCache.saveOutput in cmd/go/internal/test/test.go.

This approach works well because all we have to do is log what the test actually does, which we do by, e.g., having os.OpenFile call testlog.Open. In order to avoid the problem you describe, I think we would have to not just call testlog.Open, but have the test actually read and hash the contents of the file. I am concerned that adding these extra operations would perturb the behavior of some tests.
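A minimal sketch of the window described above, assuming a hypothetical `openLog` type in place of the real testlog plumbing (it is not the cmd/go implementation): names are recorded when the test opens a file, but contents are hashed only after the run, so an edit that lands in between is folded into the cache key.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
)

// openLog is a hypothetical stand-in for the list of files a test opened.
type openLog struct{ names []string }

// open records the file name and returns the contents the test sees.
func (l *openLog) open(name string) ([]byte, error) {
	l.names = append(l.names, name)
	return os.ReadFile(name)
}

// inputID hashes the contents of the logged files as they are right now,
// which may be later than the moment the test read them.
func (l *openLog) inputID() (string, error) {
	h := sha256.New()
	for _, name := range l.names {
		data, err := os.ReadFile(name)
		if err != nil {
			return "", err
		}
		fmt.Fprintf(h, "%s %x\n", name, sha256.Sum256(data))
	}
	return fmt.Sprintf("%x", h.Sum(nil)), nil
}

// staleHit simulates an edit landing mid-run and reports whether the result
// computed from the old contents ends up keyed by the new contents.
func staleHit() (bool, error) {
	dir, err := os.MkdirTemp("", "testlog-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	name := filepath.Join(dir, "mod_test.txt")
	if err := os.WriteFile(name, []byte("v1"), 0o644); err != nil {
		return false, err
	}

	var log openLog
	seen, err := log.open(name) // the test reads "v1"
	if err != nil {
		return false, err
	}
	// The file is saved again while the test is still running.
	if err := os.WriteFile(name, []byte("v2"), 0o644); err != nil {
		return false, err
	}
	savedID, err := log.inputID() // hashed after the run, so it covers "v2"
	if err != nil {
		return false, err
	}

	// A later run with "v2" on disk computes the same ID and hits the cache.
	next := openLog{names: log.names}
	nextID, err := next.inputID()
	if err != nil {
		return false, err
	}
	return string(seen) == "v1" && savedID == nextID, nil
}

func main() {
	stale, err := staleHit()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stale result would be served from cache:", stale)
}
```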

bcmills · 2018-08-03T19:26:11Z

@heschik suggested that we could hash the files both before and after the test, and refuse to cache the results if the hash changed during the run.

Ideally, though, we only want to pre-hash the files that will actually be accessed. We could speculate on that set based on previous cached results, but if we missed an addition we'd need to re-run the test to be sure.

ianlancetaylor · 2018-08-03T19:59:46Z

I don't understand how we can hash the files before the test when we don't yet know which files will be opened. For example, the tests in image/gif read files in image/testdata, not image/gif/testdata.

I suppose we could base it on previous test runs, but then we don't catch this problem if this is the first time we're running the test. If we're going for a complex solution it would be nice to have a complete one.

bcmills · 2018-08-09T14:19:44Z

> I don't understand how we can hash the files before the test when we don't yet know which files will be opened.

I've been thinking about this a bit more, and it seems to me that the solution is pretty straightforward. We can add another entry to the cache for “the set of input file and directory names”. We would read that entry and hash the files and directories in the list, then run the test, and finally hash the files after the run and compare both the list and the hashes.

If the actual list of files accessed is not (a subset of) the cached list, we can cache only the updated set of filenames and not the actual test results. We don't know whether the test results themselves are up-to-date, and don't particularly need to care, because we can always just run the test again the next time it is requested — and, assuming that it is reasonably deterministic, we'll be able to cache that result.

Under that scheme, we'll miss the cache for the first two runs of any new test that accesses external files, but that doesn't seem like that big a deal: it only happens ~once per user per test, and many tests don't read external files anyway.

If it's important to cache the very first run, I suppose we could add some mechanism for users to add the set of test inputs to source control (something along the lines of .d files), but I'd rather not add that unless we really need it.
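As a rough illustration of the scheme above, the per-run decision could look like the following. The `action` type, its constants, and `decide` are hypothetical names, and hashes are plain strings to keep the sketch short. Treating a changed hash as "cache nothing" is an assumption carried over from the earlier hash-before-and-after suggestion.

```go
package main

import "fmt"

type action int

const (
	cacheResult    action = iota // inputs known and unchanged: store the test result
	cacheNamesOnly               // new inputs seen: store only the updated name list
	cacheNothing                 // a known input changed during the run
)

// decide applies the scheme to one run. cached is the name list read from the
// cache before the run, accessed is the list the test actually opened, and
// before/after map each cached name to its content hash around the run.
func decide(cached, accessed []string, before, after map[string]string) action {
	known := make(map[string]bool, len(cached))
	for _, name := range cached {
		known[name] = true
	}
	// Any file outside the cached list was not pre-hashed, so the result
	// cannot be trusted; remember the names for the next run instead.
	for _, name := range accessed {
		if !known[name] {
			return cacheNamesOnly
		}
	}
	// All accessed files were pre-hashed; require that none changed.
	for _, name := range accessed {
		if before[name] != after[name] {
			return cacheNothing
		}
	}
	return cacheResult
}

func main() {
	labels := []string{"cache result", "cache names only", "cache nothing"}
	hashes := map[string]string{"testdata/a.txt": "h1"}

	// First run: nothing is known yet, so only the name list is stored.
	first := decide(nil, []string{"testdata/a.txt"}, nil, nil)
	// Second run: the list is known and the file did not change.
	second := decide([]string{"testdata/a.txt"}, []string{"testdata/a.txt"}, hashes, hashes)

	fmt.Println("first run:", labels[first])
	fmt.Println("second run:", labels[second])
}
```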

bcmills · 2018-10-25T15:23:00Z

@rsc suggests that we could record the timestamp at the start of the test, then check mtimes when we hash file contents: if the file was modified at or after the start of the test, then we probably shouldn't cache it.

bcmills added the NeedsInvestigation label Aug 3, 2018

bcmills added this to the Go1.12 milestone Aug 3, 2018

bcmills mentioned this issue Sep 11, 2018

cmd/go: -exec prevents test caching #27207

Open

bcmills modified the milestones: Go1.12, Go1.13 Oct 25, 2018

myitcv mentioned this issue Feb 19, 2019

cmd/go: unexpected test cache hit of "PASS" when test should "FAIL" #30313

Closed

andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019

rsc modified the milestones: Go1.14, Backlog Oct 9, 2019

gabyhelp mentioned this issue Sep 1, 2024

cmd/go: GO111MODULE does not invalidate test cache #69203

Closed

gabyhelp mentioned this issue Sep 21, 2024

cmd/go: invoking go run from go test can corrupt build cache #69566

Open
