cmd/go: cache not invalidated if testdata files are changed while the test is running #26790
Comments
That is indeed how it works. We log the files that the test opens as the test runs. Then when the test is complete we go back and read those files while building the cache. See This approach works well because all we have to do is log what the test actually does, which we do by, e.g., having
@heschik suggested that we could hash the files both before and after the test, and refuse to cache the results if the hash changed during the run. Ideally, though, we only want to pre-hash the files that will actually be accessed. We could speculate on that set based on previous cached results, but if we missed an addition we'd need to re-run the test to be sure.
I don't understand how we can hash the files before the test when we don't yet know which files will be opened. For example, the tests in image/gif read files in image/testdata, not image/gif/testdata. I suppose we could base it on previous test runs, but then we don't catch this problem if this is the first time we're running the test. If we're going for a complex solution it would be nice to have a complete one.
I've been thinking about this a bit more, and it seems to me that the solution is pretty straightforward. We can add another entry to the cache for “the set of input file and directory names”. We would read that entry and hash the files and directories in the list, then run the test, and finally hash the files after the run and compare both the list and the hashes. If the actual list of files accessed is not a subset of the cached list, we can cache only the updated set of filenames and not the actual test results. We don't know whether the test results themselves are up-to-date, and don't particularly need to care, because we can always just run the test again the next time it is requested, and, assuming that it is reasonably deterministic, we'll be able to cache that result. Under that scheme, we'll miss the cache for the first two runs of any new test that accesses external files, but that doesn't seem like that big a deal: it only happens ~once per user per test, and many tests don't read external files anyway. If it's important to cache the very first run, I suppose we could add some mechanism for users to add the set of test inputs to source control (something along the lines of
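The decision at the end of that scheme can be sketched as a small pure function. This is only an illustration under stated assumptions: `canCacheResult` is a hypothetical name, `before` holds hashes taken before the run for the names in the cached input set, and `after` holds hashes taken after the run for the files the test actually accessed. Treating a changed hash the same way as an unpredicted file (no result cached) is an assumption of this sketch.

```go
package main

import "fmt"

// canCacheResult reports whether a test result may be cached.
// before: pre-run hashes of the files named in the cached input set.
// after:  post-run hashes of the files the test actually accessed.
// The result is cacheable only if every accessed file was in the
// predicted set and its hash did not change during the run.
// (Hypothetical helper, not part of cmd/go.)
func canCacheResult(before, after map[string]string) bool {
	for name, sum := range after {
		prev, predicted := before[name]
		if !predicted || prev != sum {
			return false
		}
	}
	return true
}

func main() {
	accessed := map[string]string{"testdata/a.txt": "h1"}

	// First run: no cached name set yet, so only the names get stored.
	fmt.Println(canCacheResult(map[string]string{}, accessed))

	// Second run: the name set was predicted and the file is unchanged.
	fmt.Println(canCacheResult(map[string]string{"testdata/a.txt": "h1"}, accessed))

	// A run during which the file was edited: hashes differ.
	fmt.Println(canCacheResult(map[string]string{"testdata/a.txt": "h0"}, accessed))
}
```

This prints `false`, `true`, `false`, matching the observation that the first run of a test that reads external files cannot be cached under this scheme. A cached set that is larger than the accessed set is still acceptable, since only accessed files are checked.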
@rsc suggests that we could record the timestamp at the start of the test, then check mtimes when we hash file contents: if the file was modified at or after the start of the test, then we probably shouldn't cache it.
I was editing cmd/go/testdata/mod_test.txt for #26722, and saw an unexpected (cached) test result. I run tests pretty aggressively while editing, so I suspect that I saved the file while a previous test run was in progress. We appear to have cached the result based on the contents of the testdata directory after the run completed, not the contents at the start of the run.