Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go test -race detects significant amount of data races #8586

Closed
jcajka opened this issue Feb 22, 2023 · 1 comment · Fixed by #13174 · May be fixed by #13391
Closed

go test -race detects significant amount of data races #8586

jcajka opened this issue Feb 22, 2023 · 1 comment · Fixed by #13174 · May be fixed by #13391

Comments

@jcajka
Copy link

jcajka commented Feb 22, 2023

Describe the bug
There appears to be race conditions detected by the Go's race detector. Is that expected?

To Reproduce
Steps to reproduce the behavior:

  1. in root directory of the loki source code run 'go test -race ./...'

Expected behavior
There are no races detected by the Go's race detector.

Environment:

  • Infrastructure: container, amd64/linux with go1.20

Screenshots, Promtail config, or terminal output
In attachment is output of the tests run with the race detector enabled and cleaned up to only failed test cases. It seems rather long, but there might be some duplicit race conditions. I haven't investigated deeper as IMHO it would be better suited for someone with deeper knowledge of the loki's codebase.

For bit of context I have seen some tests failures on other than amd64/linux, that from guts feeling could have been race conditions. So I have run the race detector on amd64/linux to rule out any preexisting ones.
loki-race.log

@jcajka jcajka changed the title go test -race detecs significant amount of data races go test -race detects significant amount of data races Feb 22, 2023
@jeschkies
Copy link
Contributor

Thanks for pointing this out. You are correct but the tests run quite long to be executed by the CI. So we'd have to find a better way.

MasslessParticle pushed a commit that referenced this issue Aug 22, 2023
**What this PR does / why we need it**:
We must call wg.Add before starting the goroutine.
And protect the data in mockCompactorClient.
And another fix in shipper/index/compactor.

**Which issue(s) this PR fixes**:
Part of #8586

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- NA Documentation added
- NA Tests updated
- NA `CHANGELOG.md` updated
- NA If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- NA Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- NA For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)
kavirajk pushed a commit that referenced this issue Aug 25, 2023
Backport cf353bb from #10310

---

**What this PR does / why we need it**:
~Lock around data that is modified from multiple goroutines
concurrently.~
Replaced with the implementation from #9984 since @akhilanarayanan did
it more efficiently.

**Which issue(s) this PR fixes**:
Relates to #8586

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- NA Documentation added
- NA Tests updated
- [x] `CHANGELOG.md` updated
- NA If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- NA Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- NA For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)

Co-authored-by: Bryan Boreham <[email protected]>
jeschkies pushed a commit that referenced this issue Aug 30, 2023
**What this PR does / why we need it**:
There were many race conditions reported if you run `go test -race` on
this package.
Google's 'singleflight' package is much neater and makes all the race
warnings go away.

**Which issue(s) this PR fixes**:
Relates to #8586

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- NA Documentation added
- NA Tests updated
- [x] `CHANGELOG.md` updated
- NA If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- NA Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- NA For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)
chaudum pushed a commit that referenced this issue Sep 14, 2023
There were many race conditions reported if you run `go test -race` on
this package.
Google's 'singleflight' package is much neater and makes all the race
warnings go away.

**Which issue(s) this PR fixes**:
Relates to #8586
chaudum added a commit that referenced this issue Sep 14, 2023
There were many race conditions reported if you run `go test -race` on this package.
Google's 'singleflight' package is much neater and makes all the race warnings go away.

**Which issue(s) this PR fixes**:

Relates to #8586

**Special notes for your reviewer**:

This is a backport of #10314 to the
2.9.x release branch.

---------

Signed-off-by: Christian Haudum <[email protected]>
Co-authored-by: Bryan Boreham <[email protected]>
paul1r added a commit that referenced this issue Feb 12, 2024
**What this PR does / why we need it**:
A data race existed with the workerID variable, as it could be modified
by multiple goroutines.

Relates to: #8586
--

Before fix:
```
go test -count=1 -race ./pkg/querier/worker
==================
WARNING: DATA RACE
Read at 0x00c000494108 by goroutine 229:
  github.com/grafana/loki/pkg/querier/worker.(*processorManager).concurrency.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/processor_manager.go:81 +0x118

Previous write at 0x00c000494108 by goroutine 222:
  github.com/grafana/loki/pkg/querier/worker.(*processorManager).concurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/processor_manager.go:70 +0x108
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).resetConcurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:267 +0x10c
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).AddressAdded()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:219 +0x868
  github.com/grafana/loki/pkg/querier/worker.TestResetConcurrency.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker_test.go:64 +0x1c8
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 229 (running) created at:
  github.com/grafana/loki/pkg/querier/worker.(*processorManager).concurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/processor_manager.go:75 +0xcc
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).resetConcurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:267 +0x10c
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).AddressAdded()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:219 +0x868
  github.com/grafana/loki/pkg/querier/worker.TestResetConcurrency.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker_test.go:64 +0x1c8
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 222 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x5e8
  github.com/grafana/loki/pkg/querier/worker.TestResetConcurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker_test.go:52 +0x1b0
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40
==================
--- FAIL: TestResetConcurrency (0.02s)
    --- FAIL: TestResetConcurrency/concurrency_is_correct_when_numTargets_does_not_divide_evenly_into_maxConcurrent (0.01s)
        testing.go:1465: race detected during execution of test
    testing.go:1465: race detected during execution of test
FAIL
FAIL	github.com/grafana/loki/pkg/querier/worker	4.626s
FAIL
```

--

After fix:

```
go clean -testcache
go test -count=1 -race ./pkg/querier/worker
ok  	github.com/grafana/loki/pkg/querier/worker	6.034s
```
**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](0d4416a)
paul1r added a commit that referenced this issue Feb 13, 2024
**What this PR does / why we need it**:
This addresses the data race present on the `t.stopped` variable in
`tail.go`.

```
==================
WARNING: DATA RACE
Write at 0x00c00098b198 by goroutine 568:
  github.com/grafana/loki/pkg/querier.(*Tailer).close()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:272 +0x104
  github.com/grafana/loki/pkg/querier.TestTailer.func7.2()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail_test.go:169 +0x34
  runtime.deferreturn()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/runtime/panic.go:477 +0x34
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Previous read at 0x00c00098b198 by goroutine 569:
  github.com/grafana/loki/pkg/querier.(*Tailer).loop()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:88 +0x13c
  github.com/grafana/loki/pkg/querier.newTailer.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:342 +0x34

Goroutine 568 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x5e8
  github.com/grafana/loki/pkg/querier.TestTailer()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail_test.go:158 +0x10dc
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 569 (running) created at:
  github.com/grafana/loki/pkg/querier.newTailer()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:342 +0x300
  github.com/grafana/loki/pkg/querier.TestTailer.func7()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail_test.go:168 +0x138
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40
==================
```


**Which issue(s) this PR fixes**:
Relates to: #8586

**Special notes for your reviewer**:

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](0d4416a)
@paul1r paul1r mentioned this issue Apr 11, 2024
5 tasks
rhnasc pushed a commit to inloco/loki that referenced this issue Apr 12, 2024
**What this PR does / why we need it**:
A data race existed with the workerID variable, as it could be modified
by multiple goroutines.

Relates to: grafana#8586
--

Before fix:
```
go test -count=1 -race ./pkg/querier/worker
==================
WARNING: DATA RACE
Read at 0x00c000494108 by goroutine 229:
  github.com/grafana/loki/pkg/querier/worker.(*processorManager).concurrency.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/processor_manager.go:81 +0x118

Previous write at 0x00c000494108 by goroutine 222:
  github.com/grafana/loki/pkg/querier/worker.(*processorManager).concurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/processor_manager.go:70 +0x108
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).resetConcurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:267 +0x10c
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).AddressAdded()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:219 +0x868
  github.com/grafana/loki/pkg/querier/worker.TestResetConcurrency.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker_test.go:64 +0x1c8
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 229 (running) created at:
  github.com/grafana/loki/pkg/querier/worker.(*processorManager).concurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/processor_manager.go:75 +0xcc
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).resetConcurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:267 +0x10c
  github.com/grafana/loki/pkg/querier/worker.(*querierWorker).AddressAdded()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker.go:219 +0x868
  github.com/grafana/loki/pkg/querier/worker.TestResetConcurrency.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker_test.go:64 +0x1c8
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 222 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x5e8
  github.com/grafana/loki/pkg/querier/worker.TestResetConcurrency()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/worker/worker_test.go:52 +0x1b0
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40
==================
--- FAIL: TestResetConcurrency (0.02s)
    --- FAIL: TestResetConcurrency/concurrency_is_correct_when_numTargets_does_not_divide_evenly_into_maxConcurrent (0.01s)
        testing.go:1465: race detected during execution of test
    testing.go:1465: race detected during execution of test
FAIL
FAIL	github.com/grafana/loki/pkg/querier/worker	4.626s
FAIL
```

--

After fix:

```
go clean -testcache
go test -count=1 -race ./pkg/querier/worker
ok  	github.com/grafana/loki/pkg/querier/worker	6.034s
```
**Which issue(s) this PR fixes**:
Fixes #<issue number>

**Special notes for your reviewer**:

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](grafana@d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](grafana@0d4416a)
rhnasc pushed a commit to inloco/loki that referenced this issue Apr 12, 2024
**What this PR does / why we need it**:
This addresses the data race present on the `t.stopped` variable in
`tail.go`.

```
==================
WARNING: DATA RACE
Write at 0x00c00098b198 by goroutine 568:
  github.com/grafana/loki/pkg/querier.(*Tailer).close()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:272 +0x104
  github.com/grafana/loki/pkg/querier.TestTailer.func7.2()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail_test.go:169 +0x34
  runtime.deferreturn()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/runtime/panic.go:477 +0x34
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Previous read at 0x00c00098b198 by goroutine 569:
  github.com/grafana/loki/pkg/querier.(*Tailer).loop()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:88 +0x13c
  github.com/grafana/loki/pkg/querier.newTailer.func1()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:342 +0x34

Goroutine 568 (running) created at:
  testing.(*T).Run()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x5e8
  github.com/grafana/loki/pkg/querier.TestTailer()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail_test.go:158 +0x10dc
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 569 (running) created at:
  github.com/grafana/loki/pkg/querier.newTailer()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail.go:342 +0x300
  github.com/grafana/loki/pkg/querier.TestTailer.func7()
      /Users/progers/dev/src/github.com/grafana/loki/pkg/querier/tail_test.go:168 +0x138
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40
==================
```


**Which issue(s) this PR fixes**:
Relates to: grafana#8586

**Special notes for your reviewer**:

**Checklist**
- [ ] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [ ] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](grafana@d10549e)
- [ ] If the change is deprecating or removing a configuration option,
update the `deprecated-config.yaml` and `deleted-config.yaml` files
respectively in the `tools/deprecated-config-checker` directory.
[Example
PR](grafana@0d4416a)
chaudum pushed a commit that referenced this issue Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants