CI Flaky TestOriginalSampleRateIsNotedInMetaField #896

Closed
VinozzZ opened this issue Nov 3, 2023 · 1 comment · Fixed by #934

@VinozzZ
Contributor

VinozzZ commented Nov 3, 2023

Test Name

TestOriginalSampleRateIsNotedInMetaField

Link to failure build

Test Output

TestOriginalSampleRateIsNotedInMetaField
collect_test.go:191:
Error Trace: /home/circleci/project/collect/collect_test.go:191
Error: Not equal:
expected: 2
actual : 1
Test: TestOriginalSampleRateIsNotedInMetaField
Messages: should be some events transmitted
panic: test timed out after 1m0s
running tests:
TestOriginalSampleRateIsNotedInMetaField (1m0s)

goroutine 54 [running]:
testing.(*M).startAlarm.func1()
/usr/local/go/src/testing/testing.go:2241 +0x219
created by time.goFunc
/usr/local/go/src/time/sleep.go:176 +0x48

goroutine 1 [chan receive]:
testing.(*T).Run(0xc0001829c0, {0xead944, 0x28}, 0xedbc90)
/usr/local/go/src/testing/testing.go:1630 +0x82e
testing.runTests.func1(0x0?)
/usr/local/go/src/testing/testing.go:2036 +0x8e
testing.tRunner(0xc0001829c0, 0xc0003dfb48)
/usr/local/go/src/testing/testing.go:1576 +0x217
testing.runTests(0xc0001b4500?, {0x14467c0, 0xf, 0xf}, {0x1c?, 0x4ac5f9?, 0x14589e0?})
/usr/local/go/src/testing/testing.go:2034 +0x87d
testing.(*M).Run(0xc0001b4500)
/usr/local/go/src/testing/testing.go:1906 +0xb45
main.main()
_testmain.go:75 +0x2ea

goroutine 5 [sync.RWMutex.Lock]:
sync.runtime_SemacquireRWMutex(0xc0001a6588?, 0x37?, 0xd25a95?)
/usr/local/go/src/runtime/sema.go:87 +0x26
sync.(*RWMutex).Lock(0xc0001a6588)
/usr/local/go/src/sync/rwmutex.go:152 +0x8b
github.com/honeycombio/refinery/transmit.(*MockTransmission).EnqueueSpan(0xc0001a6570, 0xc000494370)
/home/circleci/project/transmit/mock.go:25 +0x65
github.com/honeycombio/refinery/collect.(*InMemCollector).send(0xc000455040, 0xc000494420, {0xe9ff63, 0x17})
/home/circleci/project/collect/collect.go:654 +0xe95
github.com/honeycombio/refinery/collect.(*InMemCollector).Stop(0xc000455040)
/home/circleci/project/collect/collect.go:670 +0x258
panic({0xe459e0, 0xc0001566a8})
/usr/local/go/src/runtime/panic.go:890 +0x263
github.com/honeycombio/refinery/collect.TestOriginalSampleRateIsNotedInMetaField(0x0?)
/home/circleci/project/collect/collect_test.go:192 +0x1289
testing.tRunner(0xc0000076c0, 0xedbc90)
/usr/local/go/src/testing/testing.go:1576 +0x217
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:1629 +0x806

goroutine 51 [sleep]:
time.Sleep(0x186a0)
/usr/local/go/src/runtime/time.go:195 +0x135
github.com/honeycombio/refinery/collect/cache.NewCuckooTraceChecker.func1()
/home/circleci/project/collect/cache/cuckoo.go:63 +0x75
created by github.com/honeycombio/refinery/collect/cache.NewCuckooTraceChecker
/home/circleci/project/collect/cache/cuckoo.go:58 +0x225

goroutine 6 [sleep]:
time.Sleep(0x186a0)
/usr/local/go/src/runtime/time.go:195 +0x135
github.com/honeycombio/refinery/collect/cache.NewCuckooTraceChecker.func1()
/home/circleci/project/collect/cache/cuckoo.go:63 +0x75
created by github.com/honeycombio/refinery/collect/cache.NewCuckooTraceChecker
/home/circleci/project/collect/cache/cuckoo.go:58 +0x225

goroutine 7 [select]:
github.com/honeycombio/refinery/collect/cache.(*cuckooSentCache).monitor(0xc0000d6040)
/home/circleci/project/collect/cache/cuckooSentCache.go:108 +0x12c
created by github.com/honeycombio/refinery/collect/cache.NewCuckooSentCache
/home/circleci/project/collect/cache/cuckooSentCache.go:100 +0x245
FAIL github.com/honeycombio/refinery/collect 60.034s

@VinozzZ VinozzZ added the type: bug Something isn't working label Nov 3, 2023
@kentquirk
Contributor

Also related to #897 #901 #902

Note this upcoming Go feature.

In particular, some sampler tests need to be probabilistic because the samplers themselves are random. A simple fix is to loosen the tolerances so that they fail less often, but the looser the tolerances, the lower our likelihood of detecting a real problem.

The way I'd like to fix most tests that are failing randomly is to add code to re-run them one time if they fail (as the Go feature above will do, but we can't depend on it yet).

@kentquirk kentquirk added this to the v2.2 milestone Nov 21, 2023
@kentquirk kentquirk modified the milestones: v2.2, v2.3 Nov 28, 2023
@kentquirk kentquirk self-assigned this Dec 6, 2023
kentquirk added a commit that referenced this issue Dec 6, 2023
## Which problem is this PR solving?

We've had some problematic flaky tests. 
* Some were because of probabilistic assertions relating to samplers --
we now run the worst of those a second time if they fail.
* Some were waiting for events on another goroutine that might not have
been scheduled, and these have been adjusted by using
`assert.Eventually` which is actually pretty useful.

Fixes #896
Fixes #897
Fixes #901
Fixes #902

Plus a couple of other tests that hadn't gotten their own issue.

This might also make the tests run a little bit faster overall.

You might want to turn off whitespace when reviewing this, because the
retry loop changed indentation.