sql/logictest: TestParallel failed #81278

Closed
cockroach-teamcity opened this issue May 15, 2022 · 6 comments
Labels
branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. T-storage Storage Team

Comments

@cockroach-teamcity
Member

cockroach-teamcity commented May 15, 2022

sql/logictest.TestParallel failed with artifacts on master @ bc5f5b7159ed2b7dbffa94bb2a590fdddb642b9b:

=== RUN   TestParallel/subquery_retry_multinode
[07:53:06] rng seed: 2414127004467792118

[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/setup with config : 2 tests, 0 failures
[07:53:06] rng seed: -4875557951589801784

[07:53:06] rng seed: 978107826719808660

[07:53:06] rng seed: -6100648960853086908

[07:53:06] rng seed: 4390598264454698310

[07:53:06] rng seed: 7240843997238583220

[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/txn with config : 1 tests, 0 failures
[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/txn with config : 1 tests, 0 failures
[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/txn with config : 1 tests, 0 failures
[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/txn with config : 1 tests, 0 failures
[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/txn with config : 1 tests, 0 failures
[07:53:06] rng seed: -2623638579287344242

[07:53:06] --- done: testdata/parallel_test/subquery_retry_multinode/final with config : 2 tests, 0 failures
Help

See also: How To Investigate a Go Test Failure (internal)
Parameters in this failure:

  • TAGS=bazel,gss,deadlock

/cc @cockroachdb/sql-queries

This test on roachdash | Improve this report!

Jira issue: CRDB-15257

Epic CRDB-16237

@cockroach-teamcity cockroach-teamcity added branch-master Failures and bugs on the master branch. C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. labels May 15, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label May 15, 2022
@rytaft
Collaborator

rytaft commented May 17, 2022

Might be a Bazel run

@yuzefovich
Member

The actual failure is here.

panic: pebble: inconsistent reference count: 1

goroutine 2704187 [running]:
github.com/cockroachdb/pebble.(*flushableEntry).readerRef(...)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/flushable.go:64
github.com/cockroachdb/pebble.(*DB).updateReadStateLocked(0xc001092900, 0x0, 0x0)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/read_state.go:92 +0x2a8
github.com/cockroachdb/pebble.(*DB).makeRoomForWrite(0xc001092900, 0x0)
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/db.go:1962 +0x1167
github.com/cockroachdb/pebble.(*DB).maybeScheduleDelayedFlush.func1()
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/compaction.go:1390 +0x253
created by github.com/cockroachdb/pebble.(*DB).maybeScheduleDelayedFlush
	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/compaction.go:1364 +0x10f
I220515 07:53:07.027687 1 (gostd) testmain.go:82  [-] 1  Test //pkg/sql/logictest:logictest_test exited with error code 2


ERROR: <nil>

64 runs completed, 1 failures, over 10m39s

@blathers-crl blathers-crl bot added the T-storage Storage Team label May 17, 2022
@michae2 michae2 removed their assignment May 17, 2022
@jlinder jlinder added sync-me and removed sync-me labels May 20, 2022
@nicktrav
Collaborator

Note to self: I seem to recall seeing something like this (i.e. ref counts on flushables) recently. Will take a closer look.

@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Jun 29, 2022
@nicktrav nicktrav self-assigned this Jun 29, 2022
@nicktrav
Collaborator

I did a stress test of this on both Darwin and Linux at the SHA mentioned in the original description, but I didn't get any hits (yet).

$ ./dev test ./pkg/sql/logictest --filter '^TestParallel$' --stress
$ bazel test pkg/sql/logictest:all --test_env=GOTRACEBACK=all --test_timeout=86400 --run_under '@com_github_cockroachdb_stress//:stress -bazel -shardable-artifacts '"'"'XML_OUTPUT_FILE=/Users/nickt/go/src/github.com/cockroachdb/cockroach/bin/dev-versions/dev.32 merge-test-xmls'"'"' ' '--test_filter=^TestParallel$' --test_sharding_strategy=disabled --test_output streamed
INFO: Invocation ID: 6faa5152-b18a-41c7-beaf-768a3793e352
WARNING: Streamed test output requested. All tests will be run locally, without sharding, one at a time
INFO: Build option --run_under has changed, discarding analysis cache.
INFO: Analyzed 3 targets (0 packages loaded, 22468 targets configured).
INFO: Found 2 targets and 1 test target...
0 runs so far, 0 failures, over 5s
0 runs so far, 0 failures, over 10s
...
5024 runs so far, 0 failures, over 2h2m5s

I'm trying to find the other issue I mentioned, but maybe I was just imagining it.

It's certainly possible we have a ref count leak somewhere, though tracking this down without a clean reproducer may be tough. Will keep this open for now.
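
For readers unfamiliar with this class of panic, here is a minimal sketch of the kind of invariant a reference-counted object can enforce. It is not Pebble's code: the `entry` type and the `newEntry`, `ref`, `unref`, and `tryRef` names are hypothetical, chosen only for illustration. The assumption is that the creator holds the first reference, so any `ref` that brings the count to 1 or below must be acting on an object that was already fully released.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// entry is a hypothetical stand-in for a reference-counted resource.
// It is not Pebble's flushableEntry.
type entry struct {
	refs int32 // starts at 1; that reference belongs to the creator
}

func newEntry() *entry { return &entry{refs: 1} }

// ref takes an additional reference. If the count after the increment is
// 1 or less, the entry had already dropped to zero (or below), so some
// caller released a reference it did not own, or kept using the entry
// after the last release.
func (e *entry) ref() {
	if v := atomic.AddInt32(&e.refs, 1); v <= 1 {
		panic(fmt.Sprintf("inconsistent reference count: %d", v))
	}
}

// unref drops one reference and reports whether it was the last one.
func (e *entry) unref() bool {
	v := atomic.AddInt32(&e.refs, -1)
	if v < 0 {
		panic(fmt.Sprintf("inconsistent reference count: %d", v))
	}
	return v == 0
}

// tryRef calls ref and turns a panic into a returned message, so the
// example can print the failure instead of crashing.
func tryRef(e *entry) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	e.ref()
	return ""
}

func main() {
	e := newEntry()
	e.ref()   // count: 2
	e.unref() // count: 1
	e.unref() // count: 0, last release
	// Taking a reference on a fully released entry trips the check.
	fmt.Println(tryRef(e)) // prints "inconsistent reference count: 1"
}
```

Under these assumptions, a reported count of 1 would point at a reference being taken after the last release (for example, an extra `unref` somewhere earlier) rather than at a reference that was never released.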

@nicktrav
Collaborator

The issue with a similar failure mode is #79879, which seems like an iterator leak, though we never found the cause.
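
As a rough illustration of how an iterator leak can be surfaced at shutdown, here is a sketch assuming a store that counts its open iterators. The `store` and `iter` types and their methods are hypothetical and do not reflect Pebble's API or the debugging support referenced elsewhere in this thread.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// store is a hypothetical resource owner that counts the iterators
// opened against it.
type store struct {
	openIters int64
}

// iter is a hypothetical iterator that pins its store until closed.
type iter struct {
	s      *store
	closed bool
}

// newIter opens an iterator and records it as outstanding.
func (s *store) newIter() *iter {
	atomic.AddInt64(&s.openIters, 1)
	return &iter{s: s}
}

// close releases the iterator. Calling it twice is a no-op, so a double
// close cannot drive the counter negative.
func (it *iter) close() {
	if it.closed {
		return
	}
	it.closed = true
	atomic.AddInt64(&it.s.openIters, -1)
}

// close reports an error if any iterator is still outstanding.
func (s *store) close() error {
	if n := atomic.LoadInt64(&s.openIters); n != 0 {
		return fmt.Errorf("leaked iterators: %d", n)
	}
	return nil
}

func main() {
	s := &store{}
	a := s.newIter()
	_ = s.newIter() // never closed: this is the leak
	a.close()
	fmt.Println(s.close()) // prints "leaked iterators: 1"
}
```

A count alone says that something leaked but not where. Recording a stack trace when each iterator is opened is one common way to make such a report actionable.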

@nicktrav
Collaborator

Given this particular failure is unfortunately unactionable without a solid reproducer, I'm going to close it out.

We have cockroachdb/pebble#1597 that will help in debugging future occurrences.

6 participants