Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: OOM during TPCC with schema changes #69317

Open
ajwerner opened this issue Aug 24, 2021 · 14 comments
Open

sql: OOM during TPCC with schema changes #69317

ajwerner opened this issue Aug 24, 2021 · 14 comments
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@ajwerner
Copy link
Contributor

ajwerner commented Aug 24, 2021

Describe the problem

This is meta issue to clearly phrase observations from debugging #69100 and #69086. In both of those we see OOMs. We've tracked this down to a change in prepared statement handling due to #68608. We observe that pgx is issuing lots of DEALLOCATE commands, presumably because of something involving its LRU. However, I did try increasing the LRU size to no avail.

https://github.com/jackc/pgconn/blob/53f5fed36c570f0b5c98d6ec2415658c7b9bd11c/stmtcache/lru.go#L118

The issue is not about the DEALLOCATE or cache misses but rather about the OOM that seems to come with it.

To Reproduce

Run schemachange/index/tpcc/w=1000 at 7c36a9d and it seems to happen every time.

Additional data / screenshots

One observation is that we have a lot of allocations in the optimizer and the in numeric constants. These numeric constants hold on to strings which are backed by the pgwire buffer reader. That might be part of the problem and might point to why we don't see the data in the heap profile. However, there's still the question of why we're holding on to all of these. Maybe it's related to the statements metrics.

Another point of confusion is the extent to which the schema changes are actually relevant here. Is it the straw that breaks the camels back?

Jira issue: CRDB-9542

@ajwerner ajwerner added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 24, 2021
@rafiss
Copy link
Collaborator

rafiss commented Aug 25, 2021

Note to all working on this: now that #69313 has been merged, the test no longer OOMs, but the underlying issue still exists. We should continue our effort to understand the OOMs better. To reproduce the OOMs, apply this diff:

diff --git a/pkg/workload/pgx_helpers.go b/pkg/workload/pgx_helpers.go
index 356da23a61..5a7a517abf 100644
--- a/pkg/workload/pgx_helpers.go
+++ b/pkg/workload/pgx_helpers.go
@@ -61,9 +61,6 @@ func NewMultiConnPool(
         connsPerPool := distributeMax(connsPerURL[i], maxConnsPerPool)
         for _, numConns := range connsPerPool {
             connCfg, err := pgxpool.ParseConfig(urls[i])
-            // Disable the automatic prepared statement cache. We've seen a lot of
-            // churn in this cache since workloads create many of different queries.
-            connCfg.ConnConfig.BuildStatementCache = nil
             if err != nil {
                 return nil, err
             }

I tried out #69375 to see if storing stats for all the DEALLOCATE statements separately was the problem, but that PR still resulted in an OOM.

Next theory: perhaps the issue is with the prepared statement cache having a memory leak. @ajwerner observes that:

That’d be bad but worth looking into. It would align with the profiles. In fact, it might align better. Notice that a ton of memory is in these numeric constants which are all holding references to strings which are all part of a buffer that’s at least 4k. So, we know that each prepared statement with a numeric constant is holding on to at least 4k.

@rafiss
Copy link
Collaborator

rafiss commented Aug 25, 2021

I believe there is some sort of issue with dangling references to the prepared statements. I added this diff on top of the one that allows the pgx stmt cache:

diff --git a/pkg/sql/prepared_stmt.go b/pkg/sql/prepared_stmt.go
index 07ac0665f8..9dfef076e4 100644
--- a/pkg/sql/prepared_stmt.go
+++ b/pkg/sql/prepared_stmt.go
@@ -85,6 +85,7 @@ func (p *PreparedStatement) decRef(ctx context.Context) {
                log.Fatal(ctx, "corrupt PreparedStatement refcount")
        }
        p.refCount--
+       log.Warningf(ctx, "decRef: refCount=%d stmt=%s", p.refCount, p.PrepareMetadata.Statement)
        if p.refCount == 0 {
                p.memAcc.Close(ctx)
        }

Then ran schemachange/index/tpcc/w=1000 and noticed this:

ubuntu@rafi-1629927880-01-n5cpu16-0004:~/logs$ grep 'refCount=0' *log | wc
      7     297    3421

ubuntu@rafi-1629927880-01-n5cpu16-0004:~/logs$ grep 'refCount=1' *log | wc
grep: cockroach.rafi-1629927880-01-n5cpu16-0004.ubuntu.2021-08-26T03_21_43Z.011088.log: No such file or directory
 113377 4918900 55051570

@RaduBerinde
Copy link
Member

We store portals in a map by value: portals map[string]PreparedPortal and yet we're doing some kind of refcounting on them. Looks like these were pointers and they were changed to values in #48842

@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Aug 26, 2021
@rafiss rafiss added branch-master Failures and bugs on the master branch. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Aug 26, 2021
@yuzefovich
Copy link
Member

I'm currently looking into this.

I tried switching portals map[string]PreparedPortal to portals map[string]*PreparedPortal, and it didn't help, so I would guess that the switch from pointers to values in #48842 isn't to blame.

@RaduBerinde
Copy link
Member

Yeah, we looked at it more and that doesn't explain it. I also noticed that we're not clearing prepStmtsNamespaceAtTxnRewindPos when we reset the txn state, which means we may keep around a snapshot of the namespace indefinitely. But that wouldn't explain accumulation of prepared statements either.

@yuzefovich
Copy link
Member

I did find something suspicious, here is the comment on preparedStatementsAccessor.Delete:

	// Delete removes the PreparedStatement with the provided name from the
	// collection. If a portal exists for that statement, it is also removed.
	// The method returns true if statement with that name was found and removed,
	// false otherwise.
	Delete(ctx context.Context, name string) bool

Note that we currently don't actually delete any portals created for the prepared statement. The postgres docs aren't clear whether we should be deleting the portals or not, and I haven't looked into the source code yet.

I also think it's possible that the switch to values instead of pointers introduced a bug. In prepStmtNamespace.resetTo we have this code snippet:

	for name, p := range to.portals {
		p.incRef(ctx)
		ns.portals[name] = p
	}

We're taking PreparedPortal, incrementing the reference on it, and then setting it into the space being reset. However, I think the "to" space still contains the original unincremented refcount.

@yuzefovich
Copy link
Member

FWIW on the binary with the fixes to both of the problems I mentioned above the schemachange roachtest has successfully gotten past the point of completing all 3 schema changes queries. (I did pollute the binary with logging, but hoping that didn't matter.)

@rafiss
Copy link
Collaborator

rafiss commented Aug 27, 2021

Note that we currently don't actually delete any portals created for the prepared statement.

Portals that are created outside of a transaction are supposed to automatically be destroyed after they are executed. Portals created in a transaction are supposed to automatically be destroyed at the end of the transaction.

See f612484. one thing I don't see though is that resetExtraTxnState is not called in the txnCommit case -- I only see it happening for txnRestart or txnAborted (despite the comments on resetExtraTxnState saying that it's called when a transaction "commits, rolls back or restarts.")

@RaduBerinde
Copy link
Member

RaduBerinde commented Aug 27, 2021

I only see it happening for txnRestart or txnAborted

There's a fallthrough above that code.

2624     fallthrough
2625   case txnRestart, txnRollback:
2626     if err := ex.resetExtraTxnState(ex.Ctx(), advInfo.txnEvent); err != nil {
2627       return advanceInfo{}, err
2628     }
2629   default:

@RaduBerinde
Copy link
Member

@yuzefovich definitely having the refcounts on values is a problem, but it seems to me like it can only result in "premature" final decRef, and not in the "missing" decRef. In any case, calling or not calling decRef shouldn't affect whether the prepared statement stays alive in memory. I hope your change does fix it, but I'd be curious to find out how exactly.

@yuzefovich
Copy link
Member

Tough luck - most runs with the fixes to both possible issues outlines above still failed. I also included the deletion of portals from prepStmtsNamespaceAtTxnRewindPos into resetExtraTxnState, and I still see failures.

@yuzefovich yuzefovich self-assigned this Aug 28, 2021
craig bot pushed a commit that referenced this issue Aug 28, 2021
68831: ui: add date range selector in stmts and txns pages r=xinhaoz a=xinhaoz

Resolves #68089, #69474

This commit adds date range state
properties and a date range selector component to the DB
console's statements and transactions pages. This change
is necessary to support surfacing persisted
statistics in the DB console.

This commit adds date range state
properties and a date range selector component to the DB
console's statements and transactions pages. This change
is necessary to support surfacing persisted
statistics in the DB console.

The date range query parameters used by the api to fetch
data is stored in localSettings on db console and
localStorage in cluster-ui. The default date range
is set to 1 hour ago, and is used as the value when user's
reset the date range.

The user is able to select dates as far back as 1 year ago,
this does not mean there is data available.

Release justification: category 4 low risk, high benefit
changes to existing functionality

Release note (ui change): New date range selector component
to the DB console's statements page with the ability to show
historical data. The default date range is set to 1 hour ago,
and is used as the value when user's reset the date range.

------------------
![image](https://user-images.githubusercontent.com/20136951/131007768-9a7bfeb6-a0f0-417a-8160-51f8d636f73b.png)
![image](https://user-images.githubusercontent.com/20136951/131007809-ff98359e-bfcc-4cb1-9b0b-eb34f8d04cd0.png)
![image](https://user-images.githubusercontent.com/20136951/131036176-11b03aa6-16e5-4601-b92e-e13dd0dc03df.png)
![image](https://user-images.githubusercontent.com/20136951/131198830-6b727e68-0983-483e-9385-f23e16cde705.png)

Transactions page:
![image](https://user-images.githubusercontent.com/20136951/131222215-bb2b013c-034e-40bf-8d83-7f5d9784065a.png)


69375: tree: redact name in DEALLOCATE in FmtHideConstants r=ajwerner,maryliag a=rafiss

This will make it so that all DEALLOCATE statements show up in one row
in the SQL Statements UI and in the internal stats data structures.

Recently, a workload was updated (accidentally) to start creating more
DEALLOCATE statements, and it made the UI unusable because there were so
many DEALLOCATE statement keys.

The second commit is a rename to stop implying that statements with
redacted constants are "anonymized" -- they still have names+identifiers.
This also renames the name of a field in a response payload.

Created while working on #69317 but doesn't fix it. 

Release justification: fix a high-priority bugs with a low-risk change.
Release note: None

69457: sql, import: add metrics for max_row_size guardrails r=rytaft,yuzefovich a=michae2

**sql: rename max_row_size guardrails to match transaction row limits**

Addresses: #67400

Rename sql.mutations.max_row_size.{log|err} to
sql.guardrails.max_row_size_{log|err} for consistency with
transaction_rows_{read|written}_{log|err} and upcoming metrics.

Release justification: Low-risk update to new functionality.

Release note (ops change): New variables
sql.mutations.max_row_size.{log|err} were renamed to
sql.guardrails.max_row_size_{log|err} for consistency with other
variables and metrics.

**sql/row: remove dependency on sql/execinfra**

The next commit needs to make sql/row a dependency of sql/execinfra, so
remove two small references to sql/execinfra.

(I will squash this into the next commit before merging.)

Release justification: Low-risk update to new functionality.

Release note: None

**sql, import: plumb RowMetrics from ExecutorConfig to row.Helper**

Addresses: #67400

Add metrics for sql/row and pass them down from ExecutorConfig and
FlowCtx all the way to row.Helper. Like sql.Metrics, there are two
copies, one for user queries and one for internal queries. (These were
originally part of sql.Metrics, but there are several users of sql/row
that do not operate under a sql.Server or connExecutor so we are forced
to add these to the ExecutorConfig and FlowCtx instead.)

I ran into difficulty passing these through import, as FlowCtx itself
is not plumbed through.

There are only two metrics at first, corresponding to
sql.guardrails.max_row_size_{log|err}.

Release justification: Low-risk update to new functionality.

Release note (ops): Added four new metrics,
sql.guardrails.max_row_size_{log|err}.count{.internal} which are
incremented whenever a large row violates
sql.guardrails.max_row_size_{log|err}.

69535: server: remove non-deterministic test r=maryliag a=maryliag

A new test was added but it would fail depending on the
time it was executed. This commit remove this flaky test
so in a future PR we can add new set of tests that make
more sense to test this functionality.

Fixes #69533

Release justification: Category 3 bug fix
Release note: None

Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Marylia Gutierrez <[email protected]>
@yuzefovich
Copy link
Member

Removing the release-blocker label since this is not a regression AFAICT.

@yuzefovich yuzefovich removed the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Aug 30, 2021
@yuzefovich yuzefovich removed their assignment Aug 30, 2021
@yuzefovich
Copy link
Member

I think I was too quick to say that #69479 - I'm now seeing the same failures on schemachange roachtest w/ (with different versions) and w/o that PR, so putting back into the triage box.

@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

5 participants