attempting to GC indexes: clearing index 2: command is too large #61206
Hello, I am Blathers. I am here to help you get the issue triaged. Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here. I have CC'd a few people who may be able to assist you:
If we have not gotten back to your issue within a few business days, you can try the following:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Is this using interleaving somehow?
There is a cluster setting that controls the max command size; raising it may unstick you, but it carries modest risk. Definitely set it back down afterwards. Fortunately, this isn't attempting to do anything totally nuts. The other option is to decrease the maximum range size for the database to, say, 32 MiB (or just the default zone if that database is gone now). That's a safer choice, but it will incur more load on the cluster as splitting and merging happens.
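For concreteness, the two mitigations described above might look like the following. This is a hedged sketch: `kv.raft.command.max_size` is the cluster setting in question, `mydb` is a placeholder database name, and the exact setting names and syntax should be verified against your CockroachDB version.

```sql
-- Option 1 (riskier): temporarily raise the max raft command size,
-- then restore the default once the GC job has finished.
SET CLUSTER SETTING kv.raft.command.max_size = '128 MiB';
-- ...after the GC job completes:
RESET CLUSTER SETTING kv.raft.command.max_size;

-- Option 2 (safer, but more split/merge load): shrink the max range
-- size for the affected database to 32 MiB.
ALTER DATABASE mydb CONFIGURE ZONE USING range_max_bytes = 33554432;
```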
Interestingly, it seems like that data still got dumped regardless... so I guess there's nothing about our cluster that needs remediation per se. But I assume you would want to prevent this from happening in the future. Nothing about the dataset was really unusual. And no, no interleaving. This was a really simple dataset with one table, a few columns with one INT primary key.
"Still got dumped" as in GC happened? Also, that error, where are you seeing it? Is it in the logs, or did it return from the TRUNCATE itself?
What I mean is, the number of live bytes dropped dramatically, so seemingly most of the data got cleared, if not all of it. This error is not on the TRUNCATE job; it's on the GC job that followed it. If that failure does leave data behind, will it eventually get cleared by the normal compaction process?
It's bad that that job failed. We've had recent discussions about whether we should ever let that job fail (#55740). I can help you restart that job. In the meantime, can you grab a copy of its record before it gets deleted by the system? That'd be: SELECT id, status, created, encode(payload, 'hex'), encode(progress, 'hex') FROM system.jobs WHERE id = <relevant job id>;
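The record-preserving query above, reformatted for readability. The job ID here is a placeholder; the actual ID of the failed GC job can be found with `SHOW JOBS`.

```sql
-- Preserve the failed GC job's record before the system deletes it.
SELECT id, status, created,
       encode(payload, 'hex'),
       encode(progress, 'hex')
  FROM system.jobs
 WHERE id = 123456789;  -- placeholder: substitute the real job ID from SHOW JOBS
```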
What I don't understand is why the GC job would be sending a large raft command. The clear-range operation it uses should be small.
Are your keys somehow absolutely gigantic?
I think I've got a lead on this one. We'll need to do some manual things to recover the job. Thanks for the bug report!
Actually, we're still pretty confused. Do you have any more intel on the structure of these tables to share?
Yeah, so I can at least say it's an extremely simple table: basically a few INT columns that together are the primary key. No other columns, no indexes. I just sent a debug zip through the support portal and tagged this issue.
If y'all couldn't find anything and don't intend to investigate further (because it's an alpha), it's okay if you want to close this; we're good as far as our cluster goes.
I think we've found the cause of this; submitted a fix in #74674.
Describe the problem
On truncating a table with about 180GB of data, the GC got this error:
attempting to GC indexes: clearing index 2: command is too large: 120227141 bytes (max: 67108864)
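For scale: the limit in the error message is the 64 MiB default max command size, and the rejected raft command was roughly 115 MiB, nearly double the cap. A quick check of the arithmetic:

```python
# Interpret the byte counts from the GC error message above.
limit = 67_108_864        # the "max" reported in the error
attempted = 120_227_141   # the size of the rejected raft command

# The limit is exactly 64 MiB.
print(limit == 64 * 1024 * 1024)

# The attempted command size, in MiB.
print(round(attempted / (1024 * 1024), 1))
```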
This was data we had imported (via IMPORT INTO ... CSV) within the past day.
Note, this is using v21.1.0-alpha3 in order to get this fix, or else our S3 reads time out.
I have a debug zip exported if you want me to upload it in the support portal.
Environment: