Tool to remove corrupted parts of corrupt shards #31389

DaveCTurner · 2018-06-18T10:24:18Z

Today, if we detect shard corruption then we mark the store as corrupt and refuse to open it again. If there are no replicas then you might be able to use Lucene’s CheckIndex to remove the corrupted segments, although this does not remove the corruption marker, requires knowledge of our filesystem layout, and might be tricky to do in a containerised or heavily automated environment. The only way forward via the API is to force the allocation of an empty primary which drops all the data in the shard. We have an index.shard.check_on_startup: fix setting but this is suboptimal for a couple of reasons:

- it’s index-wide and requires closing and verifying the whole index.
- it has no effect on shards that have a corruption marker, because the corruption marker is checked before this option takes effect.

(it also does nothing in versions 6.0 and above, but that's another story)

The Right Way™ to recover a corrupted shard is certainly to fail it and recover another copy from one of its replicas, assuming such a thing exists, but we’ve seen a couple of cases recently where a user was running without replicas, e.g. to do a bulk load of data (which we sorta suggest might be a good idea sometimes) and hit some corruption that they'd have preferred to recover from with a bit of data loss rather than by restarting the load or allocating an empty primary.

I propose removing the fix option of the index.shard.check_on_startup setting and instead adding another dangerous forced allocation command that can attempt to allocate a primary on top of a corrupt store by fixing the store and removing its corruption marker.

/cc @tsouza @ywelsch re. this forum thread

Current points and open questions:

Tool name: elasticsearch-shard with subcommand remove-corrupted-segments
- the main goal is to fix a corrupted index, but the action is destructive, so the name avoids fix or repair; it also avoids truncate, which is far from Lucene terminology
Available options for remove-corrupted-segments:
- --index-name index_name and --shard-id shard_id (mandatory)
  - alternative: -d path_to_index_folder or --dir path_to_index_folder
- --dry-run does a fast check without actually dropping corrupted segments
- no options means exorcise; interactive keyboard confirmation is required
Merge elasticsearch-translog into elasticsearch-shard
- elasticsearch-translog becomes elasticsearch-shard truncate-translog
- elasticsearch-translog has only the -d option to specify a folder; it would be nice to have --index-name index_name and --shard-id shard_id
Exit immediately if there is no corruption marker file
- in both cases
Missing segments are actually an unrecoverable case for CheckIndex
- we leave it as an unrecoverable case, and refer the user to how to allocate an empty shard
- there is room for improvement: LUCENE-6762.

elasticmachine · 2018-06-18T10:24:20Z

Pinging @elastic/es-distributed

bleskes · 2018-06-18T12:24:21Z

+1. That setting is dangerous :(

DaveCTurner · 2018-06-26T15:16:00Z

We (@elastic/es-distributed) discussed this today and decided:

- removing index.shard.check_on_startup: fix is the right thing to do.
- fixing a corrupted shard should not be done online, via the API, but we should have an offline tool, similar to the translog tool, that can fix it without requiring the user to descend into the filesystem by hand.

vladimirdolzhenko · 2018-07-17T09:57:27Z

It has been observed that index.shard.check_on_startup has been broken since 6.0.0: in fact it honours no value other than true/false:

elasticsearch/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

Line 1292 in f04c579

if (Booleans.isTrue(checkIndexOnStartup)) {
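To illustrate the observation, here is a minimal sketch, not the actual Elasticsearch code. It assumes a strict boolean helper (the hypothetical `isTrue` below stands in for `Booleans.isTrue`) that treats only the literal string `true` as true. Under that assumption, the `checksum` and `fix` values never trigger the check.

```java
// Hypothetical stand-in for the startup check guard; all names are illustrative.
final class CheckOnStartupSketch {

    // Assumption: strict parsing, only the literal "true" counts as true.
    static boolean isTrue(String value) {
        return "true".equals(value);
    }

    // Mirrors the shape of the guard quoted above: the check runs only for "true".
    static boolean wouldRunCheck(String checkIndexOnStartup) {
        return isTrue(checkIndexOnStartup);
    }

    public static void main(String[] args) {
        for (String value : new String[] {"true", "false", "checksum", "fix"}) {
            System.out.println(value + " -> check runs: " + wouldRunCheck(value));
        }
    }
}
```

With a guard of this shape, any setting value that is not a plain boolean is silently treated like `false`.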

tsouza · 2018-07-17T10:27:59Z

I don't think this is correct

elasticsearch/server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

Line 1890 in f04c579

private void doCheckIndex() throws IOException {

UPDATE: It seems that method is dead code.

vladimirdolzhenko · 2018-07-18T11:02:16Z

Decided to fix “checksum” but not fix “fix”

vladimirdolzhenko · 2018-07-23T09:12:45Z

description is updated

bleskes · 2018-07-23T12:22:53Z

-d path_to_index_folder

As discussed, one of the upsides of a tool vs. running Lucene directly is the translation between index names and index folders. I think we should allow people to specify an index name and a shard id as parameters.

jpountz · 2018-07-23T13:50:30Z

+1 to remove-corrupted-segments, I like that it is explicit about the data loss and the fact that it works at the segment level.

+1 to pass in an index name and a shard rather than a folder.

elasticsearch-index

Or maybe elasticsearch-shard?

-fast (default)
-slow

I'd probably not expose these options at all, and always run with fast=true and crossCheckTermVectors=false.

-exorcise

Since the name of the command already implies data loss I'm not sure we need this one. Maybe turn it around and make it a dry-run option that only prints what is going to happen when enabled?

vladimirdolzhenko · 2018-07-24T08:45:52Z

@jpountz I like the idea of dry-run being the default. In that case the question is what to call the exorcise option: force-remove or something else?

jpountz · 2018-07-24T09:44:18Z

I'm open to ideas here as long as the fact that this command will cause data loss is clear. My thinking was that since the command is already called remove-corrupted-segments then we don't need additional warnings and could just go with the removal of corrupted segments by default unless --dry-run is passed. But I also understand why someone would like to add a second-level of protection, I am fine either way. I know Lucene uses exorcise but I think we could find a better name / workflow. For instance I think some other tools are interactive whenever changes need to be applied and ask to confirm, maybe this is something we can get inspiration from. In any case I'm open to how we want to handle that part. The main things that I care about are having a name that makes the data loss obvious (remove-corrupted-segments sounds great) and having as few options as possible (ie. skip options that don't help much in the context of Elasticsearch like cross-checking term vectors).

bleskes · 2018-08-09T10:00:21Z

We've had a good discussion around this tool and have concluded the following:

- We should have one tool for dealing with corruptions, both in the translog and in the Lucene index
- The tool will refuse to run if there are no existing corruption markers (i.e., it will only work on known corrupted shards)
- The tool will first run a dry run and show an analysis of what it's going to do to the user, get confirmation and then perform required operations.
- The tool should fail when check index fails to drop corrupted segments in Lucene. In the future we can offer users to only recover the translog, if needed. We don't feel we need the complexity right now.
- We should document the implication of the tool to join relationships as it may be unexpected to users.
- The tool should generate a new history uuid to prevent ops based recoveries and CCR.
- The tool should generate a new allocation id and tell the user what command they need to run in order for the cluster to use this shard (allocate stale primary).
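The conclusions above describe a fixed order of steps, which can be sketched roughly as follows. This is only an illustration of the agreed workflow, not the tool's implementation: `ShardCopy`, `FakeShard`, `Outcome` and every method name are hypothetical, and the real tool works on Lucene and translog files rather than on an interface like this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative only: none of these types exist in Elasticsearch under these names.
final class ShardRepairSketch {

    enum Outcome { NO_CORRUPTION_MARKER, UNRECOVERABLE, ABORTED_BY_USER, REPAIRED }

    // Hypothetical view of one shard copy on disk.
    interface ShardCopy {
        boolean hasCorruptionMarker();
        List<String> analyse() throws Exception; // dry run: segments that would be dropped
        void dropSegments(List<String> segments);
        void removeCorruptionMarker();
        void assignNewHistoryUuid();
        String assignNewAllocationId();
    }

    static Outcome run(ShardCopy shard, BooleanSupplier userConfirms) {
        // 1. Only operate on shards that are already known to be corrupt.
        if (!shard.hasCorruptionMarker()) {
            return Outcome.NO_CORRUPTION_MARKER;
        }
        // 2. Dry run first; fail if the corrupted parts cannot be dropped.
        List<String> toDrop;
        try {
            toDrop = shard.analyse();
        } catch (Exception e) {
            return Outcome.UNRECOVERABLE;
        }
        System.out.println("Segments that would be removed: " + toDrop);
        // 3. Nothing destructive happens without explicit confirmation.
        if (!userConfirms.getAsBoolean()) {
            return Outcome.ABORTED_BY_USER;
        }
        // 4. Destructive part, then new identifiers so the old history is not reused.
        shard.dropSegments(toDrop);
        shard.removeCorruptionMarker();
        shard.assignNewHistoryUuid();
        String allocationId = shard.assignNewAllocationId();
        System.out.println("New allocation id: " + allocationId
            + " (the user still has to allocate this copy as a stale primary)");
        return Outcome.REPAIRED;
    }

    // Tiny in-memory fake so the sketch can be exercised without a real shard.
    static final class FakeShard implements ShardCopy {
        private boolean marker;
        private final List<String> corruptSegments;

        FakeShard(boolean marker, List<String> corruptSegments) {
            this.marker = marker;
            this.corruptSegments = corruptSegments;
        }
        public boolean hasCorruptionMarker() { return marker; }
        public List<String> analyse() { return corruptSegments; }
        public void dropSegments(List<String> segments) { }
        public void removeCorruptionMarker() { marker = false; }
        public void assignNewHistoryUuid() { }
        public String assignNewAllocationId() { return "new-allocation-id"; }
    }

    public static void main(String[] args) {
        FakeShard shard = new FakeShard(true, Arrays.asList("_3", "_7"));
        System.out.println(run(shard, () -> true));
    }
}
```

Passing the confirmation in as a `BooleanSupplier` is just a convenience for this sketch; it keeps the interactive prompt out of the logic so the ordering of the steps is easy to see.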

We have run out of time and didn't discuss the parameters and tool naming. @vladimirdolzhenko can you post a suggestion here based on the above and we can discuss it further?

vladimirdolzhenko · 2018-08-31T11:52:40Z

- the tool is elasticsearch-shard with a subcommand remove-corrupted-data
- there are actually no corruption markers for a translog (a PR is following), so the tool first analyses the translog: a clean (not corrupted) translog is treated the same as having no corruption marker. For index files, the tool checks for a corruption marker
- the tool has --index-name and --shard-id parameters, or --dir for the case of multiple nodes per data dir / environment
- as the tool performs an analysis before any destructive actions (which have to be confirmed), it was decided to drop the --dry-run option
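The two ways of pointing the tool at a shard could be validated along these lines. This is a hand-rolled sketch and not the tool's real option parser: the class and method names are made up, and treating `--dir` and the `--index-name`/`--shard-id` pair as mutually exclusive is an assumption that goes beyond what is stated above.

```java
import java.util.Map;

// Illustrative only: a hand-rolled check, not the tool's real option parser.
final class ShardTargetArgsSketch {

    // Returns a description of the selected target, or throws if the options
    // do not match one of the two accepted forms.
    static String resolveTarget(Map<String, String> options) {
        boolean hasDir = options.containsKey("--dir");
        boolean hasIndex = options.containsKey("--index-name");
        boolean hasShard = options.containsKey("--shard-id");

        // Form 1: an explicit folder.
        if (hasDir && !hasIndex && !hasShard) {
            return "dir:" + options.get("--dir");
        }
        // Form 2: index name plus shard id, resolved to a folder by the tool.
        if (!hasDir && hasIndex && hasShard) {
            // parseInt throws NumberFormatException, a kind of IllegalArgumentException.
            int shardId = Integer.parseInt(options.get("--shard-id"));
            if (shardId < 0) {
                throw new IllegalArgumentException("--shard-id must not be negative");
            }
            return "shard:[" + options.get("--index-name") + "][" + shardId + "]";
        }
        throw new IllegalArgumentException(
            "specify either --dir, or both --index-name and --shard-id");
    }

    public static void main(String[] args) {
        System.out.println(resolveTarget(Map.of("--index-name", "my_index", "--shard-id", "0")));
    }
}
```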

DaveCTurner · 2019-01-07T13:29:31Z

Closed by #32281.

DaveCTurner added the :Distributed/Store and team-discuss labels Jun 18, 2018

DaveCTurner changed the title ~~Online recovery of corrupt shards~~ Tool for recovery of corrupt shards Jun 26, 2018

DaveCTurner added >enhancement help wanted adoptme and removed team-discuss labels Jul 3, 2018

vladimirdolzhenko added the team-discuss label Jul 17, 2018

vladimirdolzhenko self-assigned this Jul 17, 2018

vladimirdolzhenko added v7.0.0 v6.5.0 and removed team-discuss labels Jul 17, 2018

vladimirdolzhenko removed the v6.5.0 label Jul 18, 2018

vladimirdolzhenko pushed a commit to vladimirdolzhenko/elasticsearch that referenced this issue Jul 21, 2018

drop index.shard.check_on_startup: fix

ac74e5f

Closes elastic#31389

vladimirdolzhenko pushed a commit to vladimirdolzhenko/elasticsearch that referenced this issue Jul 23, 2018

drop index.shard.check_on_startup: fix

5828119

Relates elastic#31389

vladimirdolzhenko mentioned this issue Jul 23, 2018

drop index.shard.check_on_startup: fix #32279

Merged

vladimirdolzhenko mentioned this issue Jul 23, 2018

add RemoveCorruptedShardDataCommand #32281

Merged

vladimirdolzhenko pushed a commit to vladimirdolzhenko/elasticsearch that referenced this issue Jul 27, 2018

drop index.shard.check_on_startup: fix

843f977

Relates elastic#31389

vladimirdolzhenko mentioned this issue Aug 23, 2018

Docs: We should have examples of allocate_empty_primary and allocate_stale_primary #33069

Closed

DaveCTurner changed the title ~~Tool for recovery of corrupt shards~~ Tool to remove corrupted parts of corrupt shards Aug 31, 2018

vladimirdolzhenko added a commit that referenced this issue Aug 31, 2018

drop index.shard.check_on_startup: fix (#32279)

3d82a30

drop `index.shard.check_on_startup: fix` Relates #31389

vladimirdolzhenko added a commit that referenced this issue Aug 31, 2018

drop index.shard.check_on_startup: fix (#32279)

6de8c6b

Relates #31389 (cherry picked from commit 3d82a30)

This was referenced Sep 4, 2018

drop elasticsearch-translog for 7.0 #33373

Merged

Translog corruption marker #33415

Closed

vladimirdolzhenko added a commit that referenced this issue Sep 19, 2018

add elasticsearch-shard tool (#32281)

a3e8b83

Relates #31389

vladimirdolzhenko mentioned this issue Sep 19, 2018

add elasticsearch-shard tool to 6.x #33848

Merged

vladimirdolzhenko added a commit to vladimirdolzhenko/elasticsearch that referenced this issue Sep 19, 2018

add elasticsearch-shard tool (elastic#32281)

f2570f4

Relates elastic#31389 (cherry picked from commit a3e8b83)

vladimirdolzhenko added a commit that referenced this issue Sep 22, 2018

add elasticsearch-shard tool (#33848)

43a30c5

Relates #31389 (cherry picked from commit a3e8b83)

vladimirdolzhenko added the v6.5.0 label Oct 1, 2018

vladimirdolzhenko added a commit that referenced this issue Oct 1, 2018

drop elasticsearch-translog for 7.0 (#33373)

2e2ae19

#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0 Relates to #31389

kcm pushed a commit that referenced this issue Oct 30, 2018

drop elasticsearch-translog for 7.0 (#33373)

aeba58a

#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0 Relates to #31389

DaveCTurner closed this as completed Jan 7, 2019

DaveCTurner unassigned vladimirdolzhenko Jan 7, 2019

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
