
update_mapping timeouts cause replica inconsistencies and prevent synced _flush and force_merge commands #30351

Closed
redlus opened this issue May 2, 2018 · 1 comment

Comments

@redlus

redlus commented May 2, 2018

Hi,

During bulk indexing to multiple indices, we encountered multiple update_mapping timeouts (30s), after which some indices showed inconsistencies between primaries and replicas.
The disk size of the affected indices was considerably larger than expected (compared to indices holding similar data from previous dates), and synced flush or force_merge commands could not be completed.

Elasticsearch version:
6.2.3

Plugins installed:
ingest-attachment
ingest-geoip
mapper-murmur3
mapper-size
repository-azure
repository-gcs
repository-s3

JVM version:
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

OS version:
Linux 4.13.0-1011-azure #14-Ubuntu SMP 2018 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
While indexing to multiple live indices and performing maintenance on our cluster (creating new indices and deleting others), some of the indexing bulks hit the 30s timeout on some of their update_mapping requests. Following these timeouts, we began observing inconsistencies between primaries and replicas. The disk size of the affected indices was considerably larger than expected (compared to indices holding similar data from previous dates), and synced flush or force_merge commands could not be completed, even after indexing to these indices was stopped completely.
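For reference, the failing maintenance calls were plain synced-flush and force-merge requests along these lines (a sketch using the example index name from the output below, not the exact commands we ran):

POST /fae12b2c5f73_newlogs_20180502-01/_flush/synced
POST /fae12b2c5f73_newlogs_20180502-01/_forcemerge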

Here is an example response to a _flush/synced request on one of these indices:

{
    "_shards": {
        "total": 76,
        "successful": 72,
        "failed": 4
    },
    "fae12b2c5f73_newlogs_20180502-01": {
        "total": 76,
        "successful": 72,
        "failed": 4,
        "failures": [
            {
                "shard": 18,
                "reason": "out of sync replica; num docs on replica [4212470]; num docs on primary [4212473]",
                "routing": {
                    "state": "STARTED",
                    "primary": false,
                    "node": "QeSPYyW7TfOOgR1SdVR42g",
                    "relocating_node": null,
                    "shard": 18,
                    "index": "fae12b2c5f73_newlogs_20180502-01",
                    "allocation_id": {
                        "id": "PArVLA9uSyqZ4ZxKHu_yvQ"
                    }
                }
            },
            {
                "shard": 29,
                "reason": "out of sync replica; num docs on replica [4211285]; num docs on primary [4211289]",
                "routing": {
                    "state": "STARTED",
                    "primary": false,
                    "node": "M_QP7sSsQQ6jhGnAblOESg",
                    "relocating_node": null,
                    "shard": 29,
                    "index": "fae12b2c5f73_newlogs_20180502-01",
                    "allocation_id": {
                        "id": "M_ws8hLURsmKzEtNq26SxQ"
                    }
                }
            },
            {
                "shard": 0,
                "reason": "out of sync replica; num docs on replica [4210878]; num docs on primary [4210883]",
                "routing": {
                    "state": "STARTED",
                    "primary": false,
                    "node": "g8cGEEK0SAGYC_I-FfZprg",
                    "relocating_node": null,
                    "shard": 0,
                    "index": "fae12b2c5f73_newlogs_20180502-01",
                    "allocation_id": {
                        "id": "pWX9k1pZTuuWBKHOpNCJHw"
                    }
                }
            },
            {
                "shard": 25,
                "reason": "out of sync replica; num docs on replica [4211536]; num docs on primary [4211537]",
                "routing": {
                    "state": "STARTED",
                    "primary": false,
                    "node": "FmnjvN8YROKJ6PO8E3QEjA",
                    "relocating_node": null,
                    "shard": 25,
                    "index": "fae12b2c5f73_newlogs_20180502-01",
                    "allocation_id": {
                        "id": "5wMI1o01SH2kJW9baYJymw"
                    }
                }
            }
        ]
    }
}
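The out-of-sync replicas can also be spotted by comparing per-shard document counts, e.g. with a _cat/shards request like the following (a sketch, not output taken from our cluster):

GET _cat/shards/fae12b2c5f73_newlogs_20180502-01?v&h=index,shard,prirep,state,docs,node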

Reducing the number of replicas of these indices to 0 and then increasing it back to 1 seems to solve the problem.
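For the record, the replica reduction and restoration were done through the index settings API, roughly as follows (a sketch; the index name is the example one from above):

PUT /fae12b2c5f73_newlogs_20180502-01/_settings
{
    "index": {
        "number_of_replicas": 0
    }
}

PUT /fae12b2c5f73_newlogs_20180502-01/_settings
{
    "index": {
        "number_of_replicas": 1
    }
}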
However, some of the indices remained stuck in a strange state for one of their shards (again, when calling _flush/synced):

{
    "_shards": {
        "total": 38,
        "successful": 37,
        "failed": 1
    },
    "eebfc6dc8f07_newlogs_20180502-01": {
        "total": 38,
        "successful": 37,
        "failed": 1,
        "failures": [
            {
                "shard": 2,
                "reason": "[1] ongoing operations on primary"
            }
        ]
    }
}

For these indices, the problem was eventually solved by closing the index and re-opening it. Upon re-opening, some of the primary shards were declared unassigned by Elasticsearch and had to be verified and initialized.
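The close and re-open were done with the standard open/close index API, roughly (a sketch, using the example index name from the output above):

POST /eebfc6dc8f07_newlogs_20180502-01/_close
POST /eebfc6dc8f07_newlogs_20180502-01/_open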

Overall, this is a very long, time- and resource-consuming process: recognizing that indices are larger than expected, identifying a primary-replica inconsistency, reducing and re-increasing the replica count, waiting for the replicas to be fully initialized, closing and re-opening the index, and waiting for Elasticsearch to fully initialize the unassigned primary shards.

I believe the initial inconsistent state could be related to #30244, and perhaps to these discussions as well:
https://discuss.elastic.co/t/flush-threads-stuck-es-6-1-2/117707
https://discuss.elastic.co/t/unable-to-complete-synced-flush-on-idle-index/118449

I'd be happy to provide additional data and information as needed.

Thanks.

@DaveCTurner
Contributor

I think what you are describing is the bug that is fixed by #30244, which will be released in v6.3.0, so I'm going to close this.

However, I'm not completely sure this explains the "ongoing operations on primary" message that required a close-and-reopen to clear. To avoid confusion with #30244, could you please open a thread on https://discuss.elastic.co about that issue, and include the output of

GET _tasks

in your post?
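(For example, GET _tasks?detailed=true lists the running tasks along with a description of each, which may make the output easier to interpret; the plain GET _tasks output is also fine.)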
