Cluster state update takes a long time due to lots of aliases #11100

Closed
msimos opened this issue May 11, 2015 · 2 comments
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates

Comments

msimos commented May 11, 2015

Taking a snapshot or creating an index takes a very long time; for example, one snapshot took 18 hours. The cluster has 23 nodes and 655 indices with 400 aliases per index. Each user is an alias (to an index); the application tracks the current index (in ZooKeeper) and, once the current index has 400 aliases, it creates a new index. When looking at the pending tasks while the snapshot is running we see a lot of long-running tasks like:

{
"insert_order" : 30872,
"priority" : "NORMAL",
"source" : "update snapshot state",
"executing" : false,
"time_in_queue_millis" : 62424008,
"time_in_queue" : "17.3h"
}
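
For reference, output like the above can be retrieved with the cluster pending tasks API. A minimal example, assuming a node reachable on localhost:9200:

# List the tasks waiting in the master's cluster state update queue
curl -XGET 'http://localhost:9200/_cluster/pending_tasks?pretty'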

This is an example of the alias that is created:

"63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80" : {
    "filter" : {
        "term" : {
            "user_id" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80"
        }
    },
    "index_routing" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80",
    "search_routing" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80"
},
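
For context, aliases like the one above are added with the indices aliases API. A minimal sketch of such a call; the index name users-000123 is hypothetical, since the real index names are not shown in this issue:

curl -XPOST 'http://localhost:9200/_aliases' -d '{
  "actions" : [
    {
      "add" : {
        "index" : "users-000123",
        "alias" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80",
        "filter" : { "term" : { "user_id" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80" } },
        "index_routing" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80",
        "search_routing" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80"
      }
    }
  ]
}'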

According to Igor, publishing a cluster state update always takes about 14-15 seconds, no matter what type of update it is. Here is one example from an alias update:

[03:23:37,861][DEBUG][cluster.service ] [Mad Dog Rassitano] processing [index-aliases]: execute
[03:23:40,144][DEBUG][cluster.service ] [Mad Dog Rassitano] cluster state updated, version [39957], source [index-aliases]
[03:23:40,144][DEBUG][cluster.service ] [Mad Dog Rassitano] publishing cluster state version 39957
[03:23:55,545][DEBUG][cluster.service ] [Mad Dog Rassitano] set local cluster state to version 39957
[03:23:55,554][DEBUG][cluster.service ] [Mad Dog Rassitano] processing [index-aliases]: done applying updated cluster_state (version: 39957)

Here is another example of a node leaving the cluster:

[07:37:21,822][DEBUG][cluster.service ] [Mad Dog Rassitano] processing [zen-disco-node_failed([Death-Stalker][l045BOE8RWC3Wm6DDZJERA][polaris-prod-135-w-esnode-9t65fuj427hgh][inet[/10.0.0.67:9300]]{master=false}), reason failed to ping, tried [3] times, each with maximum [30s] timeout]: execute
[07:37:58,920][DEBUG][cluster.service ] [Mad Dog Rassitano] cluster state updated, version [40189], source [zen-disco-node_failed([Death-Stalker][l045BOE8RWC3Wm6DDZJERA][polaris-prod-135-w-esnode-9t65fuj427hgh][inet[/10.0.0.67:9300]]{master=false}), reason failed to ping, tried [3] times, each with maximum [30s] timeout]
[07:37:58,921][INFO ][cluster.service ] [Mad Dog Rassitano] removed {[Death-Stalker][l045BOE8RWC3Wm6DDZJERA][polaris-prod-135-w-esnode-9t65fuj427hgh][inet[/10.0.0.67:9300]]{master=false},}, reason: zen-disco-node_failed([Death-Stalker][l045BOE8RWC3Wm6DDZJERA][polaris-prod-135-w-esnode-9t65fuj427hgh][inet[/10.0.0.67:9300]]{master=false}), reason failed to ping, tried [3] times, each with maximum [30s] timeout
[07:37:58,921][DEBUG][cluster.service ] [Mad Dog Rassitano] publishing cluster state version 40189
[07:38:12,901][DEBUG][cluster.service ] [Mad Dog Rassitano] set local cluster state to version 40189
[07:38:12,942][DEBUG][cluster.service ] [Mad Dog Rassitano] processing [zen-disco-node_failed([Death-Stalker][l045BOE8RWC3Wm6DDZJERA][polaris-prod-135-w-esnode-9t65fuj427hgh][inet[/10.0.0.67:9300]]{master=false}), reason failed to ping, tried [3] times, each with maximum [30s] timeout]: done applying updated cluster_state (version: 40189)

The slow cluster state propagation causes problems with index creation and makes snapshots slow.

The hot threads output provided shows a lot of activity reading the cluster state from the network, with the majority of the hot threads busy reading aliases:

threads-0007: 79.9% (399.7ms out of 500ms) cpu usage by thread 'elasticsearch[Dormammu][transport_client_worker][T#2]{New I/O worker #2}'
threads-0007- 4/10 snapshots sharing following 31 elements
threads-0007- org.elasticsearch.common.hppc.ObjectObjectOpenHashMap.get(ObjectObjectOpenHashMap.java:564)
threads-0007- org.elasticsearch.common.collect.ImmutableOpenMap$Builder.get(ImmutableOpenMap.java:255)
threads-0007- org.elasticsearch.cluster.metadata.MetaData.<init>(MetaData.java:222)
threads-0007- org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1245)
threads-0007- org.elasticsearch.cluster.metadata.MetaData$Builder.readFrom(MetaData.java:1382)
threads-0007- org.elasticsearch.cluster.ClusterState$Builder.readFrom(ClusterState.java:620)
threads-0007- org.elasticsearch.action.admin.cluster.state.ClusterStateResponse.readFrom(ClusterStateResponse.java:58)
threads-0007- org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:150)
threads-0007- org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:127)
threads-0007- org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
threads-0007- org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
threads-0007- org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
threads-0007- org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
threads-0007- org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
threads-0007- org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
threads-0007- org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
threads-0007- org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
threads-0007- org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
threads-0007- org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)

threads-0008: 88.7% (443.7ms out of 500ms) cpu usage by thread 'elasticsearch[Tower][clusterService#updateTask][T#1]'
threads-0008- 2/10 snapshots sharing following 11 elements
threads-0008- org.elasticsearch.common.hppc.ObjectObjectOpenHashMap.get(ObjectObjectOpenHashMap.java:564)
threads-0008- org.elasticsearch.common.collect.ImmutableOpenMap$Builder.get(ImmutableOpenMap.java:255)
threads-0008- org.elasticsearch.cluster.metadata.MetaData.<init>(MetaData.java:222)
threads-0008- org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1245)
threads-0008- org.elasticsearch.cluster.ClusterState$Builder.metaData(ClusterState.java:525)
threads-0008- org.elasticsearch.discovery.zen.ZenDiscovery$8.execute(ZenDiscovery.java:690)
threads-0008- org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:328)
threads-0008- org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:153)
threads-0008- java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
threads-0008- java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
threads-0008- java.lang.Thread.run(Unknown Source)
threads-0008- 4/10 snapshots sharing following 12 elements
threads-0008- org.elasticsearch.common.hppc.AbstractIterator.hasNext(AbstractIterator.java:34)
threads-0008- org.elasticsearch.common.hppc.AbstractObjectCollection.toArray(AbstractObjectCollection.java:86)
threads-0008- org.elasticsearch.common.hppc.ObjectArrayList.toArray(ObjectArrayList.java:44)
threads-0008- org.elasticsearch.cluster.metadata.MetaData.<init>(MetaData.java:232)
threads-0008- org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1245)
threads-0008- org.elasticsearch.cluster.ClusterState$Builder.metaData(ClusterState.java:525)
threads-0008- org.elasticsearch.discovery.zen.ZenDiscovery$8.execute(ZenDiscovery.java:690)

threads-0008: 100.0% (500ms out of 500ms) cpu usage by thread 'elasticsearch[Freak][transport_server_worker][T#6]{New I/O worker #15}'
threads-0008- 6/10 snapshots sharing following 33 elements
threads-0008- org.elasticsearch.cluster.metadata.MetaData.<init>(MetaData.java:205)
threads-0008- org.elasticsearch.cluster.metadata.MetaData$Builder.build(MetaData.java:1245)
threads-0008- org.elasticsearch.cluster.metadata.MetaData$Builder.readFrom(MetaData.java:1382)
threads-0008- org.elasticsearch.cluster.ClusterState$Builder.readFrom(ClusterState.java:620)
threads-0008- org.elasticsearch.discovery.zen.publish.PublishClusterStateAction$PublishClusterStateRequestHandler.messageReceived(PublishClusterStateAction.java:175)
threads-0008- org.elasticsearch.discovery.zen.publish.PublishClusterStateAction$PublishClusterStateRequestHandler.messageReceived(PublishClusterStateAction.java:156)
threads-0008- org.elasticsearch.transport.netty.MessageChannelHandler.handleRequest(MessageChannelHandler.java:217)
threads-0008- org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:111)
threads-0008- org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
threads-0008- org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
threads-0008- org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
threads-0008- org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
threads-0008- org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
threads-0008- org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
threads-0008- org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
threads-0008- org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
threads-0008- org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
threads-0008- org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
threads-0008- org.elasticsearch.common.netty.OpenChannelsHandler.handleUpstream(OpenChannelsHandler.java:74)
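
For reference, hot threads output like the above can be collected with the nodes hot threads API (the 500ms interval and 10 snapshots in the output match the defaults):

curl -XGET 'http://localhost:9200/_nodes/hot_threads'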

I realize this issue is related to these two issues:

#10212
#10295

However, I am looking for suggestions on how to improve the performance of cluster state propagation.

@msimos msimos changed the title Snapshot & Index creation takes a long time Cluster state update takes a long time due to lots of aliases May 11, 2015
@clintongormley

Hi @msimos

I've marked this for discussion, but I think the outcome will be: you're using too many aliases. These aliases are consuming a big chunk of memory on every node in your cluster and making cluster state updates very heavy. Even with cluster state diffs, it's still going to be slow. I'd really consider adopting another approach which doesn't involve that many aliases (e.g. just adding the filter and routing values dynamically onto each request via your application).
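
To illustrate that suggestion: instead of one filtered and routed alias per user, the application can attach the same term filter and routing value to each search request directly. A minimal sketch against a 1.x cluster, using the hypothetical index name users-000123 (the real index names are not shown in this issue):

# Per-request routing replaces the alias's search_routing,
# and a filtered query replaces the alias's filter
curl -XGET 'http://localhost:9200/users-000123/_search?routing=63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80' -d '{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : { "term" : { "user_id" : "63e5786c2bea7e51a2d2359de2bc42419bf1c8e96d14c40336e7a0aa594dfb80" } }
    }
  }
}'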


msimos commented May 15, 2015

Hi @clintongormley,

This is what we recommended. So I'll close this now.

Mike

@msimos msimos closed this as completed May 15, 2015
@clintongormley clintongormley added :Data Management/Indices APIs APIs to create and manage indices and templates and removed :Aliases labels Feb 13, 2018