CloseableThreadLocal can severely degrade performance #55772

Closed
Hailei opened this issue Apr 26, 2020 · 1 comment

Comments

@Hailei

Hailei commented Apr 26, 2020

Describe the feature:

Elasticsearch version (bin/elasticsearch --version): 7.5.1

Plugins installed: []

JVM version (java -version): in-built JDK

OS version (uname -a if on a Unix-like system):
Linux 4.14.81.bm.20-amd64 #1 SMP Debian 4.14.81.bm.20 Sat Mar 14 10:14:04 UTC 2020 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
In our benchmark we found that the number of shards has a severe effect on performance.
Environment:
10 machines, each with 96 cores, 1 TB memory and 16 TB NVMe SSD; four ES nodes per machine
Index size: 75GB
Document count: 0.64 billion
The index is time series, split by day; each search covers 8 to 12 or more daily indices

| shard num | pct95 (ms) | qps  |
|-----------|------------|------|
| 1         | 700        | 4000 |
| 10        | 10000      | 800  |

Using jstack, flame graphs and Arthas (a performance tool from Alibaba), we found that many threads, including search, transport_worker and http_server_worker threads, were blocked on CloseableThreadLocal's lock.
image
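
For readers who want to reproduce this kind of diagnosis, the sketch below counts BLOCKED threads per monitor class using the standard `java.lang.management` API. It is only an illustration: the class and method names (`BlockedThreadSummary`, `blockedByLockClass`, `snapshot`) are hypothetical, and it inspects the JVM it runs in, so for a live Elasticsearch node the same information would normally come from a jstack dump as described above.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Elasticsearch or Lucene.
public class BlockedThreadSummary {

    /** Counts BLOCKED threads, grouped by the class of the monitor they are waiting for. */
    public static Map<String, Integer> blockedByLockClass(ThreadInfo[] threads) {
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : threads) {
            // Entries can be null if a thread died while the dump was taken.
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            LockInfo lock = info.getLockInfo();
            String lockClass = (lock == null) ? "<unknown>" : lock.getClassName();
            counts.merge(lockClass, 1, Integer::sum);
        }
        return counts;
    }

    /** Takes a thread dump of the current JVM and summarizes its blocked threads. */
    public static Map<String, Integer> snapshot() {
        ThreadInfo[] dump = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        return blockedByLockClass(dump);
    }

    public static void main(String[] args) {
        snapshot().forEach((lockClass, n) ->
                System.out.println(n + " thread(s) blocked on " + lockClass));
    }
}
```

A large count concentrated on a single lock class across search, transport_worker and http_server_worker threads would match the pattern reported here.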

According to the stacks and the code, the CloseableThreadLocal belongs to ThreadContext, and ThreadContext is globally unique, so all threads are affected. The more shards there are, the more intense the lock contention becomes, and performance degrades severely.
I think CloseableThreadLocal solved the GC problem but introduced a lot of lock contention. Especially under high concurrency it does more harm than good, so this code needs to be redesigned.
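
To make the contention argument concrete, here is a minimal sketch of the pattern described above. It assumes, as the stacks in this issue suggest, that every write to the thread-local also updates a shared map under one lock. `SharedLockThreadLocal` is a hypothetical, simplified class written for illustration; it is not the Lucene or Elasticsearch implementation.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class SharedLockThreadLocalDemo {

    /** Hypothetical thread-local that also hard-references each value in one shared map. */
    static final class SharedLockThreadLocal<T> {
        private final ThreadLocal<T> local = new ThreadLocal<>();
        private final Map<Thread, T> hardRefs = new WeakHashMap<>();
        final AtomicLong lockedSections = new AtomicLong();

        T get() {
            return local.get(); // per-thread read, no shared lock
        }

        void set(T value) {
            local.set(value);
            synchronized (hardRefs) { // one monitor shared by every thread
                hardRefs.put(Thread.currentThread(), value);
                lockedSections.incrementAndGet();
            }
        }
    }

    /** Returns {number of locked sections entered, number of wrong reads}. */
    public static long[] run(int threads, int setsPerThread) {
        SharedLockThreadLocal<Integer> tl = new SharedLockThreadLocal<>();
        AtomicLong mismatches = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < setsPerThread; n++) {
                    tl.set(id);
                    if (tl.get() != id) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { tl.lockedSections.get(), mismatches.get() };
    }

    public static void main(String[] args) {
        long[] result = run(8, 100_000);
        System.out.println("locked sections: " + result[0] + ", mismatches: " + result[1]);
    }
}
```

Each thread still sees only its own value, but every `set` call from every thread is serialized through the same monitor. If one such object is shared by all request-handling threads, adding shards (and therefore per-shard requests) adds callers competing for that single lock.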


@DaveCTurner
Contributor

Duplicates https://discuss.elastic.co/t/swarm-of-shard-search-requests-cause-elasticsearch-transport-worker-blocked-on-closablethreadlocal/229803 and either fixed or changed beyond recognition by #43249, so I'm closing this.
