Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ByteBuf leak from Netty in ES (client node) 5.3.0 #25099

Closed
asafm opened this issue Jun 7, 2017 · 14 comments
Closed

ByteBuf leak from Netty in ES (client node) 5.3.0 #25099

asafm opened this issue Jun 7, 2017 · 14 comments
Labels
:Distributed/Network Http and internode communication implementations

Comments

@asafm
Copy link

asafm commented Jun 7, 2017

Elasticsearch version: 5.3.0

Plugins installed: []

JVM version (java -version): OpenJDK Runtime Environment (build 1.8.0_111-8u111-b14-3~14.04.1-b14)

OS version (uname -a if on a Unix-like system):
.13.0-48-generic #80-Ubuntu

Description of the problem including expected versus actual behavior:
Elasticsearch Client node shows memory leak logs by Netty:

[2017-06-07T10:58:12,521][ERROR][i.n.u.ResourceLeakDetector] LEAK: ByteBuf.release() was not called before it's garbage-collected. Enable advanced leak reporting to find out where the leak occurred. To enable advanced leak reporting, specify the JVM option '-Dio.netty.leakDetection.level=advanced' or call ResourceLeakDetector.setLevel() See http://netty.io/wiki/reference-counted-objects.html for more information.
@markharwood markharwood added the :Distributed/Network Http and internode communication implementations label Jun 7, 2017
@jasontedor
Copy link
Member

Would you be able to provide a reproduction? The client and the server use the same code paths. I've tested the server previously with leak detection enabled at the paranoid level and none were reported.

@asafm
Copy link
Author

asafm commented Jun 7, 2017

@jasonodonnell It' hard to reproduce - it's from our production cluster. It was doing bulk inserts, which took a lot of time, since the data nodes were quite busy doing merges. Does this help?
The log appeared only twice today, and twice a couple of days ago.

I've experienced this bug my self when I wrote our Log Receiver. The only way I got it was through meticulous read of the code path. This stuff is hard.

@jasontedor
Copy link
Member

It' hard to reproduce - it's from our production cluster. It was doing bulk inserts, which took a lot of time, since the data nodes were quite busy doing merges.

Did anything abnormal happen, for example, a timeout tripped that you know of?

Would you be able to run your client with -Dio.netty.leakDetection.level=advanced which will give us more information about the buffer that leaked?

This stuff is hard.

I agree having chased quite a few these myself, that's why I was hoping for a reproduction. I really appreciate your empathy here. 😇

@asafm
Copy link
Author

asafm commented Jun 11, 2017

We'll add the advanced leak detection and report back in 3-5 days

@asafm
Copy link
Author

asafm commented Jun 11, 2017

Ok, that was way faster than I though. Here is it:

[2017-06-11T12:18:03,759][ERROR][i.n.u.ResourceLeakDetector] LEAK: ByteBuf.release() was not called before it's garbage-collected. See http://netty.io/wiki/reference-counted-objects.html for more information.
WARNING: 1 leak records were discarded because the leak record count is limited to 4. Use system property io.netty.leakDetection.maxRecords to increase the limit.
Recent access records: 4
#4:
	io.netty.buffer.AdvancedLeakAwareByteBuf.readRetainedSlice(AdvancedLeakAwareByteBuf.java:102)
	io.netty.handler.codec.http.HttpObjectDecoder.decode(HttpObjectDecoder.java:305)
	io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
	io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
	io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.lang.Thread.run(Thread.java:745)
#3:
	Hint: 'decoder' will handle the message from this point.
	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:346)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
	io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.lang.Thread.run(Thread.java:745)
#2:
	Hint: 'openChannels' will handle the message from this point.
	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:346)
	io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
	io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.lang.Thread.run(Thread.java:745)
#1:
	Hint: 'DefaultChannelPipeline$HeadContext#0' will handle the message from this point.
	io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
	io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:346)
	io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.lang.Thread.run(Thread.java:745)
Created at:
	io.netty.util.ResourceLeakDetector.track(ResourceLeakDetector.java:219)
	io.netty.buffer.PooledByteBufAllocator.newHeapBuffer(PooledByteBufAllocator.java:277)
	io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:160)
	io.netty.buffer.AbstractByteBufAllocator.heapBuffer(AbstractByteBufAllocator.java:151)
	io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:133)
	io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:73)
	io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117)
	io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
	io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527)
	io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481)
	io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
	io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
	java.lang.Thread.run(Thread.java:745)```

@jasontedor
Copy link
Member

Thanks.

Did anything abnormal happen, for example, a timeout tripped that you know of?

Are you running with any plugins installed?

@asafm
Copy link
Author

asafm commented Jun 13, 2017

I don't think this leak is related to any plugins installed. Seems like the leak is in the http api of client node

@jasontedor
Copy link
Member

@asafm You might very well be right but it would still be good to know what plugins you have installed.

@asafm
Copy link
Author

asafm commented Jun 15, 2017

@jasontedor The plugins are: discovery-ec2, repository-s3 and x-pack.

@jasontedor
Copy link
Member

Thank you, this very much does indeed matter. Can you confirm what features of X-Pack you are using, are you using Security?

@asafm
Copy link
Author

asafm commented Jun 15, 2017 via email

@asafm
Copy link
Author

asafm commented Nov 7, 2017

Hey @jasontedor - any luck fixing it?

@jasontedor
Copy link
Member

No, this does not reproduce and without a reproduction or wider reports I am afraid that no additional effort will be put into this.

@jasontedor
Copy link
Member

I would ask for you to:

  • upgrade to the latest version (as the underlying Netty library has been upgraded which may have addressed this)
  • provide a reliable reproduction on a new issue if this continues to reproduce

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Network Http and internode communication implementations
Projects
None yet
Development

No branches or pull requests

5 participants