MonitorConnectionContext queue cause memory leak #412

Closed
YoungHu opened this issue Jun 3, 2023 · 23 comments
Labels: bug (Something isn't working)

Comments

@YoungHu

YoungHu commented Jun 3, 2023

Describe the bug

In versions 1.1.5 and later, the Monitor class only adds to Queue<MonitorConnectionContext> (in the startMonitoring and run methods); the stopMonitoring method never removes anything, which will cause a memory leak.
Version 1.1.4 has the remove call, but it is missing in 1.1.5. Maybe the code was lost in that version.
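
For illustration, the reported pattern can be modelled with a minimal sketch. This is a simplified, hypothetical model and not the driver's actual Monitor class: the class name, the `Context` stand-in, and both `stopMonitoring` variants are assumptions made for the example, and it deliberately ignores any draining that may happen elsewhere in the real code.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical, simplified model of the reported pattern; NOT the driver's Monitor class.
final class QueueLeakSketch {
    // Stand-in for MonitorConnectionContext.
    static final class Context {
        volatile boolean active = true;
    }

    private final Queue<Context> contexts = new ConcurrentLinkedQueue<>();

    Context startMonitoring() {
        Context c = new Context();
        contexts.add(c); // every monitored call enqueues a context
        return c;
    }

    // Variant described in the report: the context is deactivated but stays queued.
    void stopMonitoringWithoutRemove(Context c) {
        c.active = false;
    }

    // Variant the report expects: the context is also dropped from the queue.
    void stopMonitoringWithRemove(Context c) {
        c.active = false;
        contexts.remove(c);
    }

    int queued() {
        return contexts.size();
    }

    public static void main(String[] args) {
        QueueLeakSketch leaky = new QueueLeakSketch();
        QueueLeakSketch fixed = new QueueLeakSketch();
        for (int i = 0; i < 1000; i++) {
            leaky.stopMonitoringWithoutRemove(leaky.startMonitoring());
            fixed.stopMonitoringWithRemove(fixed.startMonitoring());
        }
        System.out.println("without remove: " + leaky.queued());
        System.out.println("with remove: " + fixed.queued());
    }
}
```

If nothing else consumes the queue, the first variant retains one context per call, while the second keeps the queue empty.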

Expected Behavior

Unused MonitorConnectionContext instances should be removed from the queue.

Current Behavior

No remove action is performed. A heap dump shows a large number of MonitorConnectionContext instances.
image

Reproduction Steps

Integrate the AWS driver, run the application, and monitor the number of MonitorConnectionContext instances.

Possible Solution

No response

Additional Information/Context

No response

The AWS JDBC Driver for MySQL version used

1.1.7

JDK version used

JDK8

Operating System and version

Linux

@crystall-bitquill
Contributor

Hi @YoungHu,

Thanks for reaching out and raising this issue.

We'll take a look at this and keep you updated as we investigate.

Thank you for your patience!

@aaronchung-bitquill
Contributor

Hi @YoungHu

I saw your pull request and wanted to thank you for contributing!

Before we get it approved and merged, I'd just like to do some more testing on our side.

Thank you!

@aaronchung-bitquill
Contributor

Hi @YoungHu

I took a closer look at the Monitor class and the uses of Queue<MonitorConnectionContext>. While the stopMonitoring() method does not call remove() to remove the context from the queues, the contexts are removed from the queues in the run() method by calling poll() on the queues.
https://github.com/awslabs/aws-mysql-jdbc/blob/main/src/main/user-impl/java/com/mysql/cj/jdbc/ha/plugins/Monitor.java#L150
https://github.com/awslabs/aws-mysql-jdbc/blob/main/src/main/user-impl/java/com/mysql/cj/jdbc/ha/plugins/Monitor.java#L184
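
As a rough sketch of how a consumer loop can drain such a queue with poll() even though stopMonitoring() never calls remove(): the class below is a hypothetical simplification, not the code at the links above. The names `PollDrainSketch`, `Context`, and `runOnePass` are assumptions made for the example.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical simplification of a poll()-based drain; not the driver's actual run() method.
final class PollDrainSketch {
    // Stand-in for MonitorConnectionContext.
    static final class Context {
        volatile boolean active = true;
    }

    final Queue<Context> contexts = new ConcurrentLinkedQueue<>();

    // One pass of the loop: poll each context that was queued at the start of the pass,
    // re-queue the ones that are still active, and let inactive ones drop out.
    int runOnePass() {
        int dropped = 0;
        int snapshot = contexts.size();
        for (int i = 0; i < snapshot; i++) {
            Context c = contexts.poll();
            if (c == null) {
                break; // queue emptied by someone else
            }
            if (c.active) {
                contexts.add(c);
            } else {
                dropped++; // no longer referenced by the queue, so eligible for GC
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        PollDrainSketch s = new PollDrainSketch();
        Context a = new Context();
        Context b = new Context();
        s.contexts.add(a);
        s.contexts.add(b);
        a.active = false; // what a stopMonitoring() without remove() would do
        System.out.println("dropped: " + s.runOnePass() + ", still queued: " + s.contexts.size());
    }
}
```

The drain only works for as long as the loop keeps running, which is why the liveness of the monitoring thread matters.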

I also tried to reproduce a memory leak, but so far I have been unable to.
I made a simple Spring Boot application using aws-mysql-jdbc:1.1.7 and profiled it using JProfiler.
Below is a graph of the number of instances of the MonitorConnectionContext class over time.
Screenshot 2023-06-08 at 1 39 00 PM
In the graph, there is a continuous pattern of the number of instances rising, then falling, then rising again. Although it is not captured in the screenshot, I did not observe any trend of the peaks getting higher over time.

I was hoping you could provide the following details:

  • Is your application using any framework, libraries, and/or connection pools? What are they?
  • What type of queries is your application making? Are they long running? Transactional?
  • Is your application using any non-default garbage collection settings/options?
  • Anything else you think may be useful

Thank you for your patience!

@YoungHu
Author

YoungHu commented Jun 13, 2023

Hi @aaronchung-bitquill
We have 8 applications using version 1.1.7, and only one of them has this problem.
The details are as follows:

  • Spring Boot 2.x, Druid 1.2.9, MyBatis 3.5.4.
  • Just simple SQL selects by key.
  • GC settings: -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+DisableExplicitGC -XX:+CMSParallelRemarkEnabled

Normally, the run method calls poll() to remove items from the queue. However, while debugging the code I found that the run method sometimes breaks out of its loop, and I have no idea what causes it. After it breaks out, items are only added to the queue and never removed.

@aaronchung-bitquill
Contributor

Hi @YoungHu

A few additional questions about the application that is experiencing this issue:

  • Can you give me some details about the database? Is it MySQL or Aurora MySQL? What versions?
  • How is the DB URL specified?
  • What plugins are being used?
  • What database and connection related configurations do you have?
  • Are there any other details about this particular application that differ from the others that are working without issue?

Also, would you be able to enable driver logging and provide some driver logs?

Thank you

@YoungHu
Author

YoungHu commented Jun 19, 2023

We are using Aurora MySQL 5.7, and the DB URL is the cluster endpoint.

@aaronchung-bitquill
Contributor

Hi @YoungHu

I suspect that it might be InterruptedException that is breaking the while loop in the Monitor#run method. But unfortunately, I am still unable to reproduce the issue.
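
To make the suspected failure mode concrete, here is a minimal sketch of a consumer loop that exits on InterruptedException while producers keep enqueuing. It is a hypothetical model, not the driver's Monitor#run: the class, field, and method names are assumptions made for the example, and it has not been confirmed that this is what happens in the affected application.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical model of a drain loop that dies on interrupt; not the driver's Monitor#run.
final class InterruptedLoopSketch {
    final Queue<Object> contexts = new ConcurrentLinkedQueue<>();

    // Consumer: drain the queue, sleep, repeat. An interrupt ends the loop for good.
    final Thread monitor = new Thread(() -> {
        try {
            while (true) {
                while (contexts.poll() != null) {
                    // a real monitor would check host health here
                }
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            // The loop exits here; nothing drains the queue after this point.
            Thread.currentThread().interrupt();
        }
    });

    // Producer side: unaware that the consumer may be gone.
    void startMonitoring() {
        contexts.add(new Object());
    }

    // Interrupts the consumer, waits for it to exit, then enqueues n contexts
    // and returns how many are left sitting in the queue.
    static int queuedAfterInterrupt(int n) {
        InterruptedLoopSketch s = new InterruptedLoopSketch();
        s.monitor.setDaemon(true);
        s.monitor.start();
        s.monitor.interrupt();
        try {
            s.monitor.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        for (int i = 0; i < n; i++) {
            s.startMonitoring();
        }
        return s.contexts.size();
    }

    public static void main(String[] args) {
        System.out.println("queued after consumer exit: " + queuedAfterInterrupt(1000));
    }
}
```

Once the consumer thread has exited, every later startMonitoring() call adds an entry that is never polled, which would match the "add but never remove" behaviour described earlier in the thread.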

If possible would you be able to provide driver logs of when the issue occurs? That would greatly help us be able to confirm the source of the issue. Otherwise, would you be willing to share a sample app that reproduces the issue?

Thank you

@YoungHu
Author

YoungHu commented Jun 21, 2023

How do I set the logger? Is it like this: jdbc:mysql:aws://db-cluster-id:3306?key1=value&key2=value2&logger=com.mysql.cj.log.StandardLogger?

@aaronchung-bitquill
Contributor

Hi @YoungHu

Apologies for the late reply.
Yes, that is correct.
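
A small sketch of assembling such a URL in code, assuming the `logger` parameter and the `jdbc:mysql:aws://` prefix quoted above. The host, database name, and helper class are hypothetical placeholders, and the sketch only builds the string; it does not open a connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that assembles a connection URL with driver logging enabled.
final class LoggerUrlSketch {
    // Joins parameters as key=value pairs; values are assumed not to need URL encoding.
    static String buildUrl(String host, int port, String database, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        params.forEach((k, v) -> query.add(k + "=" + v));
        return "jdbc:mysql:aws://" + host + ":" + port + "/" + database + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("logger", "com.mysql.cj.log.StandardLogger");
        // "db-cluster-id" and "mydb" are placeholders, not real endpoints.
        System.out.println(buildUrl("db-cluster-id", 3306, "mydb", params));
    }
}
```

The resulting string would then be passed to `DriverManager.getConnection` together with the usual credentials.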

@aaronchung-bitquill
Contributor

Hi @YoungHu

I wanted to check in to see how things are going. Were you able to obtain driver logs?

@aaronchung-bitquill
Contributor

Hi @YoungHu

We have not heard back from you for some time and wanted to check back in.

If there are no further updates on this ticket in the next 3 days, we will close this ticket. However, if you require further assistance, you can reopen this issue, or create a new one.

Thank you!

@aaronchung-bitquill
Contributor

Hi @YoungHu

We will be closing this ticket now due to lack of updates. However, if you require further assistance, you can reopen this issue, or create a new one.

Thank you!

@ftasso

ftasso commented Nov 22, 2023

Hello,
I have the same problem after upgrading from 1.1.0 to 1.1.11.
Please take a look at the extent of the memory leak:
Memoryleak

Thank you,
Fabrizio

@crystall-bitquill
Contributor

Hi @ftasso,

Thank you for letting us know you've experienced this. Your screenshot indicates there are many objects in memory, but that does not necessarily mean those objects were leaked. Leaked objects usually show up as a long upward trend in consumed memory. Would you be able to verify whether this was the case?

Thank you!

@ftasso

ftasso commented Nov 24, 2023

Hello,
I ran some tests on this issue and can confirm that the instances of MonitorConnectionContext are not released.
I wrote a simple thread that triggers a full GC every 5 minutes, but the allocated objects still remain allocated.
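
A check of this kind might look like the sketch below. It is an assumed reconstruction, not the code that was actually used: the class and method names are hypothetical. Note that `System.gc()` is only a request to the JVM and becomes a no-op when `-XX:+DisableExplicitGC` is set.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical periodic GC probe; an assumed reconstruction of the test described above.
final class PeriodicGcSketch {
    // Requests a GC and returns the heap bytes still in use afterwards.
    static long usedHeapAfterGc() {
        System.gc(); // only a request; ignored when -XX:+DisableExplicitGC is set
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Logs post-GC heap usage at a fixed interval on a daemon thread.
    static ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "gc-probe");
            t.setDaemon(true);
            return t;
        });
        ses.scheduleAtFixedRate(
            () -> System.out.println("used heap after GC: " + usedHeapAfterGc() + " bytes"),
            0, period, unit);
        return ses;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService ses = start(5, TimeUnit.MINUTES);
        Thread.sleep(1000); // let the first probe run, then stop
        ses.shutdownNow();
    }
}
```

If the post-GC figure keeps climbing across many probes, the retained objects are still strongly reachable, which is consistent with a leak rather than with garbage that simply has not been collected yet.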
Details about my environment:
AWS

Thank you,
Fabrizio

@crystall-bitquill
Contributor

Hi @ftasso,

Thanks for confirming that. We'll take another look at this and keep you updated as we investigate.

Thank you for your patience!

@KerouacEZ

I have also encountered this situation with aws-mysql-jdbc 1.1.8.
Snipaste_2023-11-28_15-14-58

@amitkumariit

amitkumariit commented Nov 30, 2023

Hi,
We also observed the issue in all versions after 1.4. Please have a look at the attached screenshot. Our observation was that a simple insert alone does not reproduce the issue in our load environment, but when there are both selects and inserts we are able to reproduce it.
Screenshot 2023-11-30 at 11 47 17

@KerouacEZ

So, are you currently addressing this issue? Are there any good temporary workarounds I can use in the meantime?

@crystall-bitquill
Contributor

Hi @KerouacEZ,

We have received similar reports on this issue for the AWS JDBC Driver. While we are still working on the root cause of the issue for both drivers, we have released a new version (v2.3.1) of the AWS JDBC Driver that alleviates the issues you are seeing.

Once we have a fix for the issue, we will make it available for both the AWS JDBC Driver and the AWS JDBC Driver for MySQL. In the meantime, we suggest that you try out the AWS JDBC Driver with the MySQL Connector/J as the underlying driver. There is a migration guide on moving from the AWS JDBC Driver for MySQL to the AWS JDBC Driver available here.
Please let us know if you run into any integration issues.

@crystall-bitquill
Contributor

Hi all,

We've recently added an experimental Host Monitoring Plugin (PR #764) to the AWS JDBC Driver, called the Host Monitoring Plugin v2. More information can be found here. The Host Monitoring Plugin v2 is functionally equivalent to the Host Monitoring Plugin and they are configured using the same parameters. The Host Monitoring Plugin v2 was created to address the issues that you all have been experiencing, and it is meant to be a more stable version of the Host Monitoring Plugin.

As it is currently an experimental plugin, please note that it should be tested before being used in a production environment. This plugin is not currently available on the AWS JDBC Driver for MySQL, but it can be tried out by using the snapshot build of the AWS JDBC Driver: aws-advanced-jdbc-wrapper-2.3.2-20231213.213138-10.jar. For more information on using a snapshot build, see this page. If you are able to test out the plugin, please let us know if you encounter any errors or if this plugin was able to resolve the issues you've encountered.
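
As a hedged sketch of what trying the plugin might look like from application code: the `jdbc:aws-wrapper:mysql://` prefix, the `wrapperPlugins` parameter, and the `efm2` plugin code are assumptions based on the AWS JDBC Driver's conventions and should be checked against the driver documentation for the version in use. The host, database, and class names are placeholders. The sketch only prepares the configuration; connecting would additionally require the driver and MySQL Connector/J on the classpath.

```java
import java.util.Properties;

// Hypothetical configuration for trying the Host Monitoring Plugin v2.
// The plugin code "efm2" and the parameter name are assumptions; verify them in the driver docs.
final class WrapperConfigSketch {
    // Placeholder endpoint; replace with the real cluster endpoint and database.
    static final String URL = "jdbc:aws-wrapper:mysql://db-cluster-id.example.com:3306/mydb";

    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Assumed plugin codes: failover plus the experimental host monitoring v2 plugin.
        props.setProperty("wrapperPlugins", "failover,efm2");
        return props;
    }

    public static void main(String[] args) {
        Properties props = connectionProperties("app_user", "change-me");
        System.out.println(URL + " with wrapperPlugins=" + props.getProperty("wrapperPlugins"));
        // A real application would then call DriverManager.getConnection(URL, props).
    }
}
```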

Thank you!

@karenc-bq
Contributor

Hi all, the aforementioned experimental Host Monitoring Plugin v2 is now available on our latest release of the driver. It is currently an experimental plugin so please test it out before using it in a production environment.

We will be closing this issue now. Please don't hesitate to comment or open a new ticket if this issue persists with the experimental plugin or if you encounter any other issues.

@christoph-zero

christoph-zero commented Jan 12, 2024

We've observed the same memory leak issue with aws-mysql-jdbc version 1.1.10. After upgrading to 1.1.12, we no longer seem to be able to reproduce the memory leak, even when the default settings are used or explicitly defined as:

-> It seems to be resolved in 1.1.12 with default settings.
