
Performance Issues due to Thread Locking #333

Closed
zromano opened this issue Jan 6, 2023 · 14 comments
Assignees
Labels
bug Something isn't working

Comments

@zromano

zromano commented Jan 6, 2023

Describe the bug

We were interested in switching our JDBC driver to the AWS MySQL JDBC driver to utilize its fast failover capabilities. However, when we did performance testing with this new driver, we noticed it had a substantial performance impact on our service.

We do not experience this issue when using the mysql-connector-j JDBC driver.

Expected Behavior

I'd expect this to have similar performance to the mysql-connector-j JDBC driver, since it is advertised as "drop-in compatible".

Current Behavior

The JDBC driver was causing poor performance due to thread locking from synchronized code in the following classes:

[screenshot: profiler output showing the classes with heavy lock contention]

Reproduction Steps

We are using a Spring Boot 2.7.5 application on Java 17, backed by an Aurora MySQL DB, and performing a Locust load test against it.

Sadly, I can't post code, but I'd think load testing a simple Spring Boot app that can communicate with Aurora would be sufficient to repro.

Possible Solution

No response

Additional Information/Context

No response

The AWS JDBC Driver for MySQL version used

1.1.2

JDK version used

17.0.5 (corretto)

Operating System and version

amazoncorretto:17 Docker Image

@zromano
Author

zromano commented Jan 6, 2023

I see these questions from the other thread, I can add more details:

What connection string and configuration parameters do you use?

We are using a Hikari Connection Pool.
url: jdbc:mysql:aws://${SERVER}:${PORT}/${SCHEMA}?verifyServerCertificate=true&useSSL=true&requireSSL=true
connectionTimeout: 10000
minimumIdle: 25
maximumPoolSize: 72
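
For reference, those pool settings map onto a Spring Boot + HikariCP configuration roughly like the following sketch (the property names come from the Spring Boot Hikari namespace; the host/port/schema placeholders and the driver class name are assumptions on my part, not copied from our actual config):

```yaml
spring:
  datasource:
    url: jdbc:mysql:aws://${SERVER}:${PORT}/${SCHEMA}?verifyServerCertificate=true&useSSL=true&requireSSL=true
    driver-class-name: software.aws.rds.jdbc.mysql.Driver
    hikari:
      connection-timeout: 10000   # ms to wait for a connection from the pool
      minimum-idle: 25
      maximum-pool-size: 72
```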

What are usual SQL statements that your service executes?

Very simple query along the lines of SELECT * FROM table WHERE column IN (x, y, z). It doesn't matter how many values we provide for the IN clause; this always happens.

Did you have a chance to try MySQL JDBC Connector/J Driver instead of MariaDb driver? Any observations about performance? https://github.com/mysql/mysql-connector-j

Neither had this issue.

Does your application use any connection pool?

See above.

What database access frameworks your application uses?

We use Spring Data JPA, but the same issue happens even if we use plain JDBC queries.
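
For reference, a minimal plain-JDBC sketch of the query shape described above (the table and column names are made up for illustration; the connection code is commented out since it needs a live endpoint and credentials):

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class InQueryExample {
    // Build a parameterized "IN (?, ?, ...)" query for n values;
    // "orders" and "customer_id" are illustrative names only.
    static String buildInQuery(int n) {
        String placeholders = IntStream.range(0, n)
                .mapToObj(i -> "?")
                .collect(Collectors.joining(", "));
        return "SELECT * FROM orders WHERE customer_id IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        String sql = buildInQuery(3);
        System.out.println(sql);
        // Against a live cluster (hypothetical endpoint/credentials) it would run as:
        // try (java.sql.Connection c = java.sql.DriverManager.getConnection(
        //         "jdbc:mysql:aws://HOST:3306/SCHEMA", "user", "pass");
        //      java.sql.PreparedStatement ps = c.prepareStatement(sql)) {
        //     ps.setLong(1, 1L); ps.setLong(2, 2L); ps.setLong(3, 3L);
        //     try (java.sql.ResultSet rs = ps.executeQuery()) { /* read rows */ }
        // }
    }
}
```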

@congoamz
Contributor

congoamz commented Jan 6, 2023

Hi @zromano,

Thank you for reporting this issue. We will look into the problem and will share more info as we investigate. Thank you for your patience!

@congoamz
Contributor

congoamz commented Jan 6, 2023

@zromano I had a couple of follow-up questions:

  1. You mention you are using version 2.18.33 of our driver; however, the latest driver release is 1.1.3. Can you clarify which version of our driver you are experiencing the problem on?
  2. Is the long lock wait time shown in your screenshot reported every time you run your performance test, or only occasionally? If the answer is occasionally, do you know roughly how often it occurs (e.g. every X runs)?

@zromano
Author

zromano commented Jan 6, 2023

Whoops, copied the wrong version.

  1. We are on version 1.1.2. (updated above as well).

  2. We only notice the locking issue when our pod is under significant load. We don't register anything abnormal when our pod is receiving low traffic (or at least nothing that I can see).
    During performance testing, this caused our pod to handle only around 12-15% of the requests per second that it handles with mysql-connector-j. This happens every time we run a performance test with the AWS MySQL connector.

@zromano
Author

zromano commented Jan 11, 2023

Okay, so we did a little more testing: we actually see a 20-30% performance boost when switching from this driver to mysql-connector-j, even when our pods aren't under much load.

It's also worth noting that this 20-30% is on our overall response time, not just on the time we are waiting on JDBC. I wasn't sure how to measure that.

Please let me know if there is any other data we can provide that might be helpful for your testing 👍

@hsuamz hsuamz assigned sergiyvamz and unassigned congoamz Feb 6, 2023
@sergiyvamz
Contributor

Hello @zromano

Thread lock improvements have been merged. Can you please test out our 1.1.4-SNAPSHOT build here and let us know if the issue persists?
#356

Thank you!

@zromano
Author

zromano commented Feb 15, 2023

Hey @sergiyvamz,

I just load tested our application with the following two JDBC drivers:

  1. implementation(files("aws-mysql-jdbc-1.1.5-20230214.000541-4.jar"))
  2. implementation "com.mysql:mysql-connector-j:8.0.32"

Summary:
The standard MySQL connector had about 4x more throughput than the AWS one.

With the configuration I tested, I was getting ~450 requests per second over a 20-minute test period with the standard MySQL driver.

When I tested the SNAPSHOT version, it peaked at around 150 RPS, then exhausted our connection pool and started throwing errors. The profile of the application still shows a large number of locks from the AuroraTopologyService.

[profiler screenshots showing lock contention in AuroraTopologyService]

@sergiyvamz
Contributor

Hello @zromano

Thank you for the prompt check of the new snapshot build. It's sad to hear that the problem is still there. I'm afraid I need to ask you to provide a sample app that reproduces the issue. I'd also ask you to describe the tools and strategy you use to measure locks and request throughput.

Thank you!

@zromano
Author

zromano commented Feb 15, 2023

Unfortunately I can't provide a sample app. Our team has decided not to invest more time into this topic. 😞

However, I can provide as much info as possible.

For our tech stack:

  • Hosting app in a Kubernetes cluster, but only using a single pod for this performance testing
  • Obviously using an AWS Aurora MySQL cluster for the DB 🙂
  • Using DataDog agent to measure performance and gather analytics
  • Using Spring Data JPA and Hikari connection pool
  • Using Locust to run a load test against a simple endpoint that just fetches and returns data from the DB

I am willing to set up a sample Spring app that I assume will repro this, but unfortunately I don't have the resources to test it. I don't want to personally pay for the AWS resources.

Please let me know.

@sergiyvamz
Contributor

Hello @zromano

Thank you for providing details about your app. We understand that there may be challenges in providing a sample app, including the time and resources necessary to test it. Our team would appreciate getting a sample app even without proper testing. It's important to us to investigate the issue and find the root cause of the dramatic (4x) performance degradation you reported.

Thank you

@zromano
Author

zromano commented Mar 11, 2023

I did my best to create a sample application for you. It can be seen at: https://github.com/zromano/AWS-JDBC-Performance

I tested that the app works locally, but I didn't hook it up to a real AWS Aurora MySQL DB, and it doesn't offer any help in terms of deploying the app on AWS.

Hope this helps, please let me know if there is anything else I can do to help

@karenc-bq
Contributor

Hi @zromano, thank you for the sample application! We will take a look and keep you posted with our progress.
Thanks again!

@karenc-bq karenc-bq self-assigned this Mar 13, 2023
@karenc-bq
Contributor

Hi @zromano,

I ran the sample application you provided with different versions of the AWS MySQL JDBC Driver:

  1. v1.1.4
  2. v1.1.5-20230214.000541-4 (Snapshot)
  3. latest main (77a1cdf)

In summary, we were able to reproduce the performance issues you raised with version 1.1.4 and the snapshot build. However, these issues are addressed in the latest main.

More details below.


v1.1.4

We saw a large amount of time spent waiting for locks in the ExpiringCache class, specifically in the synchronized get method.

To resolve this issue we introduced the CacheMap class, which uses concurrent hashmaps instead of locks.
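
A rough sketch of the difference (the class shapes below are illustrative, not the driver's actual code): a synchronized getter serializes every reader on a single monitor, while a ConcurrentHashMap allows lock-free reads:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only -- not the driver's actual classes.
class SynchronizedCacheSketch<K, V> {
    private final Map<K, V> map = new HashMap<>();
    // Every get() from every thread contends on the same monitor,
    // which shows up as long lock wait times under load.
    public synchronized V get(K key) { return map.get(key); }
    public synchronized void put(K key, V value) { map.put(key, value); }
}

class ConcurrentCacheSketch<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    // Reads are lock-free; writers only contend within a hash bin.
    public V get(K key) { return map.get(key); }
    public void put(K key, V value) { map.put(key, value); }
}
```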

snapshot build

However, as you mentioned, the issue persists: instead of calling get from ExpiringCache, we were just calling a different method at the same place in the code.

[screenshot: profiler output showing the new lock hotspot]

main

To resolve another issue, we decided to only update the topology for mission-critical method calls. This change significantly reduced the number of calls to the CacheMap. While profiling, we also noticed another area of improvement; we will be looking into that.
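
As I understand the change, the idea resembles the following sketch (all names here are hypothetical, not the driver's actual API): refresh the topology only on connection-lifecycle events and serve a cached copy on the hot path:

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of gating topology refreshes; names are made up.
class GatedTopologySketch {
    private final Supplier<List<String>> fetchFromDb; // the expensive refresh
    private volatile List<String> cached = List.of();

    GatedTopologySketch(Supplier<List<String>> fetchFromDb) {
        this.fetchFromDb = fetchFromDb;
    }

    // Called only on "mission critical" events such as connect or failover.
    List<String> refresh() {
        cached = fetchFromDb.get();
        return cached;
    }

    // Called on the hot path (e.g. per statement): no locking, no DB round trip.
    List<String> hosts() {
        return cached;
    }
}
```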

[screenshot: profiler output after the fix]

We will be closing this ticket now. We appreciate the feedback and the sample application, which aided in root-causing this. Please let us know if there is anything else that we can provide support for while your team evaluates the driver.

@karenc-bq karenc-bq removed the Investigating Under investigation label Mar 15, 2023
@zromano
Author

zromano commented Mar 15, 2023

Awesome, glad to hear that and glad I could help!

Thanks for fixing this 😄
