Implement the CustomEndpointPlugin #1122

Merged
aaron-congo merged 43 commits into main from custom-endpoints
Oct 18, 2024

Conversation

aaron-congo
Contributor

Summary

Implement the CustomEndpointPlugin

Description

  • When reviewing, please leave notes on the TODO items if you have input.
  • When a custom endpoint is used, a monitor thread is created to monitor that endpoint's information.
  • Custom endpoint information is fetched using the RDS API, so users of the CustomEndpointPlugin need to have the AWS Java SDK on their classpath.
  • Connections register their plugin services with a monitor if they are connecting to that monitor's custom endpoint.
  • When the monitor detects changes in custom endpoint information, it updates that information for all registered plugin services.
  • Added new methods in PluginService to get/set custom endpoint information and to fetch the current list of hosts minus the hosts that the custom endpoint plugin does not allow. The failover and read/write splitting plugins use this subset of hosts when deciding which host to switch to.
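The host filtering described in the last bullet can be sketched roughly as follows. This is an illustration only: `AllowedHostsSketch`, `allowedHosts`, and the instance names are hypothetical and are not the driver's actual PluginService API. A `null` member set models the case where the custom endpoint info has not been discovered yet, so nothing is filtered out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not the driver's API.
final class AllowedHostsSketch {

  /**
   * Returns the hosts that failover or read/write splitting may choose from.
   * A null member set means the custom endpoint info is not known yet,
   * so no host is disallowed.
   */
  static List<String> allowedHosts(List<String> topology, Set<String> endpointMembers) {
    if (endpointMembers == null) {
      return new ArrayList<>(topology);
    }
    List<String> allowed = new ArrayList<>();
    for (String host : topology) {
      if (endpointMembers.contains(host)) {
        allowed.add(host);
      }
    }
    return allowed;
  }

  public static void main(String[] args) {
    List<String> topology = Arrays.asList("instance-1", "instance-2", "instance-3");
    Set<String> members = new HashSet<>(Arrays.asList("instance-1", "instance-3"));
    System.out.println(allowedHosts(topology, members)); // [instance-1, instance-3]
    System.out.println(allowedHosts(topology, null));    // [instance-1, instance-2, instance-3]
  }
}
```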

Additional Reviewers

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

aaron-congo added the wip label (pull request that is a work in progress) on Sep 18, 2024

github-actions bot commented Sep 18, 2024

Qodana Community for JVM

It seems all right 👌

No new problems were found according to the checks applied

💡 Qodana analysis was run in the pull request mode: only the changed files were checked

}

// The custom endpoint info has changed, so we need to update the info in the registered plugin services.
customEndpointInfoCache.put(this.endpointIdentifier, endpointInfo, this.cacheEntryExpirationNano);

Contributor

Another thread might add an entry for this key to the cache. Would that cause problems? Could we use computeIfAbsent() instead?

Contributor Author (aaron-congo, Sep 20, 2024)

I don't think any other thread will write to this key. The logic in CustomEndpointPlugin should create only one monitor thread per custom endpoint, shared among all connections, so only one thread updates the info for a given custom endpoint. computeIfAbsent would let us place an initial custom endpoint info object in the cache, but it cannot be used to update the info later because the entry will already exist.
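The difference between the two calls can be shown with a small stand-in. This sketch assumes a plain `ConcurrentHashMap` in place of the driver's expiring cache (whose `put` also takes an expiration argument), and the helper names and string values are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in sketch: a plain ConcurrentHashMap replaces the driver's expiring cache.
final class CacheUpdateSketch {

  // Inserts only when the key is missing; an existing entry is left untouched.
  static String storeIfAbsent(Map<String, String> cache, String key, String info) {
    return cache.computeIfAbsent(key, k -> info);
  }

  // Unconditionally overwrites, which is what a refresh from the monitor needs.
  static void refresh(Map<String, String> cache, String key, String info) {
    cache.put(key, info);
  }

  public static void main(String[] args) {
    Map<String, String> cache = new ConcurrentHashMap<>();
    storeIfAbsent(cache, "my-endpoint", "members=[a, b]");
    storeIfAbsent(cache, "my-endpoint", "members=[a, c]"); // no effect, key exists
    System.out.println(cache.get("my-endpoint")); // members=[a, b]
    refresh(cache, "my-endpoint", "members=[a, c]");
    System.out.println(cache.get("my-endpoint")); // members=[a, c]
  }
}
```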

LOGGER.info(Messages.get("CustomEndpointMonitorImpl.interrupted", new Object[]{ this.customEndpointHostSpec }));
Thread.currentThread().interrupt();
} catch (Exception e) {
LOGGER.log(Level.SEVERE,

Contributor

This unhandled exception effectively stops the monitoring thread. I believe we should report the issue and continue with the main monitoring loop.

Contributor Author (aaron-congo, Oct 10, 2024)

Do we want to continue monitoring even if there is an unexpected problem? I adjusted the code to continue monitoring if an exception occurs (unless it's an interrupted exception), but I'm not sure whether it makes sense to continue.
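A loop with that behavior might look roughly like the sketch below. It is not the PR's actual monitor code: `ResilientMonitorSketch`, `runLoop`, and the bounded iteration count are assumptions made so the example terminates and can be run on its own.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of a monitor loop that survives unexpected exceptions.
final class ResilientMonitorSketch {
  private static final Logger LOGGER =
      Logger.getLogger(ResilientMonitorSketch.class.getName());

  /** Calls refresh up to maxIterations times and returns the number of successful calls. */
  static int runLoop(Callable<Void> refresh, int maxIterations, long sleepMs) {
    int successes = 0;
    for (int i = 0; i < maxIterations; i++) {
      try {
        refresh.call();
        successes++;
      } catch (InterruptedException e) {
        // Interruption is a request to stop: restore the flag and exit the loop.
        Thread.currentThread().interrupt();
        break;
      } catch (Exception e) {
        // Any other failure is reported, and monitoring continues.
        LOGGER.log(Level.WARNING, "Refresh failed; continuing to monitor", e);
      }
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    return successes;
  }

  public static void main(String[] args) {
    int[] calls = {0};
    int ok = runLoop(() -> {
      if (++calls[0] == 2) {
        throw new IllegalStateException("simulated API failure");
      }
      return null;
    }, 3, 10L);
    System.out.println(ok); // 2
  }
}
```

The sleep sits outside the first `try` so that a failing refresh still waits before the next attempt instead of retrying in a tight loop.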

* @return true if the custom endpoint info is available, or false if we timed out while waiting for the info to
* become available.
*/
protected boolean waitForCustomEndpointInfo() {

Contributor

What if a user doesn't want to wait? Can we skip the wait, proceed with the custom endpoint, and let DNS resolve it for us?

Contributor Author (aaron-congo, Oct 10, 2024)

The problem with not waiting is that, if the custom endpoint info hasn't been discovered yet, there will be no disallowed hosts. That means failover and the read/write splitting plugin may connect to hosts outside of the custom endpoint, which the user presumably does not want if they have enabled this plugin. I believe the user will not have to wait long or often: on my machine the API call takes ~260 ms to complete, and the user only needs to wait when the info isn't already in the cache (cache entries last 5 minutes but are extended by active monitors).
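One way to implement a bounded wait like this is with a latch that the monitor releases once the first fetch completes. The sketch below is an assumption about how such a wait could be structured, not the PR's implementation; `EndpointInfoWaitSketch` and its method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the connecting thread blocks until the monitor has info or a timeout elapses.
final class EndpointInfoWaitSketch {
  private final CountDownLatch infoReady = new CountDownLatch(1);

  // Called by the monitor thread after the first successful fetch.
  void markInfoAvailable() {
    infoReady.countDown();
  }

  // Returns true if the info became available, false on timeout or interruption.
  boolean waitForInfo(long timeoutMs) {
    try {
      return infoReady.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    EndpointInfoWaitSketch sketch = new EndpointInfoWaitSketch();
    System.out.println(sketch.waitForInfo(50)); // false, nothing has been fetched yet
    sketch.markInfoAvailable();
    System.out.println(sketch.waitForInfo(50)); // true, returns immediately
  }
}
```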

{
addAll(SubscribedMethodHelper.NETWORK_BOUND_METHODS);
add("connect");
add("forceConnect");

Contributor

I don't think we need forceConnect(). It's usually used by various monitoring threads and failover.

protected void waitForCustomEndpointInfo(CustomEndpointMonitor monitor) throws SQLException {
boolean hasCustomEndpointInfo = monitor.hasCustomEndpointInfo();

if (!hasCustomEndpointInfo) {

Contributor

So we always wait if the cache is empty? Is it really essential to wait for the monitoring thread to fetch the custom endpoint definition? Can we make it configurable (wait/not wait)?

Contributor Author

The problem with not waiting is that, if the custom endpoint info hasn't been discovered yet, there will be no disallowed hosts. That means failover and the read/write splitting plugin may connect to hosts outside of the custom endpoint, which the user presumably does not want if they have enabled this plugin. I believe the user will not have to wait long or often: on my machine the API call takes ~260 ms to complete, and the user only needs to wait when the info isn't already in the cache (cache entries last 5 minutes but are extended by active monitors). If you want, I can make it configurable, but if the wait is disabled the custom endpoint plugin may allow connections to instances outside the endpoint.

Contributor

I'd prefer to delegate this decision to the user, so a new configuration parameter would be great. Also, I see that after calling waitForCustomEndpointInfo() the driver continues with connectFunc to get a connection. That means the custom endpoint info isn't directly required at this point.

Contributor Author

It might be required. There are two problem scenarios I can think of:

  1. The user has enableConnectFailover set to true. They try to connect, the attempt fails, and failover is kicked off. Failover then connects to a host outside of the custom endpoint because we did not wait for the info to be found.
  2. The user connects successfully and then executes setReadOnly or hits failover while executing some other method. In both cases we could connect to an instance outside of the custom endpoint if we did not wait for the info to be found.
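If the wait were made configurable as discussed, reading the setting could look something like the sketch below. The property names `exampleWaitForCustomEndpointInfo` and `exampleWaitTimeoutMs` and the default values are invented for this illustration and are not parameters of the driver; the default keeps the wait enabled so that the two scenarios above stay covered unless the user opts out.

```java
import java.util.Properties;

// Hypothetical sketch: property names and defaults are made up for illustration.
final class WaitConfigSketch {
  static final String WAIT_PROP = "exampleWaitForCustomEndpointInfo";
  static final String TIMEOUT_PROP = "exampleWaitTimeoutMs";

  // Waiting is the default; the user must explicitly set the property to false to skip it.
  static boolean shouldWait(Properties props) {
    return Boolean.parseBoolean(props.getProperty(WAIT_PROP, "true"));
  }

  // Upper bound on how long a connect call may block while waiting for the info.
  static long timeoutMs(Properties props) {
    return Long.parseLong(props.getProperty(TIMEOUT_PROP, "5000"));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(shouldWait(props)); // true
    props.setProperty(WAIT_PROP, "false");
    System.out.println(shouldWait(props)); // false
  }
}
```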

aaron-congo merged commit b367f4e into main on Oct 18, 2024
6 checks passed
aaron-congo deleted the custom-endpoints branch on October 18, 2024 18:43