Enable buffer pooling settings with a SSL configurable option #5316

Merged
vietj merged 3 commits into eclipse-vertx:4.x on Nov 5, 2024

franz1981
Contributor

@franz1981 franz1981 commented Sep 12, 2024

The JDK SSL engine is very allocation-intensive (see netty/netty#14208): by default Vert.x uses unpooled 16 KB heap buffers for each SSL/TLS interaction, which can significantly increase the memory footprint.

To fix this, I've created a new SSL configuration property (which should become the default in Vert.x 5) that enables pooling of heap buffers when using JDK SSL, while keeping the existing behaviour by default.
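
To make the idea concrete, here is a minimal, JDK-only sketch of what pooling 16 KB heap buffers buys compared with allocating a fresh one per interaction. It is an illustration only: the `Pool` class and the `allocationsFor` helper are hypothetical names, the pool is single-threaded, and this is not how Netty's pooled allocator is implemented.

```java
import java.util.ArrayDeque;

public class HeapBufferPoolSketch {
    // Same buffer size as the one mentioned in the description above.
    static final int BUFFER_SIZE = 16 * 1024;

    /** Hypothetical, single-threaded pool of heap buffers (illustration only). */
    static final class Pool {
        private final ArrayDeque<byte[]> free = new ArrayDeque<>();
        int allocations = 0;

        byte[] acquire() {
            byte[] buf = free.pollFirst();
            if (buf == null) {
                // Nothing to reuse: pay for a fresh 16 KB array.
                allocations++;
                buf = new byte[BUFFER_SIZE];
            }
            return buf;
        }

        void release(byte[] buf) {
            // Hand the buffer back so the next interaction can reuse it.
            free.addFirst(buf);
        }
    }

    /** Counts how many arrays get allocated for the given number of interactions. */
    static int allocationsFor(int interactions, boolean pooled) {
        Pool pool = new Pool();
        for (int i = 0; i < interactions; i++) {
            byte[] buf = pool.acquire();
            buf[0] = (byte) i; // stand-in for wrap/unwrap work
            if (pooled) {
                pool.release(buf);
            }
        }
        return pool.allocations;
    }

    public static void main(String[] args) {
        System.out.println("unpooled allocations: " + allocationsFor(1000, false));
        System.out.println("pooled allocations:   " + allocationsFor(1000, true));
    }
}
```

With sequential interactions the pooled variant allocates a single array, while the unpooled one allocates one array per interaction, which is the garbage the new option is meant to avoid.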

@franz1981
Contributor Author

@cescoffier and @vietj PTAL

I've added some asserts that make existing tests fail if the invariants I'm assuming for SSL on Vert.x do not hold.

@franz1981
Contributor Author

@cescoffier this is key for both HTTPS and the Vert.x SQL client, because the latter relies heavily on it for the reactive SQL drivers (both MySQL and Postgres, AFAIK).

@tsegismont AFAIK the Vert.x SQL driver relies on NetClientImpl, e.g. https://github.com/eclipse-vertx/vertx-sql-client/blob/master/vertx-sql-client/src/main/java/io/vertx/sqlclient/impl/ConnectionFactoryBase.java#L45
hence I would expect the changes to NetClientImpl's allocator to just do "the right thing", wdyt?

@franz1981 franz1981 marked this pull request as draft September 17, 2024 08:50
@franz1981
Contributor Author

After talking with @cescoffier I'm changing this back to use a specific option to enable this feature.

@franz1981 franz1981 force-pushed the 4.x_unified_allocator_fix_ssl branch from 2dc914b to 0fca2ca Compare October 1, 2024 13:24
@franz1981 franz1981 changed the title from "Adding sys prop to enable heap pools with JDK SSL" to "Enable buffer pooling settings with a SSL configurable option" Oct 1, 2024
@franz1981 franz1981 marked this pull request as ready for review October 1, 2024 13:25
@franz1981
Contributor Author

franz1981 commented Oct 1, 2024

PTAL @vietj

I've added some tests (I could improve this further), but the role of https://vertx.io/docs/apidocs/io/vertx/core/http/HttpClientOptions.html#setSsl-boolean- is not clear to me yet.

Indeed, it seems that even when ssl is false, a JDK EngineConfig can still be built, see

sslContextFactorySupplier = Future.succeededFuture(new EngineConfig(SslProvider.JDK, sslOptions, () -> new DefaultSslContextFactory(SslProvider.JDK, false), SSLEngineOptions.DEFAULT_USE_WORKER_POOL));

Can you help me understand what ssl is meant to configure and how to use it?

@franz1981
Contributor Author

@tsegismont
I need you here bud: how do I benefit from this change in the Vert.x SQL client?
How can we consume it for the Vert.x SQL client in Quarkus too?

i.e. here I need users to set https://vertx.io/docs/apidocs/io/vertx/core/net/TCPSSLOptions.html#setSsl-boolean- and https://vertx.io/docs/apidocs/io/vertx/core/net/TCPSSLOptions.html#setSslEngineOptions-io.vertx.core.net.SSLEngineOptions- (passing a JDK SSL option which enables heap buffer pooling), in order to make Vert.x do the "right" thing and prevent quarkusio/quarkus#41880 (comment) from happening.
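
For reference, a minimal sketch of those two settings on a plain `NetClientOptions`. It assumes a vertx-core build that includes this PR; `setPooledHeapBuffers` is the setter used in the Quarkus patch later in this thread, while the wrapping class and method names are made up for illustration.

```java
import io.vertx.core.net.JdkSSLEngineOptions;
import io.vertx.core.net.NetClientOptions;

public class PooledHeapBuffersConfig {
    /** Hypothetical helper: client options with SSL on and pooled heap buffers for the JDK engine. */
    static NetClientOptions pooledJdkSslOptions() {
        return new NetClientOptions()
            // Turn SSL/TLS on for this client.
            .setSsl(true)
            // Use the JDK SSL engine and opt in to heap buffer pooling (off by default).
            .setSslEngineOptions(new JdkSSLEngineOptions().setPooledHeapBuffers(true));
    }
}
```

The same `JdkSSLEngineOptions` instance should be usable with any options class that exposes `setSslEngineOptions`, since both setters are declared on `TCPSSLOptions`.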

@tsegismont
Contributor

I need you here bud: how do I benefit from this change in the Vert.x SQL client?

You can set options with io.vertx.pgclient.PgConnectOptions#setJdkSslEngineOptions

How can we consume it for the Vert.x SQL client in Quarkus too?

In Quarkus, you have to customize pool creation, I think.
See https://quarkus.io/guides/reactive-sql-clients#customizing-pool-creation

@franz1981
Contributor Author

Wdyt @cescoffier? Is it OK enough?

Thanks @tsegismont

@vietj
Member

vietj commented Oct 21, 2024

@cescoffier can you validate this one ?

@franz1981
Contributor Author

@vietj can you paste the additional test you found in our call, bud?

@vietj
Member

vietj commented Oct 21, 2024

I think you want to look at NetTest with NetSocketInternal

@franz1981 franz1981 marked this pull request as draft October 22, 2024 12:11
@franz1981
Contributor Author

Drafting until I fix the latest tests

@franz1981 franz1981 marked this pull request as ready for review October 22, 2024 12:34
@franz1981 franz1981 force-pushed the 4.x_unified_allocator_fix_ssl branch from 53d55ab to a3393ac Compare October 22, 2024 12:35
@franz1981
Contributor Author

Many thanks @vietj and @cescoffier for this hard round of review: you were right, there's something terrible hidden in SSL...

In a3393ac I've added some comments about a shocking outcome, and I think we need to have a sync before moving this forward - which BTW is not enabled by default.

@franz1981
Contributor Author

franz1981 commented Oct 22, 2024

While working on this PR I've noticed two odd things:

  1. The original Vert.x code, which set PartialByteBufAllocator.INSTANCE as the Netty allocator for server + SSL or for client (SSL or not), hides something fishy: PartialByteBufAllocator.INSTANCE overrides the buffer() method to force the allocation of unpooled heap buffers, and that method is used by https://github.com/netty/netty/blob/4.1/codec-http/src/main/java/io/netty/handler/codec/http/HttpObjectEncoder.java#L328 - this means that Netty, while sending the data to the wire, has to create an additional off-heap copy (on top of the on-heap one, which is then thrown away), because all transport layers require direct buffers.
  2. Both the original Vert.x code and this PR still hit a slow path in the JDK SSL engine, which requires the engine to allocate some byte[] on the fly. This code path is shown below:
	at com.sun.crypto.provider.GCTR.update(GCTR.java:212)
	at com.sun.crypto.provider.GCTR.doFinal(GCTR.java:291)
	at com.sun.crypto.provider.GaloisCounterMode$DecryptOp.doFinal(GaloisCounterMode.java:1870)
	at com.sun.crypto.provider.GaloisCounterMode$GCMEngine.doLastBlock(GaloisCounterMode.java:934)
	at com.sun.crypto.provider.GaloisCounterMode$GCMDecrypt.doFinal(GaloisCounterMode.java:1611)
	at com.sun.crypto.provider.GaloisCounterMode.engineDoFinal(GaloisCounterMode.java:458)
	at javax.crypto.Cipher.doFinal(Cipher.java:2543)
	at sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher.decrypt(SSLCipher.java:1908)
	at sun.security.ssl.SSLEngineInputRecord.decodeInputRecord(SSLEngineInputRecord.java:239)
	at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:196)
	at sun.security.ssl.SSLEngineInputRecord.decode(SSLEngineInputRecord.java:159)
	at sun.security.ssl.SSLTransport.decode(SSLTransport.java:111)
	at sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736)
	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506)
	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482)
	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679)
	at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:308)
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1443)
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1336)
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1385)
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530)
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1407)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:918)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:994)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.runWith(Thread.java:1596)
	at java.lang.Thread.run(Thread.java:1583)

It refers to https://github.com/openjdk/jdk21/blob/890adb6410dab4606a4f26a942aed02fb2f55387/src/java.base/share/classes/com/sun/crypto/provider/GCTR.java#L212

which shows that, for GCTR, the input buffer is not a heap one - maybe this is a pretty normal case, but I'm reporting it here for further analysis.
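
The extra copy described in point 1 can be illustrated with plain `java.nio` buffers. This is a hedged sketch, not Netty's actual code path: `toDirect` is a hypothetical helper that mimics what a transport has to do when it is handed a heap buffer but can only write direct ones.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class HeapToDirectCopySketch {
    /**
     * Returns a direct buffer holding the readable bytes of src.
     * A heap-backed src forces a fresh direct allocation plus a copy;
     * a direct src is passed through untouched.
     */
    static ByteBuffer toDirect(ByteBuffer src) {
        if (src.isDirect()) {
            return src;
        }
        ByteBuffer direct = ByteBuffer.allocateDirect(src.remaining());
        // duplicate() keeps src's own position and limit unchanged.
        direct.put(src.duplicate());
        direct.flip();
        return direct;
    }

    public static void main(String[] args) {
        // Stand-in for an encoded HTTP header that was allocated on the heap.
        ByteBuffer heap = ByteBuffer.wrap(
            "HTTP/1.1 200 OK\r\n".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer wire = toDirect(heap);
        System.out.println("heap-backed source: " + heap.hasArray());
        System.out.println("copied to direct:   " + (wire.isDirect() && wire != heap));
    }
}
```

The heap buffer becomes garbage right after the copy, so each write pays for two buffers instead of one.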

Contributor

@cescoffier cescoffier left a comment

Before merging, I would like to ensure it does not change anything for the native compilation.

@zakkak

zakkak commented Oct 30, 2024

Before merging, I would like to ensure it does not change anything for the native compilation.

@cescoffier could you elaborate on what your concerns are for native compilation?

Looking at the changes I don't expect anything drastic. Trying the change with Quarkus' vertx native IT, I am getting

{
  "types": {
    "total": 16096,
    "reflection": 4455,
    "jni": 62,
    "reachable": 14093
  },
  "methods": {
    "foreign_downcalls": -1,
    "total": 125514,
    "reflection": 3555,
    "jni": 55,
    "reachable": 72175
  },
  "classes": {
    "total": 16096,
    "reflection": 4455,
    "jni": 62,
    "reachable": 14093
  },
  "fields": {
    "total": 36087,
    "reflection": 138,
    "jni": 67,
    "reachable": 21524
  }
}

vs

{
  "types": {
    "total": 16075,
    "reflection": 4451,
    "jni": 62,
    "reachable": 14072
  },
  "methods": {
    "foreign_downcalls": -1,
    "total": 125400,
    "reflection": 3550,
    "jni": 55,
    "reachable": 72092
  },
  "classes": {
    "total": 16075,
    "reflection": 4451,
    "jni": 62,
    "reachable": 14072
  },
  "fields": {
    "total": 36038,
    "reflection": 138,
    "jni": 67,
    "reachable": 21480
  }
}

with 4.5.10

which translates into an increase of 36 KB in the native executable size.
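
For a quick read of the two reports, the "total" counters can be subtracted key by key. The sketch below assumes the first report is the build with this PR and the second one is 4.5.10, which is how the comment reads; the class and helper names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NativeStatsDelta {
    /** Bundles the three "total" counters of one report. */
    static Map<String, Integer> totals(int types, int methods, int fields) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("types", types);
        m.put("methods", methods);
        m.put("fields", fields);
        return m;
    }

    /** Per-key difference: candidate minus baseline. */
    static Map<String, Integer> delta(Map<String, Integer> candidate,
                                      Map<String, Integer> baseline) {
        Map<String, Integer> out = new LinkedHashMap<>();
        candidate.forEach((key, value) -> out.put(key, value - baseline.get(key)));
        return out;
    }

    public static void main(String[] args) {
        // Numbers copied from the two reports above.
        Map<String, Integer> withPr = totals(16096, 125514, 36087);
        Map<String, Integer> v4510 = totals(16075, 125400, 36038);
        System.out.println(delta(withPr, v4510)); // {types=21, methods=114, fields=49}
    }
}
```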

However, I suspect that this IT doesn't enable SSL. Any hints on how I could/should test this with SSL enabled?

@cescoffier
Contributor

@zakkak you can use the vertx-http integration tests - it has TLS tests (gRPC has some TLS tests too).

My concern is that in the past, such a change broke the native compilation because the classes were not on the classpath or because loading the class brought many other concerns and initialization issues (in addition to making the overall executable bigger). This should not be the case here, but since this will land in the Quarkus LTS if we merge it, we need to be sure we will not shoot ourselves in the foot.

@zakkak

zakkak commented Oct 30, 2024

Thank you @cescoffier. I am appending the results from vertx-http.

TLDR: LGTM.

vertx-core 4.x (tip of vert.x without this PR)

  "types": {
    "total": 14411,
    "reflection": 4023,
    "jni": 62,
    "reachable": 12484
  },
  "methods": {
    "foreign_downcalls": -1,
    "total": 113204,
    "reflection": 3337,
    "jni": 55,
    "reachable": 64071
  },
  "fields": {
    "total": 33089,
    "reflection": 133,
    "jni": 67,
    "reachable": 19472
  }

this PR

  "types": {
    "total": 14419,
    "reflection": 4023,
    "jni": 62,
    "reachable": 12491
  },
  "methods": {
    "foreign_downcalls": -1,
    "total": 113234,
    "reflection": 3338,
    "jni": 55,
    "reachable": 64081
  },
  "fields": {
    "total": 33097,
    "reflection": 133,
    "jni": 67,
    "reachable": 19479
  }

this PR with pooled heap buffers enabled

  "types": {
    "total": 14410,
    "reflection": 4023,
    "jni": 62,
    "reachable": 12483
  },
  "methods": {
    "foreign_downcalls": -1,
    "total": 113201,
    "reflection": 3337,
    "jni": 55,
    "reachable": 64064
  },
  "fields": {
    "total": 33085,
    "reflection": 133,
    "jni": 67,
    "reachable": 19471
  }

To enable the pooled heap buffers I patched Quarkus with:

diff --git a/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/options/HttpServerOptionsUtils.java b/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/options/HttpServerOptionsUtils.java
index 36fe46a9d3a..872d79a85cb 100644
--- a/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/options/HttpServerOptionsUtils.java
+++ b/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/options/HttpServerOptionsUtils.java
@@ -82,6 +82,7 @@ public static HttpServerOptions createSslOptions(HttpBuildTimeConfig buildTimeCo
                 serverOptions.setAlpnVersions(Arrays.asList(HttpVersion.HTTP_2, HttpVersion.HTTP_1_1));
             }
         }
+        serverOptions.setSslEngineOptions(new JdkSSLEngineOptions().setPooledHeapBuffers(true));
         setIdleTimeout(httpConfiguration, serverOptions);
 
         TlsConfiguration bucket = getTlsConfiguration(httpConfiguration.tlsConfigurationName, registry);

@cescoffier
Contributor

Thanks @zakkak !

All good for me then!

@franz1981
Contributor Author

So @vietj, here we are: this can move forward, and I'll create the relevant issues on the Quarkus repo for the missing bits to make it usable there.

@vietj I strongly suggest that Vert.x users who rely on JDK SSL enable this feature as well, TBH, as performance with JDK SSL is terrible without it...

@vietj vietj merged commit ccca2b8 into eclipse-vertx:4.x Nov 5, 2024
7 checks passed
@vietj
Member

vietj commented Nov 5, 2024

thanks for the contrib @franz1981

5 participants