Connection prematurely closed cause by the firewalls and target servers idle timeout #498

Closed
georgmittendorfer opened this issue Nov 6, 2018 · 18 comments

Comments

@georgmittendorfer

georgmittendorfer commented Nov 6, 2018

I get prematurely closed connections when using reactor-netty in certain scenarios.

The error happens only when making simultaneous calls to several back-end services with Spring's WebClient. I make such calls once per minute, but the error does not occur every time.

The same happened with older versions, too (Reactor Bismuth and Spring Boot < 2.1.0 with the corresponding Netty versions). With the current version the errors are more frequent (and the error message is different): with the old version I had about one error every ten minutes; now I have one every three to five minutes.

There seems to be some concurrency issue because the error only shows up when doing these simultaneous calls. There are no errors when calling the back-end services otherwise.

Sorry for the meager description, but I don't have many more details yet. This might be related to #413.

Expected behavior

Exception shouldn't happen.

Actual behavior

2018-11-06 16:02:23.511 ERROR [-client-epoll-8] r.n.r.PooledConnectionProvider : [id: 0x97c1f1ad, L:/192.168.0.yyy:yyyyy ! R:192.168.0.xxx/192.168.0.xxx:xxxxx] Pooled connection observed an error

reactor.netty.http.client.HttpClientOperations$PrematureCloseException: Connection prematurely closed BEFORE response

Steps to reproduce

I don't have any test code to reproduce this.

I use Mono.zip to execute the calls concurrently:

public static <R> Mono<R> zip(final Iterable<? extends Mono<?>> monos, Function<? super Object[], ? extends R> combinator)

Reactor Netty version

  • reactor-netty 0.8.2
  • netty 4.1.29.Final
  • spring-cloud.version: Finchley.SR2
  • reactor.version: Californium-SR2 (3.2.2)
  • spring boot 2.1.0

Same issue, with different error message, happens with earlier versions, too.

JVM version (e.g. java -version)

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

OS version (e.g. uname -a)

Linux 4.15.18-2-pve #1 SMP PVE 4.15.18-20 (Thu, 16 Aug 2018 11:06:35 +0200) x86_64 GNU/Linux

@violetagg
Member

@georgmittendorfer What do you use from spring-cloud.version: Finchley.SR2? Spring Cloud Gateway?

@georgmittendorfer
Author

georgmittendorfer commented Nov 6, 2018

@violetagg Sorry, that dependency was a copy-and-paste leftover in my pom.xml. It is not related to this issue. In the meantime I tried to reproduce the issue in a test case but didn't succeed.

@smaldini
Contributor

smaldini commented Nov 6, 2018

Do you use SSL @georgmittendorfer ?

@georgmittendorfer
Author

georgmittendorfer commented Nov 7, 2018

@smaldini Some of the back-end services are accessed via HTTPS. As far as I know, only the calls to the non-SSL services have had PrematureCloseExceptions so far.

@violetagg
Member

@georgmittendorfer Is your scenario similar to the one in #413 (comment)? If so, do you provide Content-Length or Transfer-Encoding?

@georgmittendorfer
Author

@violetagg Setting Content-Length doesn't help.

Furthermore I have problems reproducing the issues on my dev system. Is it possible to disable pooling of connections (meaning: disabling concurrent reuse of the connections) so that I can verify that it is a concurrency issue at all?

I think it is only fair to close this issue until I can somehow prove that it is related to reactor netty at all. I am going to reopen as soon as I have more information, OK?

@violetagg
Member

@georgmittendorfer To disable the connection pool, use:

0.7.x

HttpClient.create(opt -> opt.disablePool());

0.8.x

HttpClient.create(ConnectionProvider.newConnection());

I'm closing the issue; you can reopen it if you have more information about the problem.

Regards,
Violeta

@ramirezag

I am encountering the same issue. I created a crawler that collects information from our internal API service and then saves it into an application that is running locally. A crawl takes at least an hour. At some point while crawling, I get Connection prematurely closed BEFORE response.

Converting from pooled to non-pooled connector seems to have fixed the issue.

@Bean
public ClientHttpConnector clientHttpConnector() {
    // Non-pooled client: a new connection is opened for every request.
    // The pooled variant that showed the error was TcpClient.create().
    TcpClient tcpClient = TcpClient.newConnection()
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectTimeoutMills);
    return new ReactorClientHttpConnector(HttpClient.from(tcpClient));
}

@violetagg
Member

@ramirezag Can you try the latest 0.8.4.BUILD-SNAPSHOT? We have fixes for use cases that might cause this exception to appear.

@ramirezag

@violetagg, I tried running the crawler using 0.8.4.BUILD-SNAPSHOT and using the pooled connector for a couple of hours. So far, I have not encountered the issue.

@georgmittendorfer
Author

georgmittendorfer commented Jan 4, 2019

@violetagg Sorry, it took me some time to try disabling the pool.

The thing is that when pooling is disabled, the errors go away. This means that they are related to the connection pooling, right? I tried with 0.8.3.RELEASE and 0.8.4.BUILD-SNAPSHOT, with the same result.

What are the implications of turning connection pooling off as a workaround?

To communicate all details (I hope it helps somehow): I got two different error messages. One was "Connection has been closed BEFORE response, while sending request body" and the other was the well-known, more frequent "Connection prematurely closed BEFORE response". I am not sure whether the first message has to do with the same issue, as it affected a different back-end service instance than the ones where the latter error message typically occurs.

@violetagg
Member

@georgmittendorfer Open a new issue with a reproducible scenario, or at least explain your scenario. If you can, provide some logs or a tcpdump.

@georgmittendorfer
Author

georgmittendorfer commented Jan 26, 2019

@violetagg I tracked down the problem. The reactor-netty code is fine and doesn't have a bug. I am sorry for having wasted your time on this issue.

Others with similar errors might look at their network:

The issue in this case was that the idle timeout of the firewalls and target servers was exactly equal to the check interval (the interval at which I made the requests). From time to time this led to a race: the target server or firewall closed the connection at the same moment the request was made.

After changing the firewall timeouts and the check interval, the errors are gone.
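
To make the race more concrete, here is a minimal simulation in plain Java. It is only a sketch and does not use reactor-netty: the peer is modeled as closing a connection once it has been idle for the timeout, and the class name, method names, the 60-second values, and the ±50 ms scheduling jitter are all assumptions chosen for illustration.

```java
import java.util.Random;

public class IdleTimeoutRace {

    /**
     * Counts requests that would reuse a connection the peer has already closed.
     * The peer is modeled as closing a connection once it has been idle for
     * idleTimeoutMs; request k is sent at k * intervalMs plus a small jitter.
     */
    static int countStaleReuses(long intervalMs, long idleTimeoutMs, int requests, long seed) {
        Random random = new Random(seed);
        int stale = 0;
        long lastUsed = jitter(random); // the first request opens a fresh connection
        for (int k = 1; k < requests; k++) {
            long now = k * intervalMs + jitter(random);
            if (now - lastUsed >= idleTimeoutMs) {
                stale++; // the peer closed the connection before this request arrived
            }
            lastUsed = now; // the reused or the freshly opened connection was just used
        }
        return stale;
    }

    /** Hypothetical scheduling jitter of -50..50 ms around each tick. */
    private static long jitter(Random random) {
        return random.nextInt(101) - 50;
    }

    public static void main(String[] args) {
        // Interval equal to the idle timeout: some requests hit a closed connection.
        System.out.println("interval == timeout: " + countStaleReuses(60_000, 60_000, 1_000, 42L));
        // Interval clearly below the idle timeout: the connection is always still open.
        System.out.println("interval <  timeout: " + countStaleReuses(45_000, 60_000, 1_000, 42L));
    }
}
```

With the interval equal to the timeout, whether a request arrives just before or just after the peer closes the connection depends only on the jitter, so the first count is greater than zero. With a 45-second interval the idle gap never reaches 60 seconds and the second count is zero.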

@sganslandt

@georgmittendorfer I'm having similar issues, and they might very well have a similar cause. Would you mind sharing some insights about your findings? What is this check interval, where is it configured, what's the default, etc.? Looking around in HttpClient and the underlying TcpClient (netty), I cannot seem to find anything about it 😕

@georgmittendorfer
Author

@sganslandt My findings are already described above. I made requests at 60-second intervals, but the idle timeout of the target app server and the firewall was 60 seconds, too. Sometimes that led to problems when either the app server or the firewall closed the connection while I was trying to use it. Changing the interval or the timeout should help in this case.
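
As a rough illustration of "changing the interval or the timeout", the check below tests whether a request interval leaves some headroom below the peer's idle timeout. It is a sketch in plain Java, not part of reactor-netty; the class name, the method name, and the 5-second safety margin are assumptions, and only the 60-second timeout comes from this thread.

```java
import java.time.Duration;

public class IntervalCheck {

    /**
     * Returns true when a connection left idle for requestInterval is expected
     * to still be open, i.e. the interval plus a safety margin stays strictly
     * below the peer's idle timeout.
     */
    static boolean isSafe(Duration requestInterval, Duration peerIdleTimeout, Duration safetyMargin) {
        return requestInterval.plus(safetyMargin).compareTo(peerIdleTimeout) < 0;
    }

    public static void main(String[] args) {
        Duration idleTimeout = Duration.ofSeconds(60); // value reported in this thread
        Duration margin = Duration.ofSeconds(5);       // assumed margin, not from the thread

        System.out.println(isSafe(Duration.ofSeconds(60), idleTimeout, margin)); // false
        System.out.println(isSafe(Duration.ofSeconds(45), idleTimeout, margin)); // true
    }
}
```

This covers only the direction where the connection is reused before the peer would close it. The values that matter are the real timeouts configured on the firewall and the target server, which have to be looked up in the network setup.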

@sganslandt

sganslandt commented Jan 30, 2019

I misunderstood and thought the check interval was referring to some built-in connection check 😬 Sorry and thanks : )

@ffroliva

I am facing this error. Please have a look at my code:

https://github.com/ffroliva/mimecast-backend
https://github.com/ffroliva/mimecast-frontend

@violetagg
Member

@ffroliva Please create a new issue, this one is closed already

violetagg changed the title from "Connection prematurely closed" to "Connection prematurely closed cause by the firewalls and target servers idle timeout" on Jul 31, 2019