
Connection Pooling doesn't work #150

Closed
michaelpog opened this issue Mar 2, 2017 · 19 comments


michaelpog commented Mar 2, 2017

I'm trying to implement a connection pool for the http client.
I'm attempting to store for every remote address (host:port) a queue of HTTPUpstreamSessions that I can reuse.
I'm having issues reusing those sessions.
Looking through all handlers and callbacks of HTTPTransaction and HTTPUpstreamSession, I can't find a callback in which I can put the HTTPUpstreamSession back into the pool and have it reused, for a new transaction.
Furthermore, if one of the transactions in the session receives a timeout, the session is destroyed.

The performance of creating a new session and transaction for every outbound request is pretty bad, which defeats the purpose of switching to Proxygen.
Can anyone please point me in the right direction? Or does this need some code change?

PS: Using regular HTTP 1.1, no ssl, simple request/response.

Thanks

@michaelpog michaelpog changed the title Connection Pool Connection Pooling doesn't work Mar 2, 2017

afrind commented Mar 2, 2017

The callback you want is HTTPSession::InfoCallback::onTransactionDetached. In this callback you can check to see if the connection is below the maximum transaction limit and move it into the queue.

  auto curTxnCount = session->getNumOutgoingStreams();
  if (curTxnCount == 0) {
    // this session is completely idle
  } else if (curTxnCount < session->getMaxConcurrentOutgoingStreams()) {
    // this session can accommodate new transactions
  } else {
    // this session cannot accommodate any new transactions
  }

I don't think it's true that a single transaction timeout would cause the whole session to be torn down. Can you give me an example of that?
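As a self-contained illustration of that three-way decision, here is a minimal sketch. PoolableSession, PoolState, and classify are hypothetical stand-ins (not proxygen types); the two counters mirror what getNumOutgoingStreams() and getMaxConcurrentOutgoingStreams() return.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in holding the two counters the callback above reads.
struct PoolableSession {
  uint32_t numOutgoingStreams;            // getNumOutgoingStreams()
  uint32_t maxConcurrentOutgoingStreams;  // getMaxConcurrentOutgoingStreams()
};

enum class PoolState { Idle, HasCapacity, Full };

// Classify a session exactly the way the snippet above checks it.
PoolState classify(const PoolableSession& s) {
  if (s.numOutgoingStreams == 0) {
    return PoolState::Idle;         // completely idle, safe to pool
  } else if (s.numOutgoingStreams < s.maxConcurrentOutgoingStreams) {
    return PoolState::HasCapacity;  // can take more transactions
  }
  return PoolState::Full;           // cannot take new transactions
}
```

For HTTP/1.1 the max is effectively 1, so a session is either Idle or Full; HasCapacity only arises with multiplexed protocols.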


michaelpog commented Mar 2, 2017

Thanks for the reply.
I've tried putting my code in HTTPSession::InfoCallback::onTransactionDetached, and I see that it never gets called.
The only place the onTransactionDetached callback can be invoked from is HTTPSession::detach, but apparently that doesn't happen.
The only callback that is called (and that seems like a good place) is HTTPSession::InfoCallback::onDeactivateConnection, which is also invoked from HTTPSession::detach; it seems that happens because there can be only one transaction at a time:
https://github.com/facebook/proxygen/blob/master/proxygen/lib/http/session/HTTPSession.cpp#L1530

I also see that session_->getMaxConcurrentOutgoingStreams() always returns 1, which I guess makes sense for HTTP/1.1?

Regarding the timeout. Here is a stack trace, generated from when the Session is being destroyed on a transactionTimeout:
#0 SessionWrapper::onDestroy (this=0x7fffec5eaab0) at /home/mishapog/development/my_project/httpclient/SessionWrapper.h:63
#1 0x00007ffff74adca7 in proxygen::HTTPSession::~HTTPSession (this=0x7fffec6979d0, __in_chrg=) at session/HTTPSession.cpp:223
#2 0x00000000004d7f79 in proxygen::HTTPUpstreamSession::~HTTPUpstreamSession (this=0x7fffec6979d0, __in_chrg=) at session/HTTPUpstreamSession.cpp:19
#3 0x00007ffff74ae918 in proxygen::HTTPSession::detach (this=0x7fffec6979d0, txn=0x7fffec693fe8) at session/HTTPSession.cpp:1504
#4 0x00007ffff74b37aa in proxygen::HTTPTransaction::onDelayedDestroy (this=0x7fffec693fe8, delayed=) at session/HTTPTransaction.cpp:109
#5 0x00007ffff74ba646 in proxygen::HTTPTransaction::onIngressTimeout (this=this@entry=0x7fffec693fe8) at session/HTTPTransaction.cpp:592
#6 0x00007ffff749f215 in proxygen::HTTPSession::transactionTimeout (this=0x7fffec6979d0, txn=0x7fffec693fe8) at session/HTTPSession.cpp:1262
#7 0x00007ffff7a0c085 in folly::HHWheelTimer::timeoutExpired (this=0x7fffec00a380) at io/async/HHWheelTimer.cpp:237
#8 0x00007ffff79c2328 in folly::AsyncTimeout::libeventCallback (fd=, events=, arg=0x7fffec00a380) at io/async/AsyncTimeout.cpp:160
#9 0x00007ffff428ea0c in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
#10 0x00007ffff79f2320 in folly::EventBase::loopBody (this=this@entry=0x7fffec000c00, flags=flags@entry=0) at io/async/EventBase.cpp:301
#11 0x00007ffff79f2d07 in folly::EventBase::loop (this=this@entry=0x7fffec000c00) at io/async/EventBase.cpp:240
#12 0x00007ffff79f3f06 in folly::EventBase::loopForever (this=0x7fffec000c00) at io/async/EventBase.cpp:431
#13 0x00007ffff753e4a7 in wangle::IOThreadPoolExecutor::threadRun(std::shared_ptr<wangle::ThreadPoolExecutor::Thread>) () from /usr/local/lib/libproxygenhttpserver.so.0
#14 0x00007ffff755570a in void std::_Mem_fn_base<void (wangle::ThreadPoolExecutor::*)(std::shared_ptr<wangle::ThreadPoolExecutor::Thread>), true>::operator()<std::shared_ptr<wangle::ThreadPoolExecutor::Thread>&, void>(wangle::ThreadPoolExecutor*, std::shared_ptr<wangle::ThreadPoolExecutor::Thread>&) const () from /usr/local/lib/libproxygenhttpserver.so.0
#15 0x00007ffff7553610 in void std::_Bind<std::_Mem_fn<void (wangle::ThreadPoolExecutor::*)(std::shared_ptr<wangle::ThreadPoolExecutor::Thread>)> (wangle::ThreadPoolExecutor*, std::shared_ptr<wangle::ThreadPoolExecutor::Thread>)>::__call<void, 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) () from /usr/local/lib/libproxygenhttpserver.so.0
#16 0x00007ffff7551ad1 in void std::_Bind<std::_Mem_fn<void (wangle::ThreadPoolExecutor::*)(std::shared_ptr<wangle::ThreadPoolExecutor::Thread>)> (wangle::ThreadPoolExecutor*, std::shared_ptr<wangle::ThreadPoolExecutor::Thread>)>::operator()<void>() () from /usr/local/lib/libproxygenhttpserver.so.0
#17 0x00007ffff754f410 in void folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<std::_Mem_fn<void (wangle::ThreadPoolExecutor::*)(std::shared_ptr<wangle::ThreadPoolExecutor::Thread>)> (wangle::ThreadPoolExecutor*, std::shared_ptr<wangle::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) () from /usr/local/lib/libproxygenhttpserver.so.0
#18 0x0000000000446d19 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x757940) at /usr/local/include/folly/Function.h:305
#19 0x0000000000453e1e in std::_Bind_simple<folly::Function<void ()> ()>::_M_invoke<>(std::_Index_tuple<>) (this=0x757940) at /usr/include/c++/5/functional:1531
#20 0x0000000000453d40 in std::_Bind_simple<folly::Function<void ()> ()>::operator()() (this=0x757940) at /usr/include/c++/5/functional:1520
#21 0x0000000000453c90 in std::thread::_Impl<std::_Bind_simple<folly::Function<void ()> ()> >::_M_run() (this=0x757920) at /usr/include/c++/5/thread:115
#22 0x00007ffff64e5c80 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#23 0x00007ffff6bcc6fa in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#24 0x00007ffff5c4bb5d in clone () from /lib/x86_64-linux-gnu/libc.so.6

(In the first frame, the SessionWrapper class implements proxygen::HTTPSession::InfoCallback.)

How can I prevent that from happening?
Is there a way to have more than one transaction at a time for a given session, that I'm not aware of?

One more question, regarding threads: currently I have a separate pool for every HTTP client thread, which is wasteful since I have more than one thread.
I basically have an IOThreadPoolExecutor of 7 threads that all send HTTP requests.
Can I reuse an HTTPUpstreamSession created in one thread, in another thread?

Thanks again for the quick reply!


afrind commented Mar 3, 2017

Oops, I guess you actually need to implement both callbacks. It's kind of ugly that HTTPSession gives you different callbacks. I think we should probably deliver onTransactionDetached whenever a transaction detaches, and onDeactivateConnection when the session goes completely idle.

That being said, you probably shouldn't be timing out? Either your request or response is not completing.

There are APIs in HTTPUpstreamSession which you can use to detach it from one thread and attach it in another, but I wouldn't recommend using those. It was kind of fragile to implement, and some of the logic required may not be open source yet.


michaelpog commented Mar 3, 2017

Thank you again for your reply.
Regarding the timeout, I believe there is a flaw here.
First of all, timeouts are normal in my case. I'm testing Proxygen as a potential replacement for a service that talks to many services in real time and has to stick to very low latencies, so it imposes a short timeout on all downstream requests. Timeouts from some of those services are normal and part of the use case; I have special logic for that. In fact, I'm simulating it with a random sleep on the downstream services.

So the first problem I see is that a timeout on one transaction killing the entire connection seems unreasonable to me. A timeout on one transaction doesn't indicate that the next transaction will timeout as well. So I would like to keep the connection alive.

The second problem is much more severe: testing with a single timing-out request, I confirmed that I first get the proxygen::HTTPSession::InfoCallback::onDeactivateConnection callback, and it passes this test:

auto numberOfCurrentTransactions = session->getNumOutgoingStreams();
if (numberOfCurrentTransactions < session->getMaxConcurrentOutgoingStreams()) {
  // adding the session back to the pool to be reused by the next requests
}
Then later the proxygen::HTTPSession::InfoCallback::onDestroy callback is called, after I've already put that session back into the pool, and at that point the session is destroyed.
So a future request comes in and takes a session from the pool that has already been destroyed.
It has no way to know about that, and gets a segfault when it tries to create a transaction from the destroyed session!

I tried to work around it by wrapping every HTTPUpstreamSession in an object that implements proxygen::HTTPSession::InfoCallback (like the SessionWrapper in proxygen's ProxyServer example), and marking that object as destroyed when I receive the onDestroy callback. The pool will then hold a bunch of dead objects, and when I try to get a session from the pool I have to pop them off until I find a healthy one.
But I don't think that solves the problem entirely.
There is still a possible race condition that I cannot fully guarantee will not happen, since I'm not fully familiar with the code.

Consider this:
1) proxygen::HTTPSession::InfoCallback::onDeactivateConnection is called, so I put the connection back into the pool.
2) I need to send a new request, so I take that session from the pool and create a transaction from it.
3) I get proxygen::HTTPSession::InfoCallback::onDestroy on that session (from the previous transaction, not the current one), and the session is destroyed.
Can I guarantee this will not happen? Am I being paranoid?
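For reference, the dead-wrapper workaround described above can be sketched with stand-in types. SessionWrapper and WrapperPool here are hypothetical minimal stand-ins, not proxygen classes; in the real version the destroyed flag would be set from InfoCallback::onDestroy, and this sketch deliberately does not address the race described in steps 1-3.

```cpp
#include <cassert>
#include <deque>
#include <memory>

// Hypothetical stand-in for a wrapper around HTTPUpstreamSession.
struct SessionWrapper {
  bool destroyed = false;  // would be set from InfoCallback::onDestroy
};

class WrapperPool {
 public:
  void put(std::shared_ptr<SessionWrapper> w) {
    pool_.push_back(std::move(w));
  }

  // Pop wrappers until a live one is found; dead ones are discarded.
  std::shared_ptr<SessionWrapper> get() {
    while (!pool_.empty()) {
      auto w = pool_.front();
      pool_.pop_front();
      if (!w->destroyed) {
        return w;      // healthy session, reuse it
      }
      // destroyed wrapper: drop it and keep looking
    }
    return nullptr;    // no live session; caller must connect fresh
  }

 private:
  std::deque<std::shared_ptr<SessionWrapper>> pool_;
};
```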

I also wanted to ask whether you have implemented your own connection pool that you can share with me, whether as a patch or an example?
I was pretty surprised that I could not find connection pooling code in Proxygen or Wangle. It seems like such a crucial part of an HTTP client framework intended for high performance, and it's definitely not trivial to implement myself, as I'm seeing now with the issues I'm having.

Thanks again for your help and quick replies!


afrind commented Mar 3, 2017

Regarding timeout on one transaction killing the connection: For HTTP/1.1, each request/response must complete in order for the connection to be reusable for the next request/response. So if you timeout waiting for the response, you have no choice but to abort the connection. For HTTP/2, the timed out transaction is individually aborted via RST_STREAM, and the connection continues to be reusable. Let me know if you see this behavior with H2.
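That protocol-level rule can be expressed as a tiny self-contained sketch (Protocol and connectionReusableAfterTimeout are hypothetical names for illustration, not proxygen APIs):

```cpp
#include <cassert>

enum class Protocol { HTTP1_1, HTTP2 };

// Hypothetical helper expressing the rule above: after one transaction
// times out, can the underlying connection still be pooled for reuse?
bool connectionReusableAfterTimeout(Protocol proto) {
  switch (proto) {
    case Protocol::HTTP1_1:
      // The response never completed, so the connection's framing is in
      // an unknown state; the connection must be aborted.
      return false;
    case Protocol::HTTP2:
      // Only the timed-out stream is aborted (RST_STREAM); other
      // streams and the connection itself remain usable.
      return true;
  }
  return false;
}
```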

Regarding onDeactivateConnection/onDestroy, I think they should be called basically in the same event loop, so it shouldn't be possible for another intervening request to get the session out of the pool. We also check HTTPSession::isPoolable() before placing it in the pool.

We probably should release our connection pool as open source. It had been recently rewritten at the time we released Proxygen to the community, so we didn't think it was mature enough. Now that we have several years of operational experience with it, it's probably safe to put out.

I'll get the ball rolling internally but I imagine our security folks will want to take another look at it before it goes out, so it won't be immediate.

@michaelpog

Oh, you're right!
I didn't think about the fact that if I don't kill the connection after the timeout, I may still get a response later, which I don't want.
Thank you for clarifying that. I will try HTTP/2 later as well; currently I'm focusing on HTTP/1.1.

Regarding the race condition, I am aware that it's all in one event loop, but I was worried that the events might arrive in that order in the loop, since I'm not sure onDeactivateConnection and onDestroy are guaranteed to be called in one event loop task. But maybe I am being paranoid here. I will test.

About the Connection Pool:
Yes please (about open sourcing the connection pool), the sooner the better. I really like the framework and would like to be able to demonstrate (and reassure myself of) its performance superiority and maturity.
Is there any timeline (however rough) you could give me on when it might be open-sourced? Is there a mailing list I can join to be notified of its progress?
I see that there was some discussion about it already in https://groups.google.com/forum/#!searchin/facebook-proxygen/connection$20pool%7Csort:relevance/facebook-proxygen/m2OAWwGxZTc/Elr8ujCIthwJ
back in 2015.

Also I'm guessing HTTPSession::isPoolable() is part of that code that hasn't been released, since I don't see that method in the repo.

Thanks again for your help!


afrind commented Mar 3, 2017

Oops, I mistyped: I meant isReusable(). And we also check isClosing().

I've started asking the questions, but I don't have a timeline. I'll update this issue when I have more info.


michaelpog commented Mar 3, 2017

Thank you!
I'm looking forward to an update.
Meanwhile, I'm having very strange issues with session->getNumOutgoingStreams().
In HTTPSession::InfoCallback::onDeactivateConnection, when I check whether session->getNumOutgoingStreams() == 0, I get different results between the Release and Debug configurations.

I'm using cmake to compile my application.
When I compile in Debug mode, I get session->getNumOutgoingStreams() == 0, which is what I expect.

However, with the exact same code and the same request, when I compile in Release I always get session->getNumOutgoingStreams() == 1.

This is my code:
// my callback implementation
void onDeactivateConnection(const HTTPSession& session) override {
  LOG(INFO) << " onDeactivateConnection";
  auto httpUpstreamSession = dynamic_cast<const HTTPUpstreamSession*>(&session);
  if (httpUpstreamSession != nullptr) {
    this->returnConnectionToPool(httpUpstreamSession);
  }
}

void ConnectionPool::returnConnectionToPool(const HTTPUpstreamSession* session) {
  auto numberOfCurrentTransactions = session->getNumTransactions();
  LOG(INFO) << "session isReusable " << session->isReusable()
            << " session current txs " << numberOfCurrentTransactions;
  if (numberOfCurrentTransactions < session->getMaxConcurrentOutgoingStreams() &&
      session->isReusable()) {
    LOG(INFO) << "Adding session back to pool"; // In Release I never get here because numberOfCurrentTransactions is 1.
    _addressToConnectionPool[session->getPeerAddress()].push_back(
        const_cast<HTTPUpstreamSession*>(session));
  }
}

I'm really confused about why this would happen. I understand there may be some kind of bad optimization the compiler is doing in Release, but I don't know how to go about figuring it out. I've worked around it by using just isReusable() for now, but I think there's something here that should be investigated.

Thanks again, I really appreciate your help on this!


afrind commented Mar 3, 2017

I'll try to repro as well


afrind commented Mar 6, 2017

Sorry, I wasn't able to repro this behavior using our build tools. I took your patch and merged it into the proxy sample SessionWrapper (just the printing part). I always get "Adding session back to pool" both with our internal build and with an ubuntu 14.04 build.

Can anyone else repro? What happens if you attach the debugger to the opt build?


michaelpog commented Mar 6, 2017

Attaching the debugger to an optimized executable doesn't work. Compiling in Release, the debugger doesn't even hit the break point, and when compiling in RelWithDebugInfo, I hit the break point but can't watch any variables. So all I can do is print the result, and it's always 1. Except in Debug.

I'm on Ubuntu 16.04, I don't think it matters though.

I'll try to create a small repo reproducing just that issue when I get the chance. It's not blocking me at the moment; I'm facing other big performance issues with Proxygen right now (very high CPU usage at fairly low requests/sec). I'm going to investigate a bit further and create a separate issue.


michaelpog commented Mar 9, 2017

I've created a repo to show how I reproduce the issue I've described, https://github.com/michaelpog/proxygen_proxyserver, along with the output I'm seeing both in Debug and Release. I hope it helps.


afrind commented Apr 25, 2017

I don't suppose you were able to figure this one out? I really can't see what would cause a simple integer to report incorrectly in release builds. Any chance you can try a different compiler?

@VarsMolta

Hi. Is there an update on open sourcing the connection pool feature?


afrind commented Feb 14, 2018

Sadly, not right now. We'll let folks know when we have concrete plans.

@stevezhou6

I have the same question. I see you have open-sourced SessionPool and SessionHolder,
but I don't understand how to use them.
Is there one SessionPool per url & port?


afrind commented Feb 15, 2020

Connection pooling has been open sourced. See proxygen/lib/http/connpool. I'm not sure of the state of the docs and examples, but we will build those over time. Please migrate to that code and let us know if you have any issues.

@afrind afrind closed this as completed Feb 15, 2020
@stevezhou6

SessionPool can only be used from one thread.
For an HTTP client, I want the SessionPool shared between threads:

HTTPTransaction* SessionPoolThread::getTransaction(
    HTTPTransaction::Handler* upstreamHandler) {
  std::lock_guard<std::mutex> lock(lock_);

But holder->drain() will cause multiple threads to read and write the same HTTPSessionBase, causing a crash.


afrind commented Feb 19, 2020

See a followup in issue #314
