Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hot restart bug due to passed an fd without bind #8426

Merged
merged 5 commits into from
Oct 4, 2019

Conversation

zyfjeff
Copy link
Member

@zyfjeff zyfjeff commented Sep 29, 2019

When the 15001 listener appears at the same time, one of them will bind the socket and listening, and the other will not bind. In this case, the hot restart gets the socket of the previous process and may get the socket that is not listening.

Description: Fix hot restart bug due to passed an fd without bind
Risk Level: low
Testing: Existing tests and new tests
Docs Changes: N/A
Release Notes: N/A
[Optional Fixes #Issue] #8427

@zyfjeff zyfjeff changed the title WIP: Fix hot restart bug due to passed an fd without bind Fix hot restart bug due to passed an fd without bind Sep 29, 2019
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good a at a high level, thanks. Some small comments.

/wait

@@ -166,6 +166,11 @@ Network::SocketSharedPtr ProdListenerComponentFactory::createListenSocket(
? Network::Utility::TCP_SCHEME
: Network::Utility::UDP_SCHEME;
const std::string addr = absl::StrCat(scheme, address->asString());

if (!bind_to_port && socket_type == Network::Address::SocketType::Stream) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? Won't this just return -1 below once you check for not binding in the parent?

Copy link
Member Author

@zyfjeff zyfjeff Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the need to determine whether the bind, if there is no bind socket to the parent process to obtain the socket, may get the already bind socket, in addition, there is no way to handle the traffic without the bind socket, we can not go to the parent process to obtain the socket, directly create it.

@@ -95,6 +95,11 @@ HotRestartingParent::Internal::getListenSocketsForChild(const HotRestartMessage:
Network::Address::InstanceConstSharedPtr addr =
Network::Utility::resolveUrl(request.pass_listen_socket().address());
for (const auto& listener : server_->listenerManager().listeners()) {
if (listener.get().socket().socketType() == Network::Address::SocketType::Stream &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantically, do we need the stream type check? Isn't just checking for bind to port sufficient?

Copy link
Member Author

@zyfjeff zyfjeff Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, I understand it wrong, udp socket does not have bind to port, there is no way to handle traffic., so we just need to judge whether bind to port is enough.

Signed-off-by: tianqian.zyf <[email protected]>
Signed-off-by: tianqian.zyf <[email protected]>
@zyfjeff
Copy link
Member Author

zyfjeff commented Oct 1, 2019

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #8426 (comment) was created by @zyfjeff.

see: more, trace.

@mattklein123
Copy link
Member

@zyfjeff friendly request to please not force push. It makes reviewing more difficult. Thank you!

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this makes sense to me. One test request .Thank you!

/wait

} else {
return std::make_shared<Network::UdpListenSocket>(std::move(io_handle), address, options);

if (bind_to_port) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a listener manager test for this case also?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the idea of ​​my test case is to test when the bind_to_port is equal to false, not to get the socket from the hot restart, but to create the socket directly.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM with some small comments.

/wait

Network::Address::SocketType socket_type,
const Network::Socket::OptionsSharedPtr& options,
bool bind_to_port) -> Network::SocketSharedPtr {
// When bind_to_port is equal to false, create socket fd directly, and do not get socket
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an explicit expectation with Times(0) to the mock hot restarter to make sure we never call it?

@@ -37,7 +37,7 @@ class ListenerHandle {

class ListenerManagerImplTest : public testing::Test {
protected:
ListenerManagerImplTest() : api_(Api::createApiForTest()) {
ListenerManagerImplTest() : api_(Api::createApiForTest()), real_listener_factory_(server_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just define this in the test where you use it for now as we don't use it anywhere else?

Signed-off-by: tianqian.zyf <[email protected]>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks.

@mattklein123
Copy link
Member

/azp run envoy-linux

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mattklein123 mattklein123 merged commit fcb3cf1 into envoyproxy:master Oct 4, 2019
danzh2010 pushed a commit to danzh2010/envoy that referenced this pull request Oct 4, 2019
@zyfjeff zyfjeff deleted the fix-hotrestart-bug branch October 12, 2019 02:32
nandu-vinodan pushed a commit to nandu-vinodan/envoy that referenced this pull request Oct 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants