Handling shutdown reason in direct_consumer, and clean up consumer afterwards. #184
Hi @benonymus. Could you share a sample test case which reproduces the error? Thanks!
@ono Hey, it is hard to reproduce because it doesn't actually cause problems. I think I saw someone else mention this as well; I will try to find it.
Cool. Yeah, it would be great if you could create a sample project or unit test that spams error logs. By the way, the channel is closed explicitly with
Hi @ono The way the channels are handled inside
So the gen_consumer, executing the

    def handle_info({:DOWN, _mref, :process, consumer, :normal}, consumer) do
      {:ok, consumer}
    end

    def handle_info({:DOWN, _mref, :process, consumer, info}, consumer) do
      {:error, {:consumer_died, info}, consumer}
    end

because the
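The two clauses quoted above treat only a `:normal` exit as benign. The following is a minimal sketch, not the library's code, of why a consumer that exits with `:shutdown` ends up in the error clause: the reason carried by the `:DOWN` message is whatever the monitored process exited with. The module and function names (`DownReasonDemo`, `down_reason/1`, `classify/1`) are hypothetical and exist only for illustration.

```elixir
defmodule DownReasonDemo do
  # Hypothetical helper: spawn a monitored process that exits with
  # `exit_reason` and return the reason reported in the :DOWN message.
  def down_reason(exit_reason) do
    {pid, ref} = spawn_monitor(fn -> exit(exit_reason) end)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> reason
    after
      1_000 -> :timeout
    end
  end

  # Mirrors the shape of the two handle_info clauses quoted in the thread:
  # only :normal is accepted, every other reason is reported as an error.
  def classify(:normal), do: :ok
  def classify(reason), do: {:error, {:consumer_died, reason}}
end
```

Calling `DownReasonDemo.down_reason(:shutdown) |> DownReasonDemo.classify()` goes through the second clause, which matches the `:consumer_died` log entries discussed here.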
Hey @haljin. Thanks for the detailed explanation! My thoughts:
There is also a related discussion in #186.
@ono But in the race condition I described, the user does explicitly close the channel; that's what causes the race condition. In the unexpected situation, the
Are you explaining the same thing @benonymus is experiencing and trying to fix here, or expanding the topic? I'd like to confirm whether we want to fix the countless error logs or a race condition here. I think the user's consumer process shouldn't be linked to any processes when using DirectConsumer. Would that avoid the race condition?
@ono Please re-read my original description. The countless error logs are caused by a race condition between the amqp_channel closing (because The consumer process is only linked to the DirectConsumer itself as it should be.
Don't worry - I understand that. I wanted to check whether @benonymus is experiencing this even though they close the channel explicitly in their test cases. Let's use a simple test case as an example:

    test "simple test" do
      {:ok, conn} = AMQP.Connection.open()
      {:ok, channel} = AMQP.Channel.open(conn, {AMQP.DirectConsumer, self()})

      on_exit(fn ->
        AMQP.Channel.close(channel)
      end)
    end

Unfortunately You can avoid the race condition with the following:

    # Poll until the channel process has exited (at most 100 tries, 10 ms apart).
    def ensure_close(%{pid: pid} = chan, retry \\ 0) do
      if Process.alive?(pid) && retry < 100 do
        :timer.sleep(10)
        ensure_close(chan, retry + 1)
      else
        :ok
      end
    end

    test "avoid a race condition" do
      {:ok, conn} = AMQP.Connection.open()
      {:ok, channel} = AMQP.Channel.open(conn, {AMQP.DirectConsumer, self()})

      on_exit(fn ->
        AMQP.Channel.close(channel)
        ensure_close(channel)
      end)
    end

I understand it's tedious, but I hesitate to change the current behaviour as it could have side effects for other users. However, we can enhance the module. What do you think about adding an option?

    AMQP.Channel.open(conn, {AMQP.DirectConsumer, {self(), [monitor: false]}})

If the
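As a possible alternative to polling with `Process.alive?/1` in the `ensure_close` helper from this comment, the wait could be sketched with a monitor instead. This is only an illustration under stated assumptions: `ChannelHelper` and `await_down/2` are hypothetical names, not part of the amqp library, and the function takes a bare pid (for a channel struct that would be `channel.pid`).

```elixir
defmodule ChannelHelper do
  # Hypothetical helper: block until `pid` has terminated or `timeout`
  # milliseconds have passed. If the process is already dead, the monitor
  # delivers a :DOWN message (reason :noproc) immediately.
  def await_down(pid, timeout \\ 1_000) when is_pid(pid) do
    ref = Process.monitor(pid)

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
    after
      timeout ->
        # Stop monitoring and drop any late :DOWN message from the mailbox.
        Process.demonitor(ref, [:flush])
        {:error, :timeout}
    end
  end
end
```

In a test this would be called after `AMQP.Channel.close(channel)`, for example `ChannelHelper.await_down(channel.pid)`. A monitor avoids the fixed 10 ms sleep steps but otherwise serves the same purpose as the polling version.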
Maybe we should've clarified that @benonymus is on my team and we have been working on this together. :) I think a flag could be a good solution, although I think it should be called something like
Got it. Yeah, getting back to the initial discussion point - handling However I would keep the behaviour as similar as possible to the amqp_client version, so let's use the option for now. I like
…and fixed deprecation warning in a test case.
Thanks for the changes, @benonymus and @haljin. After thinking about it a bit more, I am considering changing the option as below.

    opts = [
      on_consumer_down: [
        cascade: true,
        ignore: [:normal, :shutdown]
      ]
    ]

With cascade: false, it simply ignores the consumer going down, whatever the reason.
Don't worry about making further changes, though. I will work on them in a separate PR after merging this one.
Don't worry about the code review comments either - I can address them in my PR for the option change at the same time.
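To make the proposed `on_consumer_down` option concrete, here is a rough sketch of how such options might be interpreted. It is not the implementation from this PR or its follow-up: the module `ConsumerDownPolicy`, the function `decide/2`, the return values, and the defaults (`cascade: true`, empty `ignore` list) are all assumptions made for illustration.

```elixir
defmodule ConsumerDownPolicy do
  # Hypothetical decision function for the proposed option shape:
  #   [cascade: true, ignore: [:normal, :shutdown]]
  # Returns :ignore when the consumer going down should be swallowed,
  # or an error tuple when it should be propagated.
  def decide(reason, opts) do
    cascade = Keyword.get(opts, :cascade, true)
    ignore = Keyword.get(opts, :ignore, [])

    cond do
      # cascade: false ignores the consumer going down for any reason.
      cascade == false -> :ignore
      # Reasons listed under :ignore are treated as benign.
      reason in ignore -> :ignore
      true -> {:error, {:consumer_died, reason}}
    end
  end
end
```

For example, `ConsumerDownPolicy.decide(:shutdown, cascade: true, ignore: [:normal, :shutdown])` would be ignored, while a reason such as `:killed` would still be reported.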
@@ -61,7 +61,7 @@ defmodule ConnectionTest do

   test "open connection with uri, name, and options (deprected but still spported)" do
     assert {:ok, conn} =
-             Connection.open("amqp://nonexistent:5672", "my-connection", host: 'localhost')
+             Connection.open("amqp://nonexistent:5672", name: "my-connection", host: 'localhost')
This is intended. Testing a deprecated option for backward compatibility.
     # there's no support for direct connection
     # this callback implementation should be added with library support
     {:error, :undefined}
   end

   @impl true
-  def handle_info({:DOWN, _mref, :process, consumer, :normal}, consumer) do
-    {:ok, consumer}
+  def handle_info({:DOWN, _mref, :process, state, reason}, %{ignore_shutdown: true} = _state)
I don't want to drop the pattern match on the consumer pid here or in other places.
Hey @ono thanks, sounds good to me!
Hey there,
We encountered countless :consumer_died error logs in our test cases. This PR aims to rectify that, and additionally to spare other users a possible headache with disorderly shutdowns.
The changes: we handle the :shutdown and :normal exit reasons to avoid messy error logs, and return nil as the consumer. We also need to handle that nil consumer so that no message is forwarded anymore if the channel was not shut down; if you try to send messages in that state, we return an error.
I hope this makes sense.
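The nil-consumer handling described above could look roughly like the sketch below. It is an illustration only, not the code from this PR: `ForwardSketch`, `forward/2`, and the `{:error, :no_consumer}` return value are hypothetical names chosen for the example.

```elixir
defmodule ForwardSketch do
  # Hypothetical: once the consumer has gone down and been replaced by nil,
  # refuse to forward instead of sending to a dead process.
  def forward(nil, _message), do: {:error, :no_consumer}

  # Otherwise deliver the message to the consumer process.
  def forward(consumer, message) when is_pid(consumer) do
    send(consumer, message)
    :ok
  end
end
```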