Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go to "permanent error mode" if ChannelExecutor throws exception #3293

Closed
dapengzhang0 opened this issue Jul 31, 2017 · 7 comments · Fixed by #4023
Closed

Go to "permanent error mode" if ChannelExecutor throws exception #3293

dapengzhang0 opened this issue Jul 31, 2017 · 7 comments · Fixed by #4023
Assignees
Milestone

Comments

@dapengzhang0
Copy link
Member

As mentioned in discussion in #3288, it's questionable only catching instead of throwing exceptions in ChannelExecutor. Should we throw regardless or should we only log in production and throw in test evn?

@dapengzhang0
Copy link
Member Author

@zhangkun83 , @ejona86

@ejona86
Copy link
Member

ejona86 commented Jul 31, 2017

I do think it tends to make sense to catch and log the exception, since propagating the exception will probably only cause additional failures and we know the current stack does not depend on the result of the call (since the execution is not guaranteed to be complete when drain returns.)

I don't think we should throw in a test environment; that's likely to produce confusion. Instead, you can inject a "exception handler" that is passed the exception. Normally, it would simply log. In tests it could save the exception and we can add an @After to assert that no exception was thrown. None of the tests should trigger the exception code path (except maybe a test that is testing the exception code path :) ).

@zhangkun83
Copy link
Contributor

Instead, you can inject a "exception handler" that is passed the exception

Not sure if you meant injecting the exception handler to ChannelExecutor, but I am just thinking about that. ManagedChannelImpl, when capturing a throwable, may switch to an "permanent error mode" which would fail subsequent RPCs, which propagate the error very loudly.

@carl-mastrangelo
Copy link
Contributor

FYI, I have been looking at consolidateing our executors (we have many) including the channel exec. If you make a change before I send out mine, please cc me.

@ejona86
Copy link
Member

ejona86 commented Sep 14, 2017

I'm fine with going into a "permanent error mode" if we get an exception with ChannelExecutor. Although the implementation will need to be very simple.

@ejona86 ejona86 changed the title More thoughts on how ChannelExecutor should handle exceptions Go to "permanent error mode" if ChannelExecutor throws exception Nov 8, 2017
@ejona86 ejona86 added this to the Next milestone Nov 8, 2017
@ejona86
Copy link
Member

ejona86 commented Nov 8, 2017

@zhangkun83, assigning this to you since it seemed you were really interested in getting this done; feel free to unassign/reassign.

@ejona86 ejona86 modified the milestones: Next, 1.11 Feb 22, 2018
ejona86 added a commit to ejona86/grpc-java that referenced this issue Feb 27, 2018
It is now discovering legitimate errors in some tests that need to be
resolved first. Follow progress at grpc#3293.
@ejona86
Copy link
Member

ejona86 commented Feb 27, 2018

#4152 disabled the added panic mode. It doesn't delete it, but leaves it non-functional.

We discovered pre-existing tests that had failures cause by exitIdleMode scheduling a timer, which can trigger Thread creation. The timer is to schedule IDLE_TIMEOUT, which is enabled by default to 30 minutes. In certain test environments, Thread creation is black-listed and throws an exception. Previously the exception was caught and logged, but otherwise things worked fine for the test; the test would not last long enough such that the broken IDLE_TIMEOUT would matter.

I thus disabled panic mode so that we can fix up those tests before re-enabling. These tests would all be using InProcessTrasport, and since the recent #4139 it is now possible to pass your own ScheduledExecutorService. These tests could be fixed by passing a fake service of some sort, like TestingExecutors.noOpScheduledExecutor() in Guava's testlib.

@ejona86 ejona86 reopened this Feb 27, 2018
ejona86 added a commit that referenced this issue Feb 27, 2018
It is now discovering legitimate errors in some tests that need to be
resolved first. Follow progress at #3293.
@ejona86 ejona86 modified the milestones: 1.11, 1.12 Mar 13, 2018
@lock lock bot locked as resolved and limited conversation to collaborators Sep 28, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants