-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry network operations with registries on I/O errors #3351
Retry network operations with registries on I/O errors #3351
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rmannibucau Thanks for the PR, but it looks like some unit tests are broken. Do they pass for you locally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution!
jib-core/src/main/java/com/google/cloud/tools/jib/blob/WritableContentsBlob.java
Outdated
Show resolved
Hide resolved
jib-core/src/main/java/com/google/cloud/tools/jib/blob/Blob.java
Outdated
Show resolved
Hide resolved
jib-core/src/test/java/com/google/cloud/tools/jib/http/FailoverHttpClientTest.java
Show resolved
Hide resolved
jib-core/src/main/java/com/google/cloud/tools/jib/blob/Blob.java
Outdated
Show resolved
Hide resolved
jib-core/src/main/java/com/google/cloud/tools/jib/http/FailoverHttpClient.java
Outdated
Show resolved
Hide resolved
jib-core/src/main/java/com/google/cloud/tools/jib/blob/WritableContentsBlob.java
Outdated
Show resolved
Hide resolved
@@ -312,6 +315,19 @@ private Response call(String httpMethod, URL url, Request request, HttpTransport | |||
httpTransport | |||
.createRequestFactory() | |||
.buildRequest(httpMethod, new GenericUrl(url), request.getHttpContent()) | |||
.setIOExceptionHandler(new HttpBackOffIOExceptionHandler(new ExponentialBackOff()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the maximum number of retries (and the maximum time) until this eventually gives up? We don't want for Jib to run for too long when it's not recoverable (e.g., server is down for a day).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In sec (quoted from the javadoc of the exponential backoff)
request# retry_interval randomized_interval
1 0.5 [0.25, 0.75]
2 0.75 [0.375, 1.125]
3 1.125 [0.562, 1.687]
4 1.687 [0.8435, 2.53]
5 2.53 [1.265, 3.795]
6 3.795 [1.897, 5.692]
7 5.692 [2.846, 8.538]
8 8.538 [4.269, 12.807]
9 12.807 [6.403, 19.210]
10 19.210 BackOff.STOP
Guess another PR can change the retry logic to not use exp backoff but these defaults should tolerate the small hips I saw and just require configuration propagation work so can be done after once the strategies are discussed/decided.
jib-core/src/main/java/com/google/cloud/tools/jib/http/FailoverHttpClient.java
Outdated
Show resolved
Hide resolved
return; | ||
} | ||
// make the test fail | ||
ex.sendResponseHeaders(123, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this works as you intended. 123 actually causes IOException. OTOH, 400 works.
There are some minor style errors, and given above, I'd have the code to look like this:
import static com.google.common.truth.Truth.assertThat;
public void testRetries() throws IOException {
HttpServer server = HttpServer.create(new InetSocketAddress(0), 1);
AtomicBoolean failed = new AtomicBoolean();
server
.createContext("/")
.setHandler(
exchange ->
exchange.sendResponseHeaders(failed.compareAndSet(false, true) ? 123 : 200, 0));
try {
server.start();
int port = server.getAddress().getPort();
List<LogEvent> events = new ArrayList<>();
int returnCode =
new FailoverHttpClient(true, true, events::add)
.get(new URL("http://localhost:" + port), Request.builder().build())
.getStatusCode();
assertThat(returnCode).isEqualTo(200);
assertThat(failed.get()).isTrue();
assertThat(events)
.containsExactly(
LogEvent.warn("GET http://localhost:" + port + " failed and will be retried"));
} finally {
server.stop(0);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will take care of this myself in a follow-up PR (#3390).
jib-core/src/test/java/com/google/cloud/tools/jib/http/FailoverHttpClientTest.java
Show resolved
Hide resolved
Also, there are test failures, so please take a look at them too. |
Just a quick message to say I am AFK and will look when back - until someone beats me at it ;). |
Thanks for the update. Feel free to take your time.
I think adding another public static Blob from(WritableContents writable, boolean retryable) {
return new WritableContentsBlob(writable, retryable);
} |
…rHttpClient.java Co-authored-by: Chanseok Oh <[email protected]>
jib-core/src/main/java/com/google/cloud/tools/jib/blob/WritableContentsBlob.java
Outdated
Show resolved
Hide resolved
jib-core/src/main/java/com/google/cloud/tools/jib/http/FailoverHttpClient.java
Show resolved
Hide resolved
Oops, I tried to resolve the merge conflict but it broke the code. It should be -entry, entry.isFile() ? Blobs.from(entry.getPath()) : Blobs.from(ignored -> {}), true);
+entry, entry.isFile() ? Blobs.from(entry.getPath()) : Blobs.from(ignored -> {}, true)); |
I think this LGTM. Thanks for the contribution! Will take another look in the afternoon. There are a few things that I want to clean up, but I can take care of this myself later. |
jib-core/src/main/java/com/google/cloud/tools/jib/image/ReproducibleLayerBuilder.java
Outdated
Show resolved
Hide resolved
jib-core/src/main/java/com/google/cloud/tools/jib/tar/TarStreamBuilder.java
Outdated
Show resolved
Hide resolved
…mBuilder.java Co-authored-by: Chanseok Oh <[email protected]>
…ucibleLayerBuilder.java Co-authored-by: Chanseok Oh <[email protected]>
Kudos, SonarCloud Quality Gate passed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! I think this will significantly reduce user friction. Something we always wanted to do. Thanks again!
Follow-up: #3390 |
@rmannibucau you can test the the retry feature with 3.1.4. I had to tweak it a bit due to #3422. Thanks again for you contribution! |
Follow up of #1409
It enables http client retry when the blob content is re-readable.