-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ETL-656] Limit input-to-intermediate Lambda batch size to 1 #118
Conversation
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mentioned the bottleneck in our integration testing pipeline: If too many S3 events (new exports) are consumed by the S3 to JSON job, it will timeout. Are you saying that 10 events in a bundle will be too much for the S3 to JSON job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, depending on which exports it's trying to process. The timeout is 2 hours, so more than 6 adult exports would be likely to cause a timeout.
I could have simply increased the timeout, but limiting the batch size to 1 will allow us to process exports in parallel and complete our tests faster. Most of the cost-efficiency gains from a larger batch size come from the reduced setup time of worker instances -- which is a pretty negligible component of the total processing time if it takes us 20 minutes to process one export.
Description: >- | ||
The maximum number of S3 messages in a SQS event that Lambda will process in a batch | ||
|
||
LambdaMaximumBatchingWindowInSeconds: | ||
Type: Number | ||
Default: 300 | ||
Default: 60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it seems like we really won't need this anymore since our S3 to JSON job has a maximum concurrency of ~150 to begin with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, it doesn't make much sense to set it at all. But it doesn't hurt to keep it explicitly defined in the template, either.
See ticket for rationale.