Textract has several quotas. Of particular concern are:
10 requests/second quota on GetDocumentTextDetection
This is the function used to fetch paginated results for a scanned document. For long PDFs it may be called many times for a single document. OCR results are processed asynchronously, whenever Textract finishes a PDF, so we can't control exactly when this function will be called.
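To make the pagination concern concrete, here is a minimal sketch of draining every page for one finished job while spacing calls out. It assumes a boto3 Textract client is passed in as `client`; `fetch_all_blocks` and `min_interval` are hypothetical names, and the throttle is per-process only, so it does not coordinate calls across concurrent lambdas.

```python
import time


def fetch_all_blocks(client, job_id, min_interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Collect all Blocks for a finished job, at most one call per min_interval seconds.

    min_interval=0.1 corresponds to 10 requests/second for this process alone.
    """
    blocks = []
    next_token = None
    last_call = None
    while True:
        # Wait until at least min_interval has passed since the previous call.
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        kwargs = {"JobId": job_id}
        if next_token:
            kwargs["NextToken"] = next_token
        last_call = clock()
        page = client.get_document_text_detection(**kwargs)
        blocks.extend(page.get("Blocks", []))
        # A missing NextToken means this was the last page.
        next_token = page.get("NextToken")
        if not next_token:
            return blocks
```

In a Lambda this might be called as `fetch_all_blocks(boto3.client("textract"), job_id)` once the job completion notification arrives.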
Maximum number of asynchronous jobs per account that can simultaneously exist: 600
We have about 780,000 documents to process, which means we will need to limit the rate at which we start async jobs.
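A rough back-of-the-envelope estimate shows how the two quotas bound the backlog. The sketch below is only an illustration: the average job duration and the number of GetDocumentTextDetection calls per document are assumptions we would need to measure, and `estimate_backlog_hours` is a hypothetical helper.

```python
def estimate_backlog_hours(total_docs, max_concurrent_jobs, avg_job_seconds,
                           get_calls_per_doc=1, get_calls_per_second=10):
    """Lower bound on wall-clock hours, taking the slower of the two quotas."""
    # With N job slots and jobs lasting T seconds, throughput is at most N / T docs per second.
    job_bound_seconds = total_docs * avg_job_seconds / max_concurrent_jobs
    # Every document needs at least one result-fetch call, rate-limited account-wide.
    fetch_bound_seconds = total_docs * get_calls_per_doc / get_calls_per_second
    return max(job_bound_seconds, fetch_bound_seconds) / 3600


# Assumed numbers: 60 s per job, 3 result pages per document.
print(estimate_backlog_hours(780_000, 600, 60, get_calls_per_doc=3))
```

Under those assumed numbers the result-fetch quota, not the 600-job cap, is the bottleneck, which is worth checking against real measurements before tuning concurrency.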
A possible solution:
Limit the number of concurrent lambdas processing documents so we never exceed 600 async jobs in flight at any time.
Set a high number of retries on the SQS queue so failures simply get rescheduled.
Use a dead letter queue to catch anything that still fails after the maximum number of tries so we can resend it.
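The three steps above could be wired up roughly as follows. This is a sketch, not a tested setup: it assumes boto3 SQS and Lambda clients are passed in, and the function name, queue URL, ARN and the numeric values are all placeholders. Note that reserved concurrency caps concurrent Lambda invocations, which only approximates the cap on in-flight Textract jobs.

```python
import json


def configure_throttled_pipeline(sqs, lambda_client, queue_url, dlq_arn,
                                 function_name, max_concurrency, max_receive_count):
    """Cap Lambda concurrency and route repeatedly failing messages to a DLQ."""
    # After max_receive_count failed receives, SQS moves the message to the DLQ.
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    }
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={"RedrivePolicy": json.dumps(redrive_policy)},
    )
    # Reserved concurrency limits how many copies of the function run at once.
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=max_concurrency,
    )
```

With real clients this might be called as `configure_throttled_pipeline(boto3.client("sqs"), boto3.client("lambda"), queue_url, dlq_arn, "process-document", 100, 50)`, where every argument value is illustrative.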
This may require some testing to get right.