[receiver] Create helper for server-like receivers to replace memory_limiter processor #8632
Comments
Hey @dmitryax and @mx-psi, we are helping out with the HTTP use cases and want to confirm a few things: the suggestion was to build a receiver helper method to wrap the log consumption with the memory limiter. But this means the memory limiting is handled after the HTTP connections are created and the data is received on the server, which still poses a problem of too many resources being used. Instead, there should be some kind of HTTP interceptor (similar to authInterceptor) with the memory limiter baked in, so users can just plug it in when they spawn the HTTP server. Is my understanding correct?
An interceptor sounds like a good approach. We should keep the requirements in mind (I wrote a list in the original post; we can discuss them if they are too stringent, but they seem like a reasonable list to start thinking about this).
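To make the interceptor idea concrete, here is a minimal sketch of what such a wrapper could look like for an HTTP-based receiver. The `memoryChecker` interface and `MustRefuse` method are illustrative placeholders, not the collector's actual memory limiter API; the point is only that the check happens before the request body is read or decompressed.

```go
package memorylimitermw

import "net/http"

// memoryChecker is a hypothetical interface for the memory limiter; the
// collector's real API may look different.
type memoryChecker interface {
	// MustRefuse reports whether the process is currently above its memory limit.
	MustRefuse() bool
}

// withMemoryLimit wraps an HTTP handler and rejects requests with
// 429 Too Many Requests while memory is constrained, before the request
// body is read or decompressed.
func withMemoryLimit(ml memoryChecker, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if ml.MustRefuse() {
			// Hint to well-behaved clients to back off before retrying.
			w.Header().Set("Retry-After", "5")
			http.Error(w, "memory limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

A receiver that spawns its own HTTP server could then plug it in much like the auth interceptor, e.g. `srv.Handler = withMemoryLimit(limiter, mux)`.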
@mx-psi I am not sure if this is the right approach. Here are some use cases that the new design (if any) should ideally cover:
I personally think we should adopt the gRPC status (which inside Google is actually called Task::Status, because it is generic enough for almost all use cases, including disk operations, RPCs, etc.) and return a Status-like error from our Consumer funcs. Then the receiver can read these "canonical" codes and do what is appropriate for them. I know that in some scenarios we may be able to reduce the pressure earlier (e.g. an http.Handler may reject the request before parsing/uncompressing, etc.), but I see this as an optimization on top of the more generic problem we have right now, which is that we don't have a set of canonical codes/errors returned from the pipeline components that lets us understand and react upstream.
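As a rough illustration of what acting on such canonical codes could look like on the receiver side, the sketch below maps a gRPC-style status carried by a Consume* error to an HTTP response code. It assumes consumer errors would wrap a gRPC status, which is the proposal being discussed here, not current collector behavior.

```go
package statuscodes

import (
	"net/http"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// httpStatusFromConsumerError translates the gRPC-style code carried by an
// error returned from a Consume* call into an HTTP response code.
// Hypothetical: it assumes consumer errors wrap a gRPC status.
func httpStatusFromConsumerError(err error) int {
	if err == nil {
		return http.StatusOK
	}
	s, ok := status.FromError(err)
	if !ok {
		// Not a status-bearing error; treat it as an internal failure.
		return http.StatusInternalServerError
	}
	switch s.Code() {
	case codes.ResourceExhausted:
		// Memory (or another resource) limit was hit: ask the client to back off.
		return http.StatusTooManyRequests
	case codes.InvalidArgument:
		return http.StatusBadRequest
	case codes.Unavailable:
		return http.StatusServiceUnavailable
	default:
		return http.StatusInternalServerError
	}
}
```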
To clarify the reported issue: right now, the […]. I like the idea of standardizing on the gRPC codes in the collector, as @bogdandrutu suggested. However, I think that can probably be scoped as a separate effort. If we define that using […], I think the problem that this issue should solve is that the […]
…orylimiter processor (#8964) **Description:** Following #8632, this change introduces the memory limiter as an extension. This allows placing the component where it can reject incoming connections due to limited memory, providing better protection from running out of memory. Missing feature: receiver fairness; the issue where one receiver hogs all the resources can still happen. **Link to tracking Issue:** #8632 --------- Co-authored-by: Dmitry Anoshin <[email protected]>
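With the limiter shipped as an extension, a receiver would need to locate it through the host before it can consult it. The sketch below assumes the collector's `component.Host` interface and a hypothetical `memoryLimiter` interface that the extension would satisfy; the real extension's exported surface may differ.

```go
package example

import (
	"errors"

	"go.opentelemetry.io/collector/component"
)

// memoryLimiter is a hypothetical view of what the memory limiter extension
// could expose; the extension's real exported interface may differ.
type memoryLimiter interface {
	MustRefuse() bool
}

// findMemoryLimiter scans the extensions configured on the host (typically in
// a receiver's Start method) and returns the first one usable as a memory limiter.
func findMemoryLimiter(host component.Host) (memoryLimiter, error) {
	for _, ext := range host.GetExtensions() {
		if ml, ok := ext.(memoryLimiter); ok {
			return ml, nil
		}
	}
	return nil, errors.New("no memory limiter extension configured")
}
```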
I am going to remove this from the Collector 1.0 project. This is a useful enhancement, but I feel it is not in scope: it is an enhancement that can be done without breaking changes, and we already have a (more limited) way of handling this.
The `memory_limiter` processor allows end users to drop telemetry data when a memory limit is reached. It is up to receivers to apply backpressure and handle retrying of errors.

A large class of receivers spins up one or more servers that listen for telemetry in a particular format, map it to pdata, and send it along to the next consumer. For protocols like HTTP and gRPC there are standard ways (429, RESOURCE_EXHAUSTED) to report that memory has been exhausted and apply backpressure to clients.
We should build a server receiver helper that handles this. In particular it should:
Having this helper available, and tackling this for scraper helpers as well, may remove the need for a `memory_limiter` processor.
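For the gRPC half of such a helper, the equivalent backpressure could be a standard unary server interceptor that returns RESOURCE_EXHAUSTED while memory is constrained. This is a sketch under the same assumptions as the HTTP example above (the `memoryChecker` interface is hypothetical):

```go
package grpclimit

import (
	"context"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// memoryChecker mirrors the hypothetical limiter interface from the HTTP sketch.
type memoryChecker interface {
	MustRefuse() bool
}

// unaryMemoryLimit rejects unary RPCs with RESOURCE_EXHAUSTED while memory is
// constrained, before the payload reaches the receiver's consume path.
func unaryMemoryLimit(ml memoryChecker) grpc.UnaryServerInterceptor {
	return func(ctx context.Context, req any, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
		if ml.MustRefuse() {
			// RESOURCE_EXHAUSTED tells gRPC clients this is retryable backpressure.
			return nil, status.Error(codes.ResourceExhausted, "memory limit exceeded")
		}
		return handler(ctx, req)
	}
}
```

It would be registered when the server is created, e.g. `grpc.NewServer(grpc.UnaryInterceptor(unaryMemoryLimit(limiter)))`.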