
New component: GRPC processor/connector #20888

Closed
nicolastakashi opened this issue Apr 13, 2023 · 17 comments

@nicolastakashi
Contributor

The purpose and use-cases of the new component

The OpenTelemetry Collector Contrib repository does an amazing job extending the OTel Collector's capabilities and providing basic functionality that is useful for the entire community.

However, many engineers would love to extend the OpenTelemetry Collector pipeline without maintaining their own Collector distribution, which requires ongoing work to keep track of upstream changes.

With that in mind, it would be amazing to have a generic processor that sends telemetry data to a gRPC server, so engineers can build their own gRPC processors/connectors that manipulate the telemetry data and reply back to the collector. All of this integration should be based on the OTLP protocol and a well-defined contract.

This kind of functionality would improve the experience of extending the OpenTelemetry Collector and also enable integrations with external systems to enrich the telemetry data.

Example configuration for the component

receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch:
  grpc/logs:
    endpoint: grpc-server.com:55690 ## just an example

exporters:
  otlp:
    endpoint: otelcol:4317

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, grpc/logs]
      exporters: [otlp]

Telemetry data types supported

  • Logs
  • Metrics
  • Traces

Is this a vendor-specific component?

  • This is a vendor-specific component
  • If this is a vendor-specific component, I am proposing to contribute this as a representative of the vendor.

Sponsor (optional)

No response

Additional context

We had a short initial conversation on the CNCF Slack channel about this, as you can see here.

@nicolastakashi nicolastakashi added the needs triage label Apr 13, 2023
@jpkrohling
Member

Related: #11772

@nicolastakashi
Contributor Author

Just as a point of reference, as you may know, Jaeger provides a feature to extend the storage backend using gRPC plugins: https://github.com/jaegertracing/jaeger/blob/main/plugin/storage/grpc/README.md

@jpkrohling
Member

@yurishkuro, if you could, would you please share your experience in supporting gRPC plugins, as well as the observed performance implications? My impression is that it's a good enabler, a good hack, but not something folks would use for serious production cases.

@atoulme atoulme added the Sponsor Needed label and removed the needs triage label Apr 13, 2023
@atoulme
Contributor

atoulme commented Apr 13, 2023

I'm not sure I follow - can we just use the OTLP gRPC exporter to export to a gRPC server?

@nicolastakashi
Contributor Author

We can use the OTLP gRPC exporter and receiver to do this, but in my view this approach has two downsides.

1 - The Collector configuration can become a mess, because you need multiple receivers and exporters since you can't reuse the existing receiver (see the config sketch after this list).

Imagine I have a log receiver and I want to use an external server to enrich the logs with some useful information. I can use the OTLP exporter to send them to the external service, and the external service can send them back to the collector, but I can't receive the log entries on the original receiver, because that may lead to a loop. This forces me to add another receiver to get the data back, which requires users to open more ports on their collector.

2 - Using a receiver and exporter instead of a processor/connector feels like I'm not handling the data through the pipeline. Maybe this is just a semantic view I have, but it looks better to have a clear flow like Receiver > Process > Export.
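
To make point 1 concrete, here is a minimal sketch of the extra receiver such a roundtrip would require; the receiver name and ports are only illustrative:

receivers:
  otlp: ## normal ingest on the default port
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/enriched: ## extra receiver just to get the enriched data back
    protocols:
      grpc:
        endpoint: 0.0.0.0:14317 ## another port users have to open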

@jpkrohling
Member

@atoulme, I understood the proposal as building a gRPC processor that would allow people to build gRPC plugins (custom processors hosted in external binaries), likely using https://github.com/hashicorp/go-plugin .

@yurishkuro
Member

I would avoid hashicorp plug-in lib for this. It's not significantly different from a plain grpc server, but imposes a sidecar pattern that it manages. There are other ways of running something as a sidecar. Having just a plain grpc interface decouples this from the sidecar consideration.

@atoulme
Contributor

atoulme commented Apr 29, 2023

We discussed this during the SIG meeting. Apologies if the conversation was hard to follow, @nicolastakashi; I understand you had audio issues. If you watch the recording, you may find more context around the discussion.

The folks present in the discussion explained that to them this seemed like a departure from the pipeline model, where a processor typically runs in-process and avoids making synchronous calls to external, I/O-bound, or network-bound processes or devices. The reason is to keep that complexity out of processors: we try to organize around the lifecycle of the pipeline execution by applying backpressure, retries, and other scaling mechanisms at that level.

The same folks expressed that, in their view, the way to organize what you envision is to use an OTLP exporter and a separate pipeline with an OTLP receiver. This allows asynchronous behavior and doesn't require implementation work.
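
For concreteness, here is a minimal sketch of what that layout could look like; the component names, endpoints, and ports are only illustrative:

receivers:
  otlp:
    protocols:
      grpc:
  otlp/return:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14317 ## data coming back from the external service

exporters:
  otlp/enricher:
    endpoint: enricher.example.com:4317 ## the external custom server
  otlp:
    endpoint: otelcol-backend:4317 ## final destination

service:
  pipelines:
    logs/out:
      receivers: [otlp]
      exporters: [otlp/enricher]
    logs/in:
      receivers: [otlp/return]
      exporters: [otlp]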

Please comment here or feel free to participate in the SIG meetings (we now have several across different time zones) if you'd like to follow up, and thank you for putting this idea together.

@yurishkuro
Member

yurishkuro commented Apr 29, 2023

Just to clarify my understanding. I think the OP is proposing this:

graph LR
    R1[Receiver] --> |"(1)"| P1[Processor]      
    P1 --> |"(2) Remote gRPC"| P2
    P2 --> |"(3) resp"| P1
    P1 --> |"(4)"| E(Exporter)
    subgraph Main OTEL Collector
      R1
      P1
      E
    end
    subgraph Custom  Server
      P2(Plugin Processor)
    end

@atoulme's response suggests putting the custom logic in the "Plugin Server" after the main exporter and doing whatever is needed there:

graph LR
    R1 --> P1      
    P1 --> E1
    E1 --> R2
    R2 --> P2

    subgraph Main OTEL Collector
      R1[Receiver]
      P1[Processor]
      E1(Exporter)
    end
    subgraph Custom Server
      R2[Receiver]
      P2[Custom handling]
    end

The problem with the latter is that if OP wants the custom logic to be implemented in a different language, they have to throw away all of the OTEL Collector's standard processing & export infrastructure and roll their own. Whereas all they want is to introduce an additional enrichment step into the pipeline.

FWIW, the concerns about blocking I/O inside the main processing pipeline are certainly valid, given that the pipeline was not initially designed with that in mind.

@atoulme
Contributor

atoulme commented Apr 29, 2023

Thanks for the graphs. This is what I'm describing:

graph TD
    R1 --> P1      
    P1 --> E1
    E1 --> R2
    R2 --> P2
    P2 --> E2
    E2 --> R3
    R3 --> P3
    P3 --> E3

    subgraph Main OTEL Collector
      R1[Receiver]
      P1[Processor]
      E1(OTLP Exporter)

      R3[OTLP Receiver]
      P3[Processor]
      E3(Exporter)
    end
    subgraph Custom Server
      R2[OTLP Receiver]
      P2[Custom handling]
      E2[OTLP Exporter]
    end

@yurishkuro
Member

I think OP already stated that this is not the solution they are looking for. If they had the option of coding custom processing in Go, then why even bother with a separate process?

@atoulme
Contributor

atoulme commented Apr 30, 2023

I think OP already stated that this is not the solution they are looking for.

I am just relaying the discussion from the meeting.

If they had the option of coding custom processing in Go, then why even bother with a separate process?

OP mentioned this in their motivation: it is to avoid building their own distribution of the collector.

@nicolastakashi
Contributor Author

Hey folks!
@atoulme, thanks for all the support. You're right, that day I faced some audio issues, but I got almost all the context from the call.

After a few days thinking about it, I do agree with you that adding these remote calls to the collector pipeline may require more effort in terms of operations, such as state management. To be honest, as I said, that is a fair argument for not including this kind of integration in the collector.

On the other hand, I would like to highlight some other points:

1 - As @yurishkuro mentioned, this kind of integration may let people from other language communities extend the OpenTelemetry Collector without knowing how to write Go.

2 - It opens a good door for the community to build their own processors that are useful in other contexts but don't belong in this repository; to track this kind of processor we could build something like an awesome-otel-collector list.

Again, I understand if the working group decides this is not a use case that OTel wants to handle, but I think it brings some nice opportunities.

@nicolastakashi
Contributor Author

Coming back to this, folks!
Do you think this is still pertinent?
I slept on it a bit, and I think the risks you raised are fair enough, but the people considering this approach may well be aware of those risks, and it would open a huge door for people outside the Go ecosystem to build their own pipelines.

@atoulme
Contributor

atoulme commented Jul 12, 2023

I brought this up again during the SIG meeting on 7/12 as a discussion item regarding open-telemetry/opentelemetry-collector#7961

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@github-actions
Contributor

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned Nov 10, 2023