
Add support for logs (and traces from access logs -aka RTR logs- ) on cloudfoundryreceiver #32671

Closed
jriguera opened this issue Apr 24, 2024 · 8 comments

Comments

@jriguera
Contributor

Component(s)

receiver/cloudfoundry

Is your feature request related to a problem? Please describe.

Currently the Cloud Foundry receiver supports only metrics, but the Loggregator firehose subsystem (the `rlp_gateway` endpoint) also processes all system and application logs. We would like to be able to read those logs with the OTel Collector (adding a tag by type, e.g. RTR, CELL, SSH, ...). Additionally, if the RTR logs contain the tracing fields, add support to export spans.

Describe the solution you'd like

Extend the converter in https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/cloudfoundryreceiver/converter.go#L22 to generate logs, and traces from logs of type RTR (access logs), in the same way as is done for metrics. Because a log line can come from different sources depending on where it was generated (app, container, gorouter, api, ...), it makes sense to add flags to select the sources of interest. There is also the option to ingest events like app crashes, scaling, etc. More information: https://pkg.go.dev/code.cloudfoundry.org/go-loggregator/rpc/loggregator_v2#pkg-types
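For the log case, something along these lines could work; this is only a rough sketch, and the helper name and attribute keys are illustrative, not part of the existing converter:

```go
package cloudfoundryreceiver

import (
	"code.cloudfoundry.org/go-loggregator/rpc/loggregator_v2"

	"go.opentelemetry.io/collector/pdata/pcommon"
	"go.opentelemetry.io/collector/pdata/plog"
)

// convertEnvelopeToLog is a hypothetical helper: it appends one log record per
// Log-type envelope, keeping the envelope tags as attributes so the source type
// (RTR, CELL, SSH, APP, ...) stays queryable downstream.
func convertEnvelopeToLog(env *loggregator_v2.Envelope, dest plog.LogRecordSlice) {
	logMsg := env.GetLog()
	if logMsg == nil {
		return // not a log envelope (could be a counter, gauge, timer, event, ...)
	}

	record := dest.AppendEmpty()
	record.SetTimestamp(pcommon.Timestamp(env.GetTimestamp()))
	record.Body().SetStr(string(logMsg.GetPayload()))

	// Loggregator marks each line as stdout (OUT) or stderr (ERR).
	if logMsg.GetType() == loggregator_v2.Log_ERR {
		record.SetSeverityNumber(plog.SeverityNumberError)
	} else {
		record.SetSeverityNumber(plog.SeverityNumberInfo)
	}

	record.Attributes().PutStr("org.cloudfoundry.source_id", env.GetSourceId())
	record.Attributes().PutStr("org.cloudfoundry.instance_id", env.GetInstanceId())
	for name, value := range env.GetTags() {
		record.Attributes().PutStr("org.cloudfoundry."+name, value)
	}
}
```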

Describe alternatives you've considered

This is the best option. This component is already in charge of processing metrics, and it makes sense to extend it to receive logs and traces from the same source (the Loggregator firehose). The traditional way is to receive those logs with a component like https://github.com/cloudfoundry-community/firehose-to-syslog and emit them via syslog, but that adds an unnecessary hop and does not allow processing traces.

Additional context

The Loggregator firehose subsystem is in charge of managing metrics, logs and events in Cloud Foundry.

The Cloud Foundry HTTP routing layer, also known as the gorouter, supports tracing, but it does not emit traces by itself. Instead, it decorates the access logs (RTR logs) with tracing information:
my.host.name - [2024-04-24T14:09:44.621845452Z] "GET /health HTTP/1.1" 200 0 15 "-" "NING/1.0" "X.X.X.X:46312" "X.X.X.X:61187" x_forwarded_for:"X.X.X.X, X.X.X.X, X.X.X.X" x_forwarded_proto:"https" vcap_request_id:"07212f87-6146-4207-6d63-84fd1e8cdbc4" response_time:0.003175 gorouter_time:0.000298 app_id:"0e761239-7c71-48e8-bd41-8314d67e74dc" app_index:"1" instance_id:"eb698264-6bc3-4168-58b8-80f6" x_cf_routererror:"-" x_forwarded_host:"my.host.name" x_b3_traceid:"07212f87614642076d6384fd1e8cdbc4" x_b3_spanid:"6d6384fd1e8cdbc4" x_b3_parentspanid:"-" b3:"07212f87614642076d6384fd1e8cdbc4-6d6384fd1e8cdbc4" traceparent:"00-07212f87614642076d6384fd1e8cdbc4-6d6384fd1e8cdbc4-01" tracestate:"gorouter=6d6384fd1e8cdbc4"
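Since the gorouter already injects W3C `traceparent` (and B3) fields into these access log lines, a span could be reconstructed directly from an RTR envelope. A rough sketch under that assumption (the regular expression and span naming below are illustrative only):

```go
package cloudfoundryreceiver

import (
	"encoding/hex"
	"regexp"

	"go.opentelemetry.io/collector/pdata/pcommon"
	"go.opentelemetry.io/collector/pdata/ptrace"
)

// traceparent format: 00-<32 hex trace id>-<16 hex span id>-<2 hex flags>
var traceparentRe = regexp.MustCompile(`traceparent:"00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}"`)

// appendSpanFromRTRLog is a hypothetical helper: it parses the traceparent
// field of an RTR access log line and appends a SERVER span that carries the
// same trace and span IDs the gorouter injected.
func appendSpanFromRTRLog(line string, dest ptrace.SpanSlice) bool {
	m := traceparentRe.FindStringSubmatch(line)
	if m == nil {
		return false // no tracing fields in this access log line
	}

	var traceID [16]byte
	var spanID [8]byte
	if _, err := hex.Decode(traceID[:], []byte(m[1])); err != nil {
		return false
	}
	if _, err := hex.Decode(spanID[:], []byte(m[2])); err != nil {
		return false
	}

	span := dest.AppendEmpty()
	span.SetTraceID(pcommon.TraceID(traceID))
	span.SetSpanID(pcommon.SpanID(spanID))
	span.SetName("gorouter HTTP request")
	span.SetKind(ptrace.SpanKindServer)
	return true
}
```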

@jriguera jriguera added enhancement New feature or request needs triage New item requiring triage labels Apr 24, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Member

Thanks for filing @jriguera, I agree that log support for this receiver would be great. Any help is appreciated if you're interested in contributing!

This is a duplicate of #10025, but we can leave this one open as the other went stale 👍

@crobert-1 crobert-1 removed the needs triage New item requiring triage label Apr 24, 2024
@jriguera
Contributor Author

Oops, sorry, for some reason I did not find the old issue :-/.
I will evaluate with my team whether we can contribute the extended functionality.
Thanks!

@jriguera
Contributor Author

Nice! We will implement it and create a PR. We are also interested in these features:

  1. A configuration parameter to enable/disable each telemetry type: metrics (current implementation), logs, and traces (only from RTR logs, see point 3).
  2. A configuration flag to filter logs by type (source), e.g. ingest only logs from the RTR source or from the app (stdout/stderr).
  3. Ingest RTR logs (aka access logs) as spans instead of logs (if they have the proper tracing attributes).

Regarding point 1, a configuration parameter is probably not needed per se. Each telemetry type can be implemented with its own receiver created by the factory, which is instantiated automatically depending on the pipeline type (logs, metrics, traces). We still need to figure out the implementation details of ingesting envelopes this way; if a parameter turns out to be unnecessary, none will be added.
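For illustration, the per-signal registration could look roughly like this; the stub types, stability levels and constructor names are assumptions, and the exact `go.opentelemetry.io/collector/receiver` signatures vary between collector versions:

```go
package cloudfoundryreceiver

import (
	"context"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/receiver"
)

// Config stands in for the receiver's real configuration struct.
type Config struct{}

func createDefaultConfig() component.Config { return &Config{} }

// nopReceiver stands in for the real metrics/logs receiver implementations.
type nopReceiver struct{}

func (nopReceiver) Start(context.Context, component.Host) error { return nil }
func (nopReceiver) Shutdown(context.Context) error              { return nil }

func createMetricsReceiver(_ context.Context, _ receiver.Settings, _ component.Config, _ consumer.Metrics) (receiver.Metrics, error) {
	return nopReceiver{}, nil
}

func createLogsReceiver(_ context.Context, _ receiver.Settings, _ component.Config, _ consumer.Logs) (receiver.Logs, error) {
	return nopReceiver{}, nil
}

// NewFactory registers one constructor per supported signal; the collector only
// invokes the constructor that matches the pipeline the receiver is placed in,
// so no separate enable/disable configuration flag is needed.
func NewFactory() receiver.Factory {
	return receiver.NewFactory(
		component.MustNewType("cloudfoundry"),
		createDefaultConfig,
		receiver.WithMetrics(createMetricsReceiver, component.StabilityLevelBeta),
		receiver.WithLogs(createLogsReceiver, component.StabilityLevelDevelopment),
	)
}
```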

Regarding point 2, the reason to filter at this stage instead of using a processor later in the pipeline is simply to reduce resource usage. We have to deal with a lot of logs, and for us it makes sense to discard the logs we do not need as early as possible, instead of letting them flow through the pipeline consuming resources.
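A cheap check on the envelope tags before any conversion work would be enough; a minimal sketch, assuming the source type is carried in the envelope's `source_type` tag and that a hypothetical configuration option feeds the allow-list:

```go
package cloudfoundryreceiver

import "code.cloudfoundry.org/go-loggregator/rpc/loggregator_v2"

// shouldIngest drops envelopes whose source type is not in the configured
// allow-list before any conversion work is spent on them. An empty allow-list
// means "ingest everything".
func shouldIngest(env *loggregator_v2.Envelope, allowed map[string]bool) bool {
	if len(allowed) == 0 {
		return true
	}
	return allowed[env.GetTags()["source_type"]] // e.g. "RTR", "APP/PROC/WEB", "CELL"
}
```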

What do you guys think?

@crobert-1
Member

Point 1 shouldn't be necessary; as you pointed out, the same receiver configuration would just be placed in each telemetry pipeline that a user wants to ingest.

Point 2 makes sense, it's good functionality for the receiver to have if necessary. 👍

@jriguera
Contributor Author

We have created a draft PR: #33044. It is still a WIP because the 2nd and 3rd points are not implemented yet.
We still need to sign the CLA, but I hope you can have a look and tell us what you think.

andrzej-stencel pushed a commit that referenced this issue Jul 3, 2024
**Description:** 

Adding support to receive logs from Cloud Foundry.

**Link to tracking Issue:** #32671

**Testing:** Basic testing in line with the current tests

**Documentation:**  

* Add a new section for logs and their attributes.
* Update the behaviour of the ShardID property (`rlp_gateway.shard_id`).

cc @CemDK  @m1rp

---------

Co-authored-by: Cem Deniz Kabakci <[email protected]>
Co-authored-by: Sam Clulow <[email protected]>
Co-authored-by: Cem Deniz Kabakci <[email protected]>
Co-authored-by: Tomás Mota <[email protected]>
Co-authored-by: Tomas Mota <[email protected]>
Co-authored-by: Alex Boten <[email protected]>

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Jul 15, 2024
@crobert-1
Member

Resolved by #33044
