proposal for correlating synthetics traces #825

Merged
merged 7 commits into from
Sep 27, 2023
Changes from 3 commits
119 changes: 119 additions & 0 deletions specs/integrations/synthetics.md
@@ -0,0 +1,119 @@
## Synthetics Integration

Synthetic monitors play a crucial role in periodically checking the status of your services and applications on a global scale. General documentation about synthetic monitors can be found on the
[Synthetics getting started page](https://www.elastic.co/guide/en/observability/current/synthetics-get-started.html).

This integration goes into more detail about how Synthetics monitors are
correlated with APM traces. Synthetics traces can be categorized into two
main types:
1. HTTP checks - These have a one-to-one mapping with APM transactions
2. Browser checks - These have a one-to-many mapping with APM transactions

### Correlation

The Synthetics agent (including Heartbeat) takes the responsibility of creating the
[`traceparent`](../agents/tracing-distributed-tracing.md#trace_id-parent_id-and-traceparent)
header for each outgoing network request associated with a test during every
monitor execution.

- `trace.id` and `parent.id`
  - outgoing requests that are explicitly traced by the Synthetics agent
    will have the `parent.id` and `trace.id` as part of the trace context.
  - must be unique for each step of a browser monitor
  - must be unique for an HTTP monitor
- `sampled` flag
  - used to control the sampling decision for all the downstream services.
  - 100% sampling when tracing is enabled
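
As a rough illustration of the fields above, the sketch below builds a W3C `traceparent` value with a fresh `trace.id` and `parent.id` and the `sampled` flag set. The helper names (`randomHex`, `newTraceContext`, `toTraceparent`) are hypothetical and are not taken from the Synthetics agent or Heartbeat. `Math.random` is used only to keep the sketch dependency-free; a real agent would presumably use a cryptographically secure source.

```ts
// Hypothetical helper names; not the Synthetics agent's actual implementation.
interface TraceContext {
  traceId: string;  // 16 bytes, 32 lowercase hex characters
  parentId: string; // 8 bytes, 16 lowercase hex characters
  sampled: boolean;
}

// Returns `bytes` random bytes encoded as lowercase hex.
// Assumption: Math.random is acceptable for a sketch only.
function randomHex(bytes: number): string {
  let out = "";
  for (let i = 0; i < bytes; i++) {
    const b = Math.floor(Math.random() * 256);
    out += (b < 16 ? "0" : "") + b.toString(16);
  }
  return out;
}

// One fresh context per browser step or per HTTP monitor run,
// always sampled when tracing is enabled.
function newTraceContext(): TraceContext {
  return { traceId: randomHex(16), parentId: randomHex(8), sampled: true };
}

// Serializes as version "00", trace id, parent id, and trace flags.
function toTraceparent(ctx: TraceContext): string {
  const flags = ctx.sampled ? "01" : "00";
  return ["00", ctx.traceId, ctx.parentId, flags].join("-");
}
```

W3C Trace Context treats all-zero ids as invalid; the sketch does not guard against that unlikely draw.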

#### Browser checks

When executing a Synthetics journey with APM tracing enabled for specific URLs
using the `--apm_tracing_urls` flag, the Synthetics agent takes the following
actions:

1. Adds the `traceparent` header to each matching outgoing request.
2. Includes `trace.id` and `parent.id` in all the Step Elasticsearch (ES) documents for the journey.

```ts
// run journey:
//   npx @elastic/synthetics --apm_tracing_urls "elastic.co/*"

// example.journey.ts
import { journey, step } from "@elastic/synthetics";

journey("elastic e2e", ({ page }) => {
  // Each step navigates to a URL that matches the tracing pattern.
  step("home page", async () => {
    await page.goto("https://www.elastic.co");
  });
  step("blog page", async () => {
    await page.goto("https://www.elastic.co/blog");
  });
});
```
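
How `--apm_tracing_urls` patterns are matched is not spelled out in this proposal. The sketch below assumes a simple glob in which `*` matches any run of characters and the pattern may match anywhere in the request URL. `globToRegExp` and `shouldTrace` are hypothetical names, and the real agent may apply different matching rules.

```ts
// Assumption: `*` is the only wildcard and patterns are unanchored.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*");                 // expand the glob wildcard
  return new RegExp(escaped);
}

// Returns true when the URL matches at least one configured pattern,
// i.e. when the request would receive a `traceparent` header.
function shouldTrace(url: string, patterns: string[]): boolean {
  return patterns.some((p) => globToRegExp(p).test(url));
}
```

Under these assumptions, `shouldTrace("https://www.elastic.co/blog", ["elastic.co/*"])` is true, while a request to an unrelated third-party host is left untouched.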

Example of the tracing information added to the ES documents for two steps in the journey:

```json
// Step - home page
{"type":"step/end","journey":{"name":"elastic e2e"},"step":{"name":"home page","index":1,"status":"failed","duration":{"us":17382122}}, "trace.id": "xxx"}
{"type":"journey/network_info","journey":{"name":"elastic e2e"},"step":{"name":"home page","index":1},"http":{"request":{"url":"http://www.elastic.co/","method":"GET"}},"trace.id": "t1", "transaction.id": "tr1"}


// Step - blog page
{"type":"step/end","journey":{"name":"elastic e2e"},"step":{"name":"blog page","index":2,"status":"failed","duration":{"us":17382122}}, "trace.id": "xxx"}
{"type":"journey/network_info","journey":{"name":"elastic e2e"},"step":{"name":"blog page","index":2},"http":{"request":{"url":"http://www.elastic.co/blog","method":"GET"}},"trace.id": "t1", "transaction.id": "tr2"}
```

With this tracing information available in the ES documents for each step's network requests, the Synthetics UI can link back to the individual backend transactions in the APM UI.
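
To make the one-to-many mapping concrete, the sketch below groups the `transaction.id` values of `journey/network_info` documents by step index. The `SyntheticsDoc` type and `transactionsByStep` function are hypothetical and model only the fields that appear in the example documents; they are not part of the Synthetics UI.

```ts
// Only the fields used here, based on the example documents above.
interface SyntheticsDoc {
  type: string;
  step: { name: string; index: number };
  "trace.id"?: string;
  "transaction.id"?: string;
}

// Groups transaction ids by step index, reading only network_info documents.
function transactionsByStep(docs: SyntheticsDoc[]): Record<number, string[]> {
  const byStep: Record<number, string[]> = {};
  for (const doc of docs) {
    const txId = doc["transaction.id"];
    if (doc.type !== "journey/network_info" || txId === undefined) continue;
    const idx = doc.step.index;
    const list = byStep[idx] || [];
    list.push(txId);
    byStep[idx] = list;
  }
  return byStep;
}
```

Each entry in the result is the set of backend transactions a step's waterfall rows could link to.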

#### HTTP checks

For the HTTP monitor below:

```yml
# heartbeat.yml
heartbeat.monitors:
- type: http
  id: test-http
  urls: ["https://www.example.com"]
  apm:
    enabled: true
```

Heartbeat would add the `traceparent` header to the request sent to the
monitored URL and add the other tracing-related information to the ES documents.

```json
{"event":{"action":"monitor.run"},"monitor":{"id":"test-http","type":"http","status":"up","duration":{"ms":112}}, "trace.id": "t1", "transaction.id": "tr1"}
```

Note that the Synthetics UI has no dedicated waterfall information for HTTP checks. Consequently, the link takes you directly to the APM transaction, provided the backend is also traced by Elastic APM or OpenTelemetry (OTEL)-based agents.

**NOTE: The correlation remains applicable even if downstream services are traced by OpenTelemetry (OTEL)-based agents. This ensures a consistent tracing experience regardless of the underlying tracing infrastructure.**

### Identifying Synthetics traces

Synthetics monitor executions create `rootless traces`, as the Synthetics side
of these traces is not reported to the APM server. To overcome this limitation
in the APM UI, we need to identify the Synthetics traces and explicitly link
them to the Synthetics waterfall view.

- `http.headers.user-agent`:
  - Contains `Elastic/Synthetics` for all outgoing requests from Synthetics-based monitors.

There is a limitation with this approach:
Member

Another challenge is that native OTel agents don't capture all HTTP headers by default, IINM.

Member Author

@vigneshshanmugam vigneshshanmugam Sep 7, 2023

I believe we can't do much here, except that users would see a rootless trace unless Synthetics itself is also instrumented using APM?

I can add this to the limitation list if it's a huge concern?

Member

I don't think it's a huge concern but we should still add it to the limitations.
The concern is not necessarily about the trace being rootless. I think it's more about that we aren't able to tell synthetics requests apart from normal requests. I think that's a limitation we can live with but it doesn't allow us to build features like "show only requests from synthetics".

Contributor

@gregkalapos gregkalapos Sep 27, 2023

The requirement level of the user_agent.original attribute (which is the Value of the HTTP User-Agent header sent by the client.) is recommended. I also think this is trivial to capture, so I expect most OTel implementations will have this out of the box.

Furthermore, what we could also do is this: for some languages, it's likely we'll end up with custom OTel distributions - with that we have control on how the OTel agent is configured. To be on a safe-side, we could add something like this to this spec:

Custom Elastic OTel agent distributions MUST capture the user_agent.original attribute in order to enable APM-Synthetic correlation.

In most cases, this will have no effect, since I expect vanilla OTel agents already do so. Nevertheless, with this we can mitigate the issue of not being able to recognize synthetic calls in OTel agents.

- users can override the `User-Agent` header in the monitor configuration, which
  might lead to them seeing only partial traces in the APM UI.

We can also add a foolproof solution by introducing a vendor-specific `tracestate`
property.

- `tracestate`:
  - Contains `es:origin=synthetics` for all outgoing requests from Synthetics-based monitors.
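
The two signals described above could be combined roughly as follows. This is a sketch under stated assumptions: `isSyntheticsRequest` is a hypothetical function, headers are assumed to arrive as a plain string map, and the `tracestate` marker is matched as the literal text written in this proposal rather than parsed as a full `tracestate` list.

```ts
const SYNTHETICS_USER_AGENT = "Elastic/Synthetics";
// Marker text exactly as written in this proposal (assumed, not parsed).
const SYNTHETICS_TRACESTATE = "es:origin=synthetics";

// Returns true if either the User-Agent or the tracestate header
// marks the request as coming from a Synthetics-based monitor.
function isSyntheticsRequest(headers: Record<string, string>): boolean {
  // HTTP header names are case-insensitive, so normalize the keys first.
  const lower: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    lower[key.toLowerCase()] = headers[key] || "";
  }
  const userAgent = lower["user-agent"] || "";
  const tracestate = lower["tracestate"] || "";
  return (
    userAgent.indexOf(SYNTHETICS_USER_AGENT) !== -1 ||
    tracestate.indexOf(SYNTHETICS_TRACESTATE) !== -1
  );
}
```

Checking `tracestate` as well keeps the detection working when a user overrides the `User-Agent` header in the monitor configuration.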


When a trace is confirmed to have originated from Synthetics-based monitors, the
Trace Explorer view can be linked back to the Synthetics waterfall view.

- `/app/synthetics/link-to/<trace.id>`
  - links back to the specific browser waterfall step in the Synthetics UI, and
    it follows the format `/monitor/:monitorId/test-run/:runId/step/:stepIndex`.
  - `runId` is internal to Synthetics and is also available in the ES step documents.
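
The target of the redirect can be sketched as below. How the step document is looked up from `trace.id` is not specified here, so the sketch assumes that lookup has already produced the three values; `StepRef` and `stepWaterfallPath` are hypothetical names.

```ts
// Values assumed to come from the ES step document matched by trace.id.
interface StepRef {
  monitorId: string;
  runId: string;
  stepIndex: number;
}

// Fills in the `/monitor/:monitorId/test-run/:runId/step/:stepIndex` format.
function stepWaterfallPath(ref: StepRef): string {
  return (
    "/monitor/" + encodeURIComponent(ref.monitorId) +
    "/test-run/" + encodeURIComponent(ref.runId) +
    "/step/" + ref.stepIndex
  );
}
```

The ids are URL-encoded so that a monitor id containing reserved characters still yields a valid path segment.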