Traces are not available after 30 minutes on Windows #3552

Closed
teyyubismayil opened this issue Apr 9, 2024 · 5 comments · Fixed by #3568

@teyyubismayil
Contributor

Describe the bug
I have a simple grafana-agent and Tempo setup running on Windows as executables. My application pushes traces through grafana-agent, but only traces from the last 30 minutes are found when querying. Searching for a trace older than 30 minutes returns 404 Not Found with the body "trace not found".

To Reproduce
Steps to reproduce the behavior:

  1. Start Tempo on Windows as an executable
  2. Push traces to Tempo
  3. Query for a trace that is older than 30 minutes

Expected behavior
The trace should be found, but 404 Not Found with the body "trace not found" is returned instead.

Environment:

  • OS - Windows
  • Tempo version - 2.4.1 (amd64)

Additional Context
Configuration:

stream_over_http_enabled: true
server:
  http_listen_port: 3200
  log_level: info
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
compactor:
  compaction:
    block_retention: 24h
storage:
  trace:
    backend: local
    wal:
      path: D:/somefolder/tempo/wal
    local:
      path: D:/somefolder/tempo/blocks

@joe-elliott
Member

I would guess that the querier component is not correctly seeing the blocks. Perhaps there is a polling issue? Can you check metrics such as tempodb_blocklist_length to see if the querier is aware of blocks in the backend?
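The metric check suggested above can be scripted. This is a minimal sketch, assuming Tempo serves Prometheus-format metrics at /metrics on the http_listen_port from the configuration (3200); the function names find_metric_lines and fetch_metrics are illustrative and not part of Tempo.

```python
import urllib.request


def find_metric_lines(metrics_text, metric_name):
    """Return the sample lines for metric_name from Prometheus text output."""
    matches = []
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like "<name>{labels} <value>" or "<name> <value>".
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name == metric_name:
            matches.append(line)
    return matches


def fetch_metrics(url="http://localhost:3200/metrics"):
    """Download the metrics page (the URL is an assumption based on the config)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling find_metric_lines(fetch_metrics(), "tempodb_blocklist_length") returns an empty list when the metric is not exposed at all, which is the situation this thread goes on to describe.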

@teyyubismayil
Contributor Author

These are the metric values related to the blocklist:

# HELP tempodb_blocklist_poll_duration_seconds Records the amount of time to poll and update the blocklist.
# TYPE tempodb_blocklist_poll_duration_seconds histogram
tempodb_blocklist_poll_duration_seconds_bucket{le="0"} 0
tempodb_blocklist_poll_duration_seconds_bucket{le="60"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="120"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="180"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="240"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="300"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="360"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="420"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="480"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="540"} 214
tempodb_blocklist_poll_duration_seconds_bucket{le="+Inf"} 214
tempodb_blocklist_poll_duration_seconds_sum 13.568723699999998
tempodb_blocklist_poll_duration_seconds_count 214
# HELP tempodb_blocklist_tenant_index_age_seconds Age in seconds of the last pulled tenant index.
# TYPE tempodb_blocklist_tenant_index_age_seconds gauge
tempodb_blocklist_tenant_index_age_seconds{tenant="single-tenant"} 0
# HELP tempodb_blocklist_tenant_index_builder A value of 1 indicates this instance of tempodb is building the tenant index.
# TYPE tempodb_blocklist_tenant_index_builder gauge
tempodb_blocklist_tenant_index_builder{tenant="single-tenant"} 1
# HELP tempodb_blocklist_tenant_index_errors_total Total number of times an error occurred while retrieving or building the tenant index.
# TYPE tempodb_blocklist_tenant_index_errors_total counter
tempodb_blocklist_tenant_index_errors_total{tenant="single-tenant"} 1
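
As a quick sanity check on the histogram above, the mean poll duration is the _sum value divided by the _count value. A minimal sketch of that arithmetic, using the two numbers from the output (the variable names are illustrative):

```python
# Values copied from the metrics output above.
poll_duration_sum_seconds = 13.568723699999998
poll_count = 214

# Mean duration of a single blocklist poll, in seconds.
mean_poll_seconds = poll_duration_sum_seconds / poll_count
print(f"mean poll duration: {mean_poll_seconds:.3f} s")
```

That works out to roughly 0.06 s per poll, so polling is running and finishing quickly. These numbers alone do not show whether any blocks were found.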

@joe-elliott
Member

Since tempodb_blocklist_length does not appear in the output, the querier believes there are no blocks in the backend. That is why you are not seeing older traces.

Can you check the configured paths to see if they contain any blocks, or review the logs for polling errors?

I'm not familiar with Windows, so it's difficult to comment on the configured paths. Should they use \s instead?
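
The path check suggested above could be done with a short script. This is a sketch only, assuming the local backend lays blocks out as <blocks path>/<tenant>/<block id>/ and that a complete block directory contains a meta.json file; list_blocks is a hypothetical helper, not a Tempo tool.

```python
from pathlib import Path


def list_blocks(blocks_root):
    """Map each tenant directory to the block directories that hold a meta.json."""
    root = Path(blocks_root)
    found = {}
    if not root.is_dir():
        # Nothing to report if the configured path does not exist.
        return found
    for tenant_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Assumption: a block counts as present only if meta.json exists in it.
        found[tenant_dir.name] = sorted(
            block.name
            for block in tenant_dir.iterdir()
            if block.is_dir() and (block / "meta.json").is_file()
        )
    return found
```

For the configuration in this issue, the call would be list_blocks("D:/somefolder/tempo/blocks"). A tenant with an empty list has directories on disk but no block that this check considers complete.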

@teyyubismayil
Contributor Author

These are the error and warn logs:

level=error ts=2024-04-11T07:55:32.8295609Z caller=poller.go:225 msg="failed to pull bucket index for tenant. falling back to polling" tenant=single-tenant err="does not exist"
level=warn ts=2024-04-11T07:55:32.8928056Z caller=modules.go:259 msg="metrics-generator is not configured." err="no metrics_generator.storage.path configured, metrics generator will be disabled"
level=warn ts=2024-04-11T07:55:32.8938052Z caller=modules.go:287 msg="Worker address is empty in single binary mode. Attempting automatic worker configuration. If queries are unresponsive consider configuring the worker explicitly." address=127.0.0.1:9095
level=warn ts=2024-04-11T07:55:32.9104533Z caller=wal.go:116 msg="unowned file entry ignored during wal replay" file=blocks err=null

I checked the Tempo data folder and it has data. This is the structure:

├───tempo
│   ├───blocks
│   │   └───single-tenant
│   │       ├───01a0be7d-9da3-45a1-953b-96e07b795305
│   │       ├───01cd7f2d-b1a8-4caa-94f8-64b15bae01bd
......................................................
│   └───wal
│       ├───551f37dd-38ef-4590-bd2a-f9b9a9996380+single-tenant+vParquet3
│       └───blocks
│           └───single-tenant
│               ├───596f4cfe-e3e7-4cca-9e6b-17e30505eeaf
│               ├───7c4a8965-7250-43d8-9062-72b7184c6bd4
......................................................

I tried with \s in the path and the result is the same.

Also another issue is that tempo data folder size is only growing. It looks like old files are not deleted.

@joe-elliott
Member

> Also another issue is that tempo data folder size is only growing. It looks like old files are not deleted.

Right. It seems like all components are failing to discover the blocks for your tenant. This means querying, compaction, and retention will not work. Perhaps there is a bug in the local backend on Windows.
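
The thread does not identify the cause. Purely as an illustration of one way a Windows-only path bug can arise in general, and not as a diagnosis of Tempo's code, the sketch below shows how splitting a native Windows path on "/" fails to recover the tenant and block components. The helper names are hypothetical; the example path reuses the folder and block ID from the listing earlier in the thread.

```python
from pathlib import PureWindowsPath


def split_on_forward_slash(path_str):
    """Naive split that assumes "/" is the only separator."""
    return [part for part in path_str.split("/") if part]


def split_windows_aware(path_str):
    """Split using Windows path rules; PureWindowsPath works on any OS."""
    return list(PureWindowsPath(path_str).parts)


# A native Windows path as the OS might report it during a directory walk.
native = r"D:\somefolder\tempo\blocks\single-tenant\01a0be7d-9da3-45a1-953b-96e07b795305"

naive_parts = split_on_forward_slash(native)
aware_parts = split_windows_aware(native)

# The naive split yields one piece, so tenant and block ID cannot be read from it.
print(len(naive_parts))
# The Windows-aware split ends with the tenant and the block ID.
print(aware_parts[-2:])
```

Code that builds paths with one separator and parses them with another would see no tenants or blocks on Windows while working on Linux, which would match the symptoms here. Whether that is what happens in Tempo is not established in this thread.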
