collector SIGSEGV #2708

Closed
fld-opensource opened this issue Mar 16, 2021 · 2 comments
Labels
bug Something isn't working

Comments

fld-opensource (Contributor) commented Mar 16, 2021

Describe the bug
Using the nginx receiver with no nginx server present, the collector dereferences an invalid pointer and crashes.

Steps to reproduce
$> opentelemetry-collector-builder --config the_following_yaml

What did you expect to see?
Not a crash :)

What did you see instead?

2021-03-16T13:01:28.110+0100 INFO service/service.go:267 Everything is ready. Begin running and processing data.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xffc676]

goroutine 112 [running]:
go.opentelemetry.io/collector/consumer/pdata.ResourceMetricsSlice.Len(...)
go.opentelemetry.io/[email protected]/consumer/pdata/generated_metrics.go:52
go.opentelemetry.io/collector/receiver/scraperhelper.metricCount(0x0, 0xc00077f1a0)
go.opentelemetry.io/[email protected]/receiver/scraperhelper/scraper.go:160 +0x26
go.opentelemetry.io/collector/receiver/scraperhelper.resourceMetricsScraper.Scrape(0x34db9e0, 0xc00058b4d0, 0x2e73235, 0x5, 0xc00058b4b0, 0x3520460, 0xc00077f1a0, 0xc0002685c8, 0x5, 0x0, ...)
go.opentelemetry.io/[email protected]/receiver/scraperhelper/scraper.go:153 +0x110
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).scrapeMetricsAndReport(0xc00042e0e0, 0x35203e0, 0xc000120000)
go.opentelemetry.io/[email protected]/receiver/scraperhelper/scrapercontroller.go:204 +0x12e
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).startScraping.func1(0xc00042e0e0)
go.opentelemetry.io/[email protected]/receiver/scraperhelper/scrapercontroller.go:186 +0x84
created by go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).startScraping
go.opentelemetry.io/[email protected]/receiver/scraperhelper/scrapercontroller.go:175 +0x3f

What version did you use?
Version: 0.22

What config did you use?
Config:

receivers:
  nginx:
    endpoint: "http://localhost:44444/nginx_status"
    collection_interval: 1s

processors:
  batch:

exporters:
  logging:
    logLevel: debug

service:
  pipelines:
    metrics:
      receivers: [nginx]
      processors: [batch]
      exporters: [logging]

Environment
OS: RHEL8.0
Go version: go1.14.12 linux/amd64; the same crash occurs with go1.16

Additional context

The nginx scraper returns an error (which is justified), but the error is not checked in the calling function receiver/scraperhelper/scraper.go::Scrape(): the code unconditionally calls metricCount(resourceMetrics) on an undefined resourceMetrics.

I have identified the same pattern twice in this file.
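For illustration, a minimal sketch of the failure mode (not the collector's own code, and assuming the v0.22 pdata API, where the zero-value slice wraps a nil internal pointer): a scraper that fails typically returns the zero value together with an error, and calling Len() on that zero value is what produces the SIGSEGV shown above.

package main

import "go.opentelemetry.io/collector/consumer/pdata"

func main() {
	// A failing scraper returns something like: return pdata.ResourceMetricsSlice{}, err
	// The zero value carries a nil internal slice pointer, so Len() dereferences nil.
	var rms pdata.ResourceMetricsSlice
	_ = rms.Len() // panic: invalid memory address or nil pointer dereference
}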

Fix
The following patch fixes the problem.

diff --git a/receiver/scraperhelper/scraper.go b/receiver/scraperhelper/scraper.go
index 06e0aa2b..2c5cb7e6 100644
--- a/receiver/scraperhelper/scraper.go
+++ b/receiver/scraperhelper/scraper.go
@@ -111,7 +111,9 @@ func (ms metricsScraper) Scrape(ctx context.Context, receiverName string) (pdata
        ctx = obsreport.ScraperContext(ctx, receiverName, ms.Name())
        ctx = obsreport.StartMetricsScrapeOp(ctx, receiverName, ms.Name())
        metrics, err := ms.ScrapeMetrics(ctx)
-       obsreport.EndMetricsScrapeOp(ctx, metrics.Len(), err)
+       if err == nil {
+               obsreport.EndMetricsScrapeOp(ctx, metrics.Len(), err)
+       }
        return metrics, err
 }

@@ -150,7 +152,9 @@ func (rms resourceMetricsScraper) Scrape(ctx context.Context, receiverName strin
        ctx = obsreport.ScraperContext(ctx, receiverName, rms.Name())
        ctx = obsreport.StartMetricsScrapeOp(ctx, receiverName, rms.Name())
        resourceMetrics, err := rms.ScrapeResourceMetrics(ctx)
-       obsreport.EndMetricsScrapeOp(ctx, metricCount(resourceMetrics), err)
+       if err == nil {
+               obsreport.EndMetricsScrapeOp(ctx, metricCount(resourceMetrics), err)
+       }
        return resourceMetrics, err
 }
fld-opensource added the bug label on Mar 16, 2021
bogdandrutu (Member) commented:

@iror00 please open a PR :). I would not avoid calling the obsreport func, but rather fix the calculation of the len and pass 0 when an error happened that did not produce any metrics.
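A sketch of that suggested approach, applied to resourceMetricsScraper.Scrape (the eventual fix in #2902 may be implemented differently):

	resourceMetrics, err := rms.ScrapeResourceMetrics(ctx)
	// Still report the scrape to obsreport, but with a count of 0 when the scrape failed.
	count := 0
	if err == nil {
		count = metricCount(resourceMetrics)
	}
	obsreport.EndMetricsScrapeOp(ctx, count, err)
	return resourceMetrics, err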

fld-opensource pushed a commit to fld-opensource/opentelemetry-collector that referenced this issue Mar 17, 2021
fld-opensource added commits to fld-opensource/opentelemetry-collector that referenced this issue Apr 8, 2021
codeboten (Contributor) commented:

Fixed by #2902
