[exporterhelper] record metric should log the number of log records before the data are sent to the consumers downstream #10402
Hey @dmitryax and @atoulme. Just to give a bit more context: our team was able to reproduce the problem only when sending a large amount of data (500 GB per day in one instance) through the pipeline. So the theory in the description fits, as this can only happen in a very rare race condition. Can you take a look and see if this change is OK?
The fix seems simple enough, but I'd be interested to try and see if a unit test can still catch this if we provoke this problem.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #10402   +/-   ##
=======================================
  Coverage   92.24%   92.24%
=======================================
  Files         403      403
  Lines       18720    18723    +3
=======================================
+ Hits        17268    17271    +3
  Misses       1097     1097
  Partials      355      355
☔ View full report in Codecov by Sentry.
This PR was marked stale due to lack of activity. It will be closed in 14 days.
We are facing the same issue with a custom receiver.
Panic happens here:
Can we investigate this? Or at least mention somewhere in the documentation that this can happen?
@grandwizard28 that seems like a different issue, because it happens outside of exporterhelper. Please open a new issue and post your collector version and pipeline configuration. You can redact your custom receiver and just post the snippet of code you reference here.
@atoulme I think we can merge this PR as it is, since the race condition would happen when the request is completed and then the next line is called. Catching this in a unit test would be tricky...
WDYT?
If we do this:
Catching this in a unit test is not particularly tricky, if we're just looking to reproduce the failure: reset the logs to nil and create a situation where the panic is reproduced.
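A minimal sketch of that idea, using hypothetical stand-in types rather than the actual exporterhelper code: `payload`, `request`, `clearingConsumer`, and `countAfterSend` are all names invented for illustration. The downstream consumer resets the data to nil, and reading the item count afterwards triggers a nil pointer panic, which the helper recovers and reports.

```go
package main

import "fmt"

// payload is a hypothetical stand-in for a batch of log records.
type payload struct {
	records []string
}

// request is a hypothetical stand-in for an exporter request that owns the payload.
type request struct {
	data *payload
}

// itemsCount dereferences r.data, so it panics when the payload has been reset to nil.
func (r *request) itemsCount() int {
	return len(r.data.records)
}

// clearingConsumer simulates a downstream component that resets the payload to nil.
func clearingConsumer(r *request) error {
	r.data = nil
	return nil
}

// countAfterSend reads the item count only after the downstream call has returned.
// It reports panicked=true if that read panics.
func countAfterSend(r *request, next func(*request) error) (n int, panicked bool) {
	defer func() {
		if rec := recover(); rec != nil {
			panicked = true
		}
	}()
	_ = next(r)
	return r.itemsCount(), false
}

func main() {
	req := &request{data: &payload{records: []string{"a", "b"}}}
	n, panicked := countAfterSend(req, clearingConsumer)
	fmt.Println(n, panicked)
}
```

In a real test the recovered panic would be turned into an assertion; the sketch only shows that the failure can be provoked deterministically, without relying on timing.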
@atoulme yeah, it is understandable to want a unit test that re-creates the failure. Our team looked into it for a while but couldn't reproduce the problem in a unit test. Even in production it is only reproducible under a specific scenario (a specific pipeline setup and throughput). We changed our internal pipeline so it doesn't happen again in production for now. If anyone else has any suggestions, please feel free to post them.
I'm talking about the changes in this PR. In the unit test, mimicking an actual Any other way to mimic it would involve emptying the That's why I said, IMO we can merge it without a UT. Internally, we have also patched this in our custom receiver.
That looks good. Can you please add the same test and fix for metrics and traces? Thanks!
LGTM
Hi @dmitryax, mind taking a look?
LGTM
Description
The sender metric within the exporterhelper should measure the number of items coming into the sender, not whatever remains of those items after they have been passed downstream, because downstream components may mutate the data. An example of this is provided as a unit test within this PR.
This PR also addresses nil panics that some users are experiencing.
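The ordering described above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `logs`, `mutatingConsumer`, `sendCountAfter`, and `sendCountBefore` are hypothetical names that stand in for the real exporterhelper types.

```go
package main

import "fmt"

// logs is a hypothetical stand-in for a mutable telemetry payload.
type logs struct {
	records []string
}

// count returns the number of records currently held.
func (l *logs) count() int {
	return len(l.records)
}

// mutatingConsumer simulates a downstream component that clears the data it receives.
func mutatingConsumer(l *logs) error {
	l.records = nil
	return nil
}

// sendCountAfter reads the count after the downstream call, so it observes the mutation.
func sendCountAfter(l *logs, next func(*logs) error) (int, error) {
	err := next(l)
	return l.count(), err
}

// sendCountBefore captures the count before handing the data downstream.
func sendCountBefore(l *logs, next func(*logs) error) (int, error) {
	n := l.count()
	err := next(l)
	return n, err
}

func main() {
	after, _ := sendCountAfter(&logs{records: []string{"a", "b", "c"}}, mutatingConsumer)
	before, _ := sendCountBefore(&logs{records: []string{"a", "b", "c"}}, mutatingConsumer)
	fmt.Println(after, before) // prints "0 3"
}
```

Capturing the count first keeps the recorded metric independent of whatever the downstream consumer does with the payload.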
Link to tracking issue
Fixes #10033
Testing
Existing test cases should cover this code change.
Unit test added
Documentation