emit transaction failed: error_class=NoMethodError error="undefined method `[]' for nil:NilClass" #54
Comments
@emptywee : can you provide the following?
@frankreno absolutely.
I can temporarily upgrade again and post more of them, but they are exactly the same; only the tag changes. Please let me know if you require more details, I'd be more than happy to provide them.
@emptywee I am still looking into this and will let you know what I find out ASAP. The good news is that I can reproduce it, so I can now start to debug and figure out the issue.
Just wanted to provide some updates. The stack trace is not stemming from this plugin but from one of the plugins we package with it. It points to line 297, and based on the error I suspect `es` is nil there. I still need to determine why this is happening; the issue seems to have been introduced with commit 7b10030.
Thanks for the update. I used to add something similar to my configs (@timeout label) in v1.6 and it worked fine.
Yeah it is definitely odd, but in the environment I have, the stack was never thrown before adding the
OK, so these logs only appear when the concat plugin kicks in. In my environment, I am running with almost all default values. The default multiline pattern for the concat plugin is Julian dates. I have one pod whose logging matches that pattern. Every time there is a multiline event in that pod, there is this stack trace: The above is the final product sent to Sumo. If we go look at the pod logs directly, we can see that this is the concatenation of 2 log lines. The stack trace occurs at the exact same time, and we get 2 of them because 2 messages are being concatenated. The good news is that this appears to be benign, other than generating noise in the logs. No data is being dropped, and the data makes it into Sumo concatenated together as expected. @emptywee can you confirm that you are seeing the same behavior? That is, do the stack traces occur while ingesting multiline log messages, and does the behavior above mirror what you are seeing?
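For readers unfamiliar with multiline concatenation, here is a minimal Ruby sketch of the general idea: a line that matches a "start of message" pattern begins a new message, and any other line is appended to the previous one. This is not the concat plugin's code. `START_PATTERN` and `concat_multiline` are hypothetical names, and the regex is only an assumed example of a date-prefixed line, not the plugin's actual default.

```ruby
# Hypothetical start-of-message pattern (assumption): lines such as
# "Feb 12, 2018 something happened". Not the concat plugin's real default.
START_PATTERN = /\A\w{3} \d{1,2}, \d{4}/

# Groups raw log lines into messages. A line matching start_pattern opens a
# new message; every other line is treated as a continuation of the last one.
def concat_multiline(lines, start_pattern = START_PATTERN)
  messages = []
  lines.each do |line|
    if messages.empty? || line.match?(start_pattern)
      messages << line.dup            # start a new message
    else
      messages.last << "\n" << line   # continuation of the previous message
    end
  end
  messages
end

raw = [
  "Feb 12, 2018 request failed",
  "  caused by: timeout",
  "Feb 12, 2018 request ok"
]
puts concat_multiline(raw).length
```

With this input, the first two raw lines end up as a single message and the third stays on its own, which matches the "concatenation of 2 log lines" described above.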
I'll check whether logs are being lost. It's kinda tricky, since we have a really busy cluster and I need to find a way to make sure I am seeing what you are seeing :)
Totally get it. I am going to try a few things in parallel and will let you know what happens.
I think I have a fix. It is working locally, but I want to let it run overnight to ensure all is good. The fix will not be to this repo but to the repo we depend on, so the timeline will depend on when they can merge. If all continues to work as expected, I will submit a PR to that library. We will leave this open for tracking until it is accepted and we can upgrade that library.
So it's really a fix? Not just a suppression of the error message?
I traced the issue. When these stack traces appear, the event stream is actually empty in the k8s metadata filter plugin, which is downstream of the concat plugin. I think it is a symptom of the concat plugin merging messages together. Every time I get the stack trace that you are seeing, the stream is empty (I let it run with some debug logging for 24 hours and confirmed that it was empty every time it happened). The k8s metadata filter never expects an empty stream, so when it gets one, it raises this error. So I submitted a PR that ensures that if the stream is empty, it is not processed (as there is nothing to process anyway).
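A rough Ruby sketch of the failure mode and the guard described above, for illustration only. It does not reproduce the k8s metadata filter's code or the actual PR. `annotate_unguarded` and `annotate_guarded` are hypothetical names, and a plain array of `[time, record]` pairs stands in for a fluentd event stream (an assumption).

```ruby
# Hypothetical downstream filter (assumption): it reads the first event's
# timestamp and copies it onto every record. It assumes the stream is
# never empty, which is the kind of assumption that breaks here.
def annotate_unguarded(events)
  first_time = events.first[0] # events.first is nil when events is empty
  events.map { |time, record| [time, record.merge("first_seen" => first_time)] }
end

# Same filter with an early return: an empty stream is passed through
# untouched, since there is nothing to process.
def annotate_guarded(events)
  return events if events.empty?

  annotate_unguarded(events)
end

begin
  annotate_unguarded([])
rescue NoMethodError => e
  puts "unguarded filter raised #{e.class}"
end

puts annotate_guarded([]).inspect
```

The unguarded version fails with the same error class as in the issue title because `[]` ends up being called on `nil`. The guarded version returns the empty stream as-is, so the data path is unchanged and only the spurious error goes away.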
Update: the fix was merged into the other repo. Waiting on a new gem to be published with the fix. Then I will update the version we use and push a new image. Hang tight, almost there.
@emptywee, I've pushed v1.11, which addressed the issue in my testing. Before I push the latest tag, wanna give her a go?
Absolutely. I will do it now and post results!
@frankreno everything looks solid. No issues have been observed so far. I think we can close this one. If anything comes up in the near future, I'll open a new one, or we'll re-open this one if it is related.
Thank you for fixing this 👍
Great, glad to hear and happy to help!
With the latest upgrade to v1.8 from v1.6 we began to receive the following errors:
Any clues what that might be?
Thank you.