System info:
Telegraf 19.3 vs Telegraf 17.2 or 17.3
Running on relatively vanilla Ubuntu 20.04
Same machine for both configurations, built with same version of go.
Steps to reproduce:
Build Telegraf 19.3 and run it with the same elastic and influx outputs. The output queue lengths are about double the size, and our drop rate increases by about 35%.
The same behaviour has been seen on 19.2, 19.1 and 18.3, but not on 17.3.
Expected behavior:
Equivalent throughput.
Actual behavior:
Substantially lower throughput and a high rate of metrics dropped
Additional info:
Here is an example image showing 17.2 vs 19.3. The write buffer size is substantially higher:
As a result our drop rate is much higher. Here is a comparison of the drop rate from the older node vs the newer node:
I have slowly been working my way back through telegraf versions. The behaviour is not visible on telegraf 17.3 or 17.2 but does happen at 18.3. I have yet to work my way back through 18.x versions older than 18.3. Here is 18.3:
For now this means that the highest we can upgrade to is 17.3, which is really disappointing.
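For anyone comparing two nodes the same way, here is a rough sketch of how a drop rate like the one above could be computed from two snapshots of an output's cumulative counters. The `WriteSample` type, its field names and the numbers in `main` are made up for illustration; they are not Telegraf's API and not our real measurements.

```go
package main

import "fmt"

// WriteSample is a hypothetical snapshot of one output's cumulative counters.
type WriteSample struct {
	Added   uint64 // metrics handed to the output so far
	Dropped uint64 // metrics discarded so far because the buffer was full
}

// DropRate returns the fraction of metrics dropped between two snapshots.
// It returns 0 when nothing was added in the interval or when the counters
// went backwards (for example after an agent restart).
func DropRate(prev, curr WriteSample) float64 {
	if curr.Added <= prev.Added || curr.Dropped < prev.Dropped {
		return 0
	}
	added := curr.Added - prev.Added
	dropped := curr.Dropped - prev.Dropped
	return float64(dropped) / float64(added)
}

func main() {
	// Illustrative values only.
	prev := WriteSample{Added: 1000, Dropped: 100}
	curr := WriteSample{Added: 2000, Dropped: 350}
	fmt.Printf("drop rate over interval: %.1f%%\n", DropRate(prev, curr)*100)
}
```

Taking the same interval on the old and the new node gives two rates that can be compared directly, which avoids being misled by how long each agent has been running.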
Hi! Thanks for taking the time to report this and with lots of details. We believe we just fixed this with #9800. If you want, you might try one of our nightly builds or wait till v1.20.1, which should land next week.
I am going to go ahead and close this as this matches what that issue found and fixed, but if you are able to try with v1.20.1 after next week and still run into issues we would love to know.
I will certainly try this with 1.20.1! Thank you for the quick response.
It was very disheartening to realize we were blocked from upgrading and can't get all the latest telegraf goodness!
Relevant telegraf.conf:
Some things have been removed for clarity.