telegraf reload is leaking connection for each reload. #5891

Closed
ashwinbasavaraja opened this issue May 22, 2019 · 0 comments · Fixed by #5912
@ashwinbasavaraja

Relevant telegraf.conf:

Default telegraf.conf

[[outputs.influxdb]]
  urls = ["http://<INFLUX_DB_IP>:8086"]


System info:

uname -a
Linux tick 4.4.0-148-generic #174-Ubuntu SMP Tue May 7 12:20:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

telegraf -version
Telegraf 1.10.2 (git: HEAD 3303f5c)

lsb_release -a
LSB Version: core-9.20160110ubuntu0.2-amd64:core-9.20160110ubuntu0.2-noarch:security-9.20160110ubuntu0.2-amd64:security-9.20160110ubuntu0.2-noarch
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial

Steps to reproduce:

  1. Note down the telegraf pid
    ps -ef | grep telegraf
    telegraf 17998 1 0 09:47 ? 00:00:02 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
    root 18562 18525 0 09:48 ? 00:00:00 telegraf

  2. lsof -p 17998

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telegraf 17998 telegraf cwd DIR 8,1 4096 2 /
telegraf 17998 telegraf rtd DIR 8,1 4096 2 /
telegraf 17998 telegraf txt REG 8,1 62433344 1452 /usr/bin/telegraf
telegraf 17998 telegraf DEL REG 8,1 2002 /lib/x86_64-linux-gnu/libc-2.23.so
telegraf 17998 telegraf DEL REG 8,1 2001 /lib/x86_64-linux-gnu/libpthread-2.23.so
telegraf 17998 telegraf DEL REG 8,1 2000 /lib/x86_64-linux-gnu/ld-2.23.so
telegraf 17998 telegraf 0r CHR 1,3 0t0 6 /dev/null
telegraf 17998 telegraf 1u unix 0xffff8800da094c00 0t0 77800 type=STREAM
telegraf 17998 telegraf 2u unix 0xffff8800da094c00 0t0 77800 type=STREAM
telegraf 17998 telegraf 3u IPv4 88800 0t0 TCP localhost:60844->localhost:8086 (ESTABLISHED)
telegraf 17998 telegraf 4u a_inode 0,11 0 7002 [eventpoll]
telegraf 17998 telegraf 5u IPv4 90073 0t0 TCP 10.1.1.1:45646->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 6u IPv4 91132 0t0 TCP 10.1.1.1:45678->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 7u IPv4 91509 0t0 TCP 10.1.1.1:45684->10.1.1.3:8086 (ESTABLISHED)

  3. systemctl reload telegraf, then run lsof -p 17998 again (one new connection, fd 8u, appears)

telegraf 17998 telegraf 5u IPv4 90073 0t0 TCP 10.1.1.1:45646->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 6u IPv4 91132 0t0 TCP 10.1.1.1:45678->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 7u IPv4 91509 0t0 TCP 10.1.1.1:45684->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 8u IPv4 102202 0t0 TCP 10.1.1.1:45924->10.1.1.3:8086 (ESTABLISHED)

  4. If I reload 10 times, I see 10 additional connections established with InfluxDB (one for every reload)

telegraf 17998 telegraf 5u IPv4 90073 0t0 TCP 10.1.1.1:45646->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 6u IPv4 91132 0t0 TCP 10.1.1.1:45678->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 7u IPv4 91509 0t0 TCP 10.1.1.1:45684->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 8u IPv4 102202 0t0 TCP 10.1.1.1:45924->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 9u IPv4 102732 0t0 TCP 10.1.1.1:45938->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 10u IPv4 102746 0t0 TCP 10.1.1.1:45940->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 11u IPv4 102836 0t0 TCP 10.1.1.1:45942->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 12u IPv4 103653 0t0 TCP 10.1.1.1:45944->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 13u IPv4 102939 0t0 TCP 10.1.1.1:45946->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 14u IPv4 102966 0t0 TCP 10.1.1.1:45950->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 15u IPv4 103900 0t0 TCP 10.1.1.1:45952->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 16u IPv4 103077 0t0 TCP 10.1.1.1:45954->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 17u IPv4 103157 0t0 TCP 10.1.1.1:45956->10.1.1.3:8086 (ESTABLISHED)
telegraf 17998 telegraf 18u IPv4 103178 0t0 TCP 10.1.1.1:45958->10.1.1.3:8086 (ESTABLISHED)

Each reload leaves one more socket in the ESTABLISHED state, leaking the connection and its resources.
Because the number of agents is high, this makes InfluxDB unstable, and it has to be restarted frequently to work around the issue.
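
To compare the counts in steps 3 and 4 without reading the full lsof listing, a small helper can count the established connections to one target. This is only a sketch: count_established is a hypothetical name, not something shipped with Telegraf, and it assumes the NAME column has the same "->host:port (ESTABLISHED)" shape as the output above.

```shell
# count_established TARGET
# Reads lsof output on stdin and prints how many TCP connections
# to TARGET (for example "10.1.1.3:8086") are in the ESTABLISHED state.
# Hypothetical helper for illustration; not part of Telegraf.
count_established() {
  # -F matches the target literally, -c prints the number of matching lines.
  # grep exits non-zero when the count is 0, so mask that for callers using set -e.
  grep -c -F -- "->$1 (ESTABLISHED)" || true
}
```

With the pid from step 1 it could be used as: lsof -n -P -p 17998 | count_established 10.1.1.3:8086. The -n and -P flags keep lsof from resolving host and port names, so the target string matches literally.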

Expected behavior:

Reload should behave like a restart: the old connection is closed, and only one connection to InfluxDB remains established.

Actual behavior:

Every reload of the Telegraf agent establishes a new connection and leaves the old one behind. If Telegraf is reloaded 10 times,
it has 11 established connections to InfluxDB.
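
The difference between expected and actual behavior can be expressed as a single number: how many connections a series of reloads adds. Below is a rough sketch of such a check. connections_added and its two hook arguments are hypothetical names; the caller supplies a function that triggers a reload and a function that prints the current connection count.

```shell
# connections_added RELOADS RELOAD_CMD COUNT_CMD
# Runs RELOAD_CMD RELOADS times and prints how many connections were added,
# measured by calling COUNT_CMD before and after.
# RELOAD_CMD and COUNT_CMD are caller-supplied hooks (assumptions for this
# sketch, not part of Telegraf).
connections_added() {
  reloads="$1"
  reload_cmd="$2"
  count_cmd="$3"

  before=$("$count_cmd")   # connection count before any reload
  i=0
  while [ "$i" -lt "$reloads" ]; do
    "$reload_cmd"
    i=$((i + 1))
  done
  after=$("$count_cmd")    # connection count after the last reload

  echo $((after - before))
}
```

With the expected behavior the result is 0; with the behavior reported here it equals the number of reloads. A real reload hook would wrap systemctl reload telegraf and probably wait a few seconds afterwards so the new connection has time to appear; the exact delay is a guess and depends on the flush interval.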

@danielnelson danielnelson added the bug unexpected problem or unintended behavior label May 22, 2019
@danielnelson danielnelson added this to the 1.11.0 milestone May 22, 2019