Segmentation fault on Ubuntu 22.04 #110

kthhrv · 2024-01-15T16:33:46Z

Works on 20.04 but on 22.04 fluentd silently fails.

I'm using latest fluentd

# apt show fluent-package
Package: fluent-package
Version: 5.0.2-1

running fluentd manually with

# FLUENT_SOCKET=/var/run/fluent/fluentd.sock FLUENT_PLUGIN=/etc/fluent/plugin LD_PRELOAD=/opt/fluent/lib/libjemalloc.so GEM_PATH=/opt/fluent/lib/ruby/gems/3.2.0/ GEM_HOME=/opt/fluent/lib/ruby/gems/3.2.0/ FLUENT_CONF=/etc/fluent/fluentd.conf /opt/fluent/bin/fluentd

I get

/opt/fluent/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325: [BUG] Segmentation fault at 0x0000000000000008
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0013 p:---- s:0064 e:000063 CFUNC  :free
c:0012 p:0012 s:0059 e:000058 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325
c:0011 p:0042 s:0053 e:000052 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:13
c:0010 p:0018 s:0047 e:000044 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:151
c:0009 p:0013 s:0040 e:000039 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:144
c:0008 p:0032 s:0034 e:000033 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:121 [FINISH]
c:0007 p:---- s:0030 e:000029 IFUNC 
c:0006 p:0012 s:0027 e:000026 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/timer.rb:80 [FINISH]
c:0005 p:---- s:0022 e:000021 CFUNC  :run_once
c:0004 p:0034 s:0017 e:000016 METHOD /opt/fluent/lib/ruby/gems/3.2.0/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88
c:0003 p:0026 s:0012 e:000011 BLOCK  /opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/event_loop.rb:93
c:0002 p:0050 s:0008 e:000007 BLOCK  /opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78 [FINISH]
c:0001 p:---- s:0003 e:000002 DUMMY  [FINISH]

-- Ruby level backtrace information ----------------------------------------
/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
/opt/fluent/lib/ruby/gems/3.2.0/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88:in `run'
/opt/fluent/lib/ruby/gems/3.2.0/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88:in `run_once'
/opt/fluent/lib/ruby/gems/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
/opt/fluent/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:121:in `run'
/opt/fluent/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:144:in `watch'
/opt/fluent/lib/ruby/gems/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:151:in `yield_current_entry'
/opt/fluent/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:13:in `cursor'
/opt/fluent/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325:in `read_and_free_outstr'
/opt/fluent/lib/ruby/gems/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325:in `free'

-- Machine register context ------------------------------------------------
 RIP: 0x00007f88714a2e50 RBP: 0x00007f8866c267f0 RSP: 0x00007f8864cf8b90
 RAX: 0x0000000000000000 RBX: 0x00007f886a574b40 RCX: 0x000007f886a574b2
 RDX: 0x0000000000000001 RDI: 0x00007f887161ac80 RSI: 0x00007f885c0008e0
  R8: 0x00000000ffffffff  R9: 0x00007f885c0008e0 R10: 0x0000000000004022
 R11: 0x00007f885c000090 R12: 0x00007f887161ac80 R13: 0x0000000000000000
 R14: 0x0000ff10d119b330 R15: 0x00007f886b32a110 EFL: 0x0000000000010246

-- C level backtrace information -------------------------------------------
/opt/fluent/lib/libruby.so.3.2(0x7f8871ac1539) [0x7f8871ac1539]
/opt/fluent/lib/libruby.so.3.2(0x7f8871ac5869) [0x7f8871ac5869]
/opt/fluent/lib/libruby.so.3.2(0x7f887191e384) [0x7f887191e384]
/opt/fluent/lib/libruby.so.3.2(0x7f8871a3e65d) [0x7f8871a3e65d]
/lib/x86_64-linux-gnu/libc.so.6(0x7f8871442520) [0x7f8871442520]
/lib/x86_64-linux-gnu/libc.so.6(0x7f88714a2e50) [0x7f88714a2e50]
/lib/x86_64-linux-gnu/libc.so.6(free+0x73) [0x7f88714a5453]
/lib/x86_64-linux-gnu/libffi.so.8(0x7f886b328e2e) [0x7f886b328e2e]
/lib/x86_64-linux-gnu/libffi.so.8(0x7f886b325493) [0x7f886b325493]
/opt/fluent/lib/ruby/gems/3.2.0/gems/ffi-1.15.5/lib/ffi_c.so(rbffi_CallFunction+0x1af) [0x7f886976229f]
/opt/fluent/lib/ruby/gems/3.2.0/gems/ffi-1.15.5/lib/ffi_c.so(0x7f8869758937) [0x7f8869758937]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa1fdb) [0x7f8871aa1fdb]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa5991) [0x7f8871aa5991]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa72ca) [0x7f8871aa72ca]
/opt/fluent/lib/libruby.so.3.2(rb_vm_exec+0xef) [0x7f8871abcf1f]
/opt/fluent/lib/libruby.so.3.2(rb_vm_call0+0xcd) [0x7f8871ab250d]
/opt/fluent/lib/libruby.so.3.2(rb_method_call_with_block_kw+0x84) [0x7f88719eb254]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa27f4) [0x7f8871aa27f4]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa289f) [0x7f8871aa289f]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa5991) [0x7f8871aa5991]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa72ca) [0x7f8871aa72ca]
/opt/fluent/lib/libruby.so.3.2(rb_vm_exec+0xef) [0x7f8871abcf1f]
/opt/fluent/lib/libruby.so.3.2(rb_funcallv+0x19c) [0x7f8871ab3c8c]
/opt/fluent/lib/ruby/gems/3.2.0/gems/cool.io-1.8.0/lib/cool.io_ext.so(0x7f886c17362d) [0x7f886c17362d]
/opt/fluent/lib/ruby/gems/3.2.0/gems/cool.io-1.8.0/lib/cool.io_ext.so(0x7f886c1787c2) [0x7f886c1787c2]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa1fdb) [0x7f8871aa1fdb]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa5991) [0x7f8871aa5991]
/opt/fluent/lib/libruby.so.3.2(0x7f8871aa72ca) [0x7f8871aa72ca]
/opt/fluent/lib/libruby.so.3.2(rb_vm_exec+0xef) [0x7f8871abcf1f]
/opt/fluent/lib/libruby.so.3.2(rb_vm_invoke_proc+0x5f) [0x7f8871abf97f]
/opt/fluent/lib/libruby.so.3.2(0x7f8871a7450a) [0x7f8871a7450a]
/opt/fluent/lib/libruby.so.3.2(0x7f8871a74de6) [0x7f8871a74de6]
/opt/fluent/lib/libruby.so.3.2(0x7f8871a75a32) [0x7f8871a75a32]
/lib/x86_64-linux-gnu/libc.so.6(0x7f8871494ac3) [0x7f8871494ac3]
/lib/x86_64-linux-gnu/libc.so.6(0x7f8871526850) [0x7f8871526850]

config

# cat /etc/fluent/conf.d/etl.conf 
<source>
    @type systemd
    tag raw.etl
    path /var/log/journal
    matches [{ "_SYSTEMD_UNIT": "etl.service" }, { "_SYSTEMD_UNIT": "heartbeat.service" }]
    read_from_head true
    
    <storage>
       @type local
       path /var/log/fluentd-journald-kubelet-cursor.json
    </storage>

    <entry>
       fields_strip_underscores true
       fields_lowercase true
    </entry>
</source>

<match etl>
    @type cloudwatch_logs
    log_group_name "/timeout/etl/#{ENV['TREEHUGGER_STAGE'] || 'prod'}/app"
    log_stream_name "#{ENV['INSTANCE_ID']}"
    auto_create_stream true
    flush_interval 1s
</match>

<system>
  root_dir /var/log/fluent
</system>
root@ip-10-200-4-48:~#

The text was updated successfully, but these errors were encountered:

redliu312 · 2024-01-28T08:25:08Z

met the same issue on a k8s cluster and trying to use
fluentd image: fluent/fluentd-kubernetes-daemonset:v1.16.3-debian-elasticsearch8-2.1
with systemd plugin

the error :

fluentd-sbpzp fluentd 2024-01-28 08:20:57 +0000 [trace]: #0 [es_kubelet] writing events into buffer instance=2360 metadata_size=1
fluentd-sbpzp fluentd /fluentd/vendor/bundle/ruby/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325: [BUG] Segmentation fault at 0x0000000000000008
fluentd-sbpzp fluentd ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
fluentd-sbpzp fluentd
fluentd-sbpzp fluentd -- Control frame information -----------------------------------------------
fluentd-sbpzp fluentd c:0013 p:---- s:0064 e:000063 CFUNC  :free
fluentd-sbpzp fluentd c:0012 p:0012 s:0059 e:000058 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal.rb:325
fluentd-sbpzp fluentd c:0011 p:0042 s:0053 e:000052 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/systemd-journal-1.4.2/lib/systemd/journal/navigable.rb:13
fluentd-sbpzp fluentd c:0010 p:0018 s:0047 e:000044 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:151
fluentd-sbpzp fluentd c:0009 p:0013 s:0040 e:000039 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:144
fluentd-sbpzp fluentd c:0008 p:0032 s:0034 e:000033 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-systemd-1.0.5/lib/fluent/plugin/in_systemd.rb:121 [FINISH]
fluentd-sbpzp fluentd c:0007 p:---- s:0030 e:000029 IFUNC
fluentd-sbpzp fluentd c:0006 p:0012 s:0027 e:000026 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/timer.rb:80 [FINISH]
fluentd-sbpzp fluentd c:0005 p:---- s:0022 e:000021 CFUNC  :run_once
fluentd-sbpzp fluentd c:0004 p:0034 s:0017 e:000016 METHOD /fluentd/vendor/bundle/ruby/3.2.0/gems/cool.io-1.8.0/lib/cool.io/loop.rb:88
fluentd-sbpzp fluentd c:0003 p:0026 s:0012 e:000011 BLOCK  /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/event_loop.rb:93
fluentd-sbpzp fluentd c:0002 p:0050 s:0008 e:000007 BLOCK  /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/plugin_helper/thread.rb:78 [FINISH]
fluentd-sbpzp fluentd c:0001 p:---- s:0003 e:000002 DUMMY  [FINISH]
fluentd-sbpzp fluentd
fluentd-sbpzp fluentd -- Ruby level backtrace information ----------------------------------------

dkulchinsky · 2024-02-05T16:24:17Z

looks like related to ledbettj/systemd-journal#93

and also reported here: fluent/fluent-package-builder#369

fpiriz-inditex · 2024-03-01T11:19:52Z

Workaround suggested in fluent/fluent-package-builder#369 does remove the crashes. However, in my case (same as redliu312 in the previous message) systemd logs are not read with the workaround. Leaving the env variable untouched, fluentd crashes and upon restart, all the systemd logs are flushed and indexed.

daipom · 2024-03-25T03:03:34Z

Currently, we need to avoid this problem by unsetting LD_PRELOAD in some environments.

It is probably not a problem with this plugin, but it seems to be a problem with libjemalloc and systemd-journal.

It's a problem to have a segmentation fault in the default state of some Fluentd distributions, so we need to consider possible solutions.

There is a known bug that combination with jemalloc and fluent-plugin-systemd causes free(): invalid crash for a long time. The problematic code is identified but the root cause is not fixed yet. There is a workaround for this - disable jemalloc explicitly. LD_PRELOAD= stop to use jemalloc. If you want to use jemalloc, set it via env like this: containers: - name: fluentd image: fluent/fluentd-kubernetes-daemonset:v1-debian-forward env: - name: K8S_NODE_NAME valueFrom: fieldRef: fieldPath: spec.nodeName - name: FLUENT_FORWARD_HOST value: "REMOTE_ENDPOINT" - name: FLUENT_FORWARD_PORT value: "18080" - name: LD_PRELOAD value: "/usr/lib/libjemalloc.so.2" Related issues: fluent/fluentd-docker-image#378 fluent/fluent-package-builder#369 fluent-plugins-nursery/fluent-plugin-systemd#110 ledbettj/systemd-journal#93 fluent#1478 Signed-off-by: Kentaro Hayashi <[email protected]>

There is a known bug that combination with jemalloc and fluent-plugin-systemd causes free(): invalid crash for a long time. The problematic code is identified but the root cause is not fixed yet. There is a workaround for this - disable jemalloc explicitly. LD_PRELOAD= stop to use jemalloc. If you want to use jemalloc, set it via env like this: containers: - name: fluentd image: fluent/fluentd-kubernetes-daemonset:v1-debian-forward env: - name: K8S_NODE_NAME valueFrom: fieldRef: fieldPath: spec.nodeName - name: FLUENT_FORWARD_HOST value: "REMOTE_ENDPOINT" - name: FLUENT_FORWARD_PORT value: "18080" - name: LD_PRELOAD value: "/usr/lib/libjemalloc.so.2" Related issues: fluent/fluentd-docker-image#378 fluent/fluent-package-builder#369 fluent-plugins-nursery/fluent-plugin-systemd#110 ledbettj/systemd-journal#93 #1478 Signed-off-by: Kentaro Hayashi <[email protected]>

ledbettj · 2024-08-30T15:01:16Z

Hi, this issue is discussed over at ledbettj/systemd-journal#96

The problem is that the native library, libsystemd allocates memory using the system allocator, and then expects the calling code to free it. In most libraries, when they allocate memory they have a corresponding _free() method which can be called to release it, but libsystemd does not in this case.

The caller must call the corresponding free() implementation that was used for malloc() -- libc, tcmalloc, jemalloc, etc -- otherwise the process will likely crash.

In Ruby/FFI land, I don't know any good way to determine which allocator was used by the native code. Trying to load the jemalloc library when the libC implementation is already in use will cause issues :(

One option might be to provide a separate native library shim that just exposes a free wrapper (which would hopefully pick up on the LD_PRELOAD) but I don't know if that will be foolproof either.

ledbettj · 2024-09-11T01:47:32Z

Hi, this issue is discussed over at ledbettj/systemd-journal#96

The problem is that the native library, libsystemd allocates memory using the system allocator, and then expects the calling code to free it. In most libraries, when they allocate memory they have a corresponding _free() method which can be called to release it, but libsystemd does not in this case.

The caller must call the corresponding free() implementation that was used for malloc() -- libc, tcmalloc, jemalloc, etc -- otherwise the process will likely crash.

In Ruby/FFI land, I don't know any good way to determine which allocator was used by the native code. Trying to load the jemalloc library when the libC implementation is already in use will cause issues :(

One option might be to provide a separate native library shim that just exposes a free wrapper (which would hopefully pick up on the LD_PRELOAD) but I don't know if that will be foolproof either.

Hi friends,

I've published a version 2.0.0 of systemd-journal gem which includes a native shim as described above to attempt to work around this allocator mismatch.

Please give it a try and let me know if you run into any issues.

kenhys · 2024-09-11T07:00:57Z

It seems that this issue was fixed in systemd-journal 2.0.0, so we shipped fluent-plugin-systemd 1.1.0 which adopts systemd-journal 2.0.0 or later.

Please use fluent-plugin-systemd 1.1.0.

kenhys mentioned this issue Jul 17, 2024

Change jemalloc opt-in for now fluent/fluentd-kubernetes-daemonset#1512

Merged

kenhys mentioned this issue Jul 18, 2024

RESOLVED: free(): invalid pointer (segmentation fault with systemd plugin and jemalloc) fluent/fluentd-kubernetes-daemonset#1517

Open

kenhys mentioned this issue Aug 8, 2024

RESOLVED: Workaround: disable jemalloc when image crashed (segmentation fault) with systemd plugin fluent/fluentd-docker-image#385

Open

ashie mentioned this issue Aug 30, 2024

Fix a crash bug when memory allocator implementation was changed ledbettj/systemd-journal#96

Closed

kenhys closed this as completed Sep 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Segmentation fault on Ubuntu 22.04 #110

Segmentation fault on Ubuntu 22.04 #110

kthhrv commented Jan 15, 2024 •

edited

Loading

redliu312 commented Jan 28, 2024

dkulchinsky commented Feb 5, 2024

fpiriz-inditex commented Mar 1, 2024 •

edited

Loading

daipom commented Mar 25, 2024

ledbettj commented Aug 30, 2024

ledbettj commented Sep 11, 2024

kenhys commented Sep 11, 2024

Segmentation fault on Ubuntu 22.04 #110

Segmentation fault on Ubuntu 22.04 #110

Comments

kthhrv commented Jan 15, 2024 • edited Loading

redliu312 commented Jan 28, 2024

dkulchinsky commented Feb 5, 2024

fpiriz-inditex commented Mar 1, 2024 • edited Loading

daipom commented Mar 25, 2024

ledbettj commented Aug 30, 2024

ledbettj commented Sep 11, 2024

kenhys commented Sep 11, 2024

kthhrv commented Jan 15, 2024 •

edited

Loading

fpiriz-inditex commented Mar 1, 2024 •

edited

Loading