BUG: Neo-Tree reproducibly segfaults on macOS with follow_current_file enabled #1126

Closed
3 tasks done
klmr opened this issue Aug 30, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@klmr

klmr commented Aug 30, 2023

Did you check docs and existing issues?

  • I have read all the docs.
  • I have searched the existing issues.
  • I have searched the existing discussions.

Neovim Version (nvim -v)

NVIM v0.8.3–v0.9.1

Operating System / Version

macOS (multiple versions, incl. 12 & 13)

Describe the Bug

On macOS (but not on Linux!) I can reproducibly segfault NeoVim by switching between file buffers while Neo-Tree is open and follow_current_file is enabled. It takes a bit of time, but after several buffer switches, NeoVim closes without a message. Console.app shows that the crash is always caused by an invalid pointer access (KERN_INVALID_ADDRESS). The invalid address varies, but occasionally it is wildly invalid, e.g. 0x0000000000000040. My guess therefore is that a buffer overflow is overwriting pointer memory, rather than an off-by-one error.

The error seems to happen inside readdir, called from inside libuv. Here’s a typical stack trace of the crashed thread:

0   libsystem_pthread.dylib       	       0x1a13b2558 pthread_mutex_lock + 12
1   libsystem_c.dylib             	       0x1a129ebc8 readdir + 32
2   libuv.1.dylib                 	       0x1034ccc6c uv__fs_work + 2344
3   libuv.1.dylib                 	       0x1034c73e4 worker + 388
4   libsystem_pthread.dylib       	       0x1a13b7fa8 _pthread_start + 148
5   libsystem_pthread.dylib       	       0x1a13b2da0 thread_start + 8

The function on the top of the stack isn’t always the same — sometimes it’s _readdir_unlocked instead of pthread_mutex_lock. Occasionally, the actual crash instead happens inside the calling thread in pthread_kill, or inside luv_push_dirent, with the following stack trace:

0   libluv.1.43.0.dylib           	       0x102d97634 luv_push_dirent + 48
1   libluv.1.43.0.dylib           	       0x102d97488 push_fs_result + 780
2   libluv.1.43.0.dylib           	       0x102d970d8 luv_fs_cb + 44
3   libuv.1.dylib                 	       0x102feaff0 uv__work_done + 192
4   libuv.1.dylib                 	       0x102fee3c4 uv__async_io + 320
5   libuv.1.dylib                 	       0x102ffe1e0 uv__io_poll + 1748
6   libuv.1.dylib                 	       0x102fee7bc uv_run + 244
7   nvim                          	       0x102a4b298 loop_uv_run + 136
8   nvim                          	       0x102b202fc os_breakcheck + 64
9   nvim                          	       0x102b8a924 state_handle_k_event + 152
10  nvim                          	       0x102afc2e0 nv_event + 60
11  nvim                          	       0x102af506c normal_execute + 4616
12  nvim                          	       0x102b8a864 state_enter + 356
13  nvim                          	       0x1029865e0 main + 10228
14  dyld                          	       0x1a105ff28 start + 2236

I’ve attached an example macOS crash report, and I am happy to supply others on request.

Screenshots, Traceback

⬇️ nvim-2023-08-30-164412.ips.log

Steps to Reproduce

  1. Create a new directory with at least two files in it:
    mkdir x && cd x
    echo foo>foo; echo bar>bar
  2. Launch nvim with the minimal configuration below, passing the files as arguments, and open Neo-Tree:
    nvim -u repro.lua * +:Neotree
  3. Start switching between file buffers (to make this easier I rebound Return to switch to the next buffer, but manually using e.g. :bn/:bp etc. works as well). The behaviour is nondeterministic, so it might require several dozen buffer switches before nvim crashes. However, I have never needed more than ~50, and usually only around 10.

(The steps above aim to make the example self-contained; obviously you don’t need to create a new directory and files, it works equally well in any existing, non-empty directory.)

Instead of the self-contained repro.lua, the following minimal.lua also reproduces the issue:

vim.opt.runtimepath:append('.repro/plugins/neo-tree.nvim')
vim.opt.runtimepath:append('.repro/plugins/nui.nvim')
vim.opt.runtimepath:append('.repro/plugins/plenary.nvim')

require("neo-tree").setup({
  filesystem = {
    follow_current_file = { enabled = true },
  },
})

vim.keymap.set('n', '<cr>', '<cmd>bn<cr>')
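
For step 3, pressing Return by hand can be replaced with a timer. The following is only a sketch and not part of the original report: the 200 ms interval, the 100-switch limit, and the names `max_switches`, `switches`, and `timer` are arbitrary assumptions, and it has not been verified to trigger the crash. It could be appended to either config file:

```lua
-- Hypothetical helper: cycle through buffers automatically.
-- Assumes the cursor is in a file window, not in the Neo-tree window.
local max_switches = 100 -- arbitrary upper bound
local switches = 0
local timer = vim.loop.new_timer()

-- First switch after 500 ms, then one switch every 200 ms.
-- vim.schedule_wrap defers the callback to the main loop, where
-- running Ex commands is allowed.
timer:start(500, 200, vim.schedule_wrap(function()
  switches = switches + 1
  vim.cmd('bnext')
  if switches >= max_switches then
    timer:stop()
    timer:close()
  end
end))
```

The timer stops itself after `max_switches` iterations, so a session that does not crash does not keep cycling forever.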

Expected Behavior

No segfault occurs.

Your Configuration

-- DO NOT change the paths and don't remove the colorscheme
local root = vim.fn.fnamemodify("./.repro", ":p")

-- set stdpaths to use .repro
for _, name in ipairs({ "config", "data", "state", "cache" }) do
  vim.env[("XDG_%s_HOME"):format(name:upper())] = root .. "/" .. name
end

-- bootstrap lazy
local lazypath = root .. "/plugins/lazy.nvim"
if not vim.loop.fs_stat(lazypath) then
  vim.fn.system({ "git", "clone", "--filter=blob:none", "https://github.com/folke/lazy.nvim.git", lazypath, })
end
vim.opt.runtimepath:prepend(lazypath)

-- install plugins
local plugins = {
  "folke/tokyonight.nvim",
  -- add any other plugins here
}

local neotree_config = {
  "nvim-neo-tree/neo-tree.nvim",
  dependencies = { "MunifTanjim/nui.nvim", "nvim-tree/nvim-web-devicons", "nvim-lua/plenary.nvim" },
  cmd = { "Neotree" },
  keys = {
    { "<Leader>e", "<Cmd>Neotree<CR>" }, -- change or remove this line if relevant.
  },
  opts = {
    filesystem = {
      follow_current_file = { enabled = true },
    },
  },
}

table.insert(plugins, neotree_config)
require("lazy").setup(plugins, {
  root = root .. "/plugins",
})

vim.cmd.colorscheme("tokyonight")
-- add anything else here

vim.keymap.set('n', '<cr>', '<cmd>bn<cr>')

klmr added the bug label on Aug 30, 2023.

@cseickel
Contributor

Thanks @klmr for an excellent and very complete bug report. Unfortunately I can't do much with this because I don't have access to a Mac.

Can you tell me if the directory you are in has a particularly large number of files/folders, or if it is within a very large git repo? Is there anything unusual about the hardware (very old or very new)?

I think that in the case of a segfault, the fault is ultimately in Neovim itself. Neo-tree may be doing something to surface that problem, but I don't think the lua code should be able to cause a segfault. Have you checked the issues in the neovim repo?

@klmr
Author

klmr commented Aug 30, 2023

I’ve been able to test and reproduce this on two different Macs (both with an ARM chip, the M2; apologies, I should have mentioned this!). There’s nothing special about the folder structure. Any folder will do, including something directly in the home directory; no deep nesting, and no large subdirectory structure.

> I don't think the lua code should be able to cause a segfault

Yeah, I actually agree with this. Unfortunately I haven’t been able to find any issue that looks related.

… I’m actually puzzled by this lack of bug reports, since the behaviour is fairly disruptive and has been happening for months (that’s how long it took me to be able to narrow the issue down and make it reproducible). I’m sure other people must have stumbled across it; the only reason it took me so long was that I am mostly using Linux.

Should I cross-post the issue to the NeoVim repo?

@cseickel
Contributor

> Should I cross-post the issue to the NeoVim repo?

I think so, after checking existing issues of course.

I can see how neo-tree might be the only plugin to surface a problem with readdir, because we spawn multiple asynchronous reads. Probably only another tree plugin would behave in this way. It would be interesting to see whether Nvim-tree causes segfaults as well.
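
To make "multiple asynchronous reads" concrete, here is a rough sketch using the libuv bindings that Neovim exposes as `vim.loop`. It is not Neo-tree's actual code: the helper name `read_dir_async`, the batch size of 32, and the count of 8 parallel reads are assumptions chosen for illustration, and it has not been confirmed to reproduce the crash.

```lua
local uv = vim.loop

-- Hypothetical helper (not Neo-tree code): list one directory asynchronously.
local function read_dir_async(path, on_done)
  uv.fs_opendir(path, function(open_err, dir)
    if open_err then
      return on_done(open_err, {})
    end
    local names = {}
    local function read_more()
      uv.fs_readdir(dir, function(read_err, entries)
        if read_err or not entries then
          -- Error or end of directory: close the handle, then report.
          uv.fs_closedir(dir, function()
            on_done(read_err, names)
          end)
          return
        end
        for _, entry in ipairs(entries) do
          names[#names + 1] = entry.name
        end
        read_more()
      end)
    end
    read_more()
  end, 32) -- read up to 32 entries per fs_readdir call (arbitrary)
end

-- Start several reads of the current directory at once.
local cwd = vim.fn.getcwd()
for i = 1, 8 do
  read_dir_async(cwd, function(err, names)
    -- Defer printing to the main loop.
    vim.schedule(function()
      print(("read %d: %s"):format(i, err or (#names .. " entries")))
    end)
  end)
end
```

Each read runs on libuv's thread pool, which matches the worker-thread frames in the first stack trace. If a loop like this crashed without Neo-tree loaded, that would point below the plugin.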

@miversen33
Collaborator

miversen33 commented Aug 30, 2023

@klmr I would be curious, are you running one of the M1 ARM chips? Edit: I should really learn how to read. I am firing up a Pi to see if I can recreate this on Linux on ARM.

This smells like a libuv issue as opposed to neovim directly (though of course, Neovim provides libuv and it is used heavily in the filesystem source within Neo-tree). I ask about the architecture because I have seen a handful of other weird issues in Neovim land related to running on a non-x86 architecture. I haven't tried yet, but I wonder if this can be recreated on something like a Raspberry Pi (also running ARM).

@miversen33
Collaborator

Tested this on a Raspberry Pi 4 running Manjaro and I was unable to replicate. So there must be something about the ARM architecture and how libuv relays instructions to the processor through Apple's kernel (all well beyond me). In any case, I believe this is below Neo-tree specifically :(

@klmr
Author

klmr commented Aug 31, 2023

In the meantime I have tried and failed to reproduce the issue with the official NeoVim Universal build. Turns out, the issue only seems to exist with the build from MacPorts, so I will re-report this bug to MacPorts. They have their own build infrastructure, and they must have done something slightly differently.

I agree with the assessment that this is probably ultimately a libuv issue. In fact, there is an issue which sounds suspiciously similar, neovim/neovim#22694, and which has since been fixed (luvit/luv#640).

@klmr klmr closed this as completed Aug 31, 2023