This repository has been archived by the owner on Jan 25, 2024. It is now read-only.

Consuming 30G when used via noevim coc #33

Closed
ony opened this issue Mar 29, 2021 · 25 comments · Fixed by #54
Comments


ony commented Mar 29, 2021

Hello,

I have already had two incidents in which rnix-lsp consumed almost all RAM.
Interestingly, in both cases I had nix repl running in another terminal while editing home.nix (home-manager), which at that point contained a syntax error.

# home.nix
let
  # first evaluated yesterday
  rnix-lsp = import "${builtins.fetchTarball https://github.com/nix-community/rnix-lsp/archive/master.tar.gz}";
in {
  # ...
  # Effectively generating ~/.config/nvim/coc-settings.json
    nix = {
      command = "${rnix-lsp}/bin/rnix-lsp";
      filetypes = ["nix"];
    };
}
...
Mar 29 15:49:04 ony kernel: Mem-Info:
Mar 29 15:49:04 ony kernel: active_anon:7466684 inactive_anon:441570 isolated_anon:0
                             active_file:949 inactive_file:1626 isolated_file:158
                             unevictable:0 dirty:0 writeback:0 unstable:0
                             slab_reclaimable:54731 slab_unreclaimable:58579
                             mapped:9174 shmem:12085 pagetables:38810 bounce:0
                             free:50700 free_pcp:1207 free_cma:0
..
Mar 29 15:49:04 ony kernel: Tasks state (memory values in pages):
Mar 29 15:49:04 ony kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
...
Mar 29 15:49:04 ony kernel: [  24361]  1000 24361    76923     1754   114688     1148             0 nvim
Mar 29 15:49:04 ony kernel: [  24363]  1000 24363   160515      596  1519616     6984             0 node
...
Mar 29 15:49:04 ony kernel: [  24381]  1000 24381 16812507  7618497 117530624  7033451             0 rnix-lsp
Mar 29 15:49:04 ony kernel: [  14255]  1000 14255   145768        0   368640    30960             0 nix
Mar 29 15:49:04 ony kernel: [  14868]     0 14868   114793        1   122880     1369             0 nix-daemon
...
Mar 29 15:49:04 ony kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-5.scope,task=rnix-lsp,pid=24381,uid=1000
Mar 29 15:49:04 ony kernel: Out of memory: Killed process 24381 (rnix-lsp) total-vm:67250028kB, anon-rss:30473988kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:114776kB oom_score_adj:0
Mar 29 15:49:04 ony kernel: oom_reaper: reaped process 24381 (rnix-lsp), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
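
The "30G" in the title can be cross-checked against the task table, whose memory columns are in pages. The following is a minimal sketch that assumes a 4 KiB page size (the usual x86_64 default; the log does not state it). The helper names `pages_to_kib` and `kib_to_gib` are made up for this illustration. With that assumption, the rss column for rnix-lsp (7618497 pages) works out to the anon-rss:30473988kB reported by the OOM killer.

```rust
/// Page size assumed for the reporter's machine (4 KiB; not stated in the log).
const PAGE_SIZE_KIB: u64 = 4;

/// Convert a page count from the OOM task table into KiB.
fn pages_to_kib(pages: u64) -> u64 {
    pages * PAGE_SIZE_KIB
}

/// Convert KiB into GiB as a floating-point value.
fn kib_to_gib(kib: u64) -> f64 {
    kib as f64 / (1024.0 * 1024.0)
}

fn main() {
    // `rss` and `total_vm` columns of the rnix-lsp row (pid 24381).
    let rss_kib = pages_to_kib(7_618_497);
    let total_vm_kib = pages_to_kib(16_812_507);

    println!("rss:      {} kB (~{:.1} GiB)", rss_kib, kib_to_gib(rss_kib));
    println!("total_vm: {} kB (~{:.1} GiB)", total_vm_kib, kib_to_gib(total_vm_kib));
}
```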
@Ma27 Ma27 mentioned this issue Jun 19, 2021
11 tasks
@Ma27 Ma27 self-assigned this Jul 5, 2021
@Ma27 Ma27 added this to the 0.2.0 milestone Jul 5, 2021
Member

Ma27 commented Jul 5, 2021

There's a certain chance that this got fixed in #37, but I really want to look at this before getting a 0.2.0 out.

@Ma27 Ma27 added the bug Something isn't working label Jul 5, 2021
Author

ony commented Jul 19, 2021

I was not able to reproduce the issue over the past 2 days (effective run time ~7 minutes). I tried keeping all-packages.nix open, adding rec, and doing some auto-completion. After opening the file it took ~16M, and after completion ~32M, growing to ~33M over time but later dropping to ~31.7M. Virtual memory/space is ~160M.
For the first hour I tried the latest commit. After seeing no issues, I switched to 6c12d7f to check whether I could still reproduce it on the commit from the original report.
Note that when I reported this issue I was using stable Nix, and now I use Flakes (without NIX_PATH injection, but with a registry pin for nixpkgs). I was also on NixOS 20.09 then and am on 21.05 now.

Member

Ma27 commented Jul 20, 2021

Thanks a lot for reporting back! In case you stumble upon this again, please let us know! I'll leave this open for another two weeks to see if you, I, or somebody else can reproduce this.


fufexan commented Jul 25, 2021

I've had this same problem, but it seems like the incremental expression-updates commit has fixed it. Thanks!

Member

Ma27 commented Jul 26, 2021

In case somebody still stumbles upon the issue on a recent master of rnix-lsp: it would be really awesome if you could share a coredump of the process then, because I'd really like to look at the memory itself then.

One (hacky, I guess?) way to do this is to kill rnix-lsp via kill -11 (to "simulate" a SIGSEGV) which usually causes the creation of a coredump.

Member

Ma27 commented Jul 29, 2021

Damn it, I just experienced this again :/ Unfortunately my kernel was faster at OOM-killing rnix-lsp than I was at running kill -11 on it to get a coredump. Luckily I'm writing quite a lot of Nix code at the moment, so hopefully I'll be faster next time.

Author

ony commented Jul 31, 2021

Any chance to instrument some memory debugging into rnix-lsp itself?
E.g. force memory limit to 4GB and handle ENOMEM with SIGABRT-ing self.
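
One way the memory-limit idea above might look in Rust is a global allocator that counts live bytes and aborts once a cap is exceeded. This is only a sketch under stated assumptions, not something rnix-lsp is known to ship: `CappedAlloc`, `LIMIT_BYTES` and `allocated_bytes` are hypothetical names, and the cap is enforced in-process rather than through an OS limit plus ENOMEM handling. `std::process::abort()` raises SIGABRT on Unix, which usually produces a coredump when coredumps are enabled.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical cap, following the 4 GB suggested above (assumes a 64-bit target).
const LIMIT_BYTES: usize = 4 * 1024 * 1024 * 1024;

/// Bytes currently requested through `CappedAlloc` and not yet freed.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Wraps the system allocator and aborts once the cap is exceeded.
struct CappedAlloc;

unsafe impl GlobalAlloc for CappedAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let size = layout.size();
        let before = ALLOCATED.fetch_add(size, Ordering::Relaxed);
        if before.saturating_add(size) > LIMIT_BYTES {
            // Abort instead of letting the kernel OOM-kill the process.
            std::process::abort();
        }
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CappedAlloc = CappedAlloc;

/// Current number of live bytes as seen by the counting allocator.
fn allocated_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}

fn main() {
    let before = allocated_bytes();
    // black_box keeps the optimizer from eliding the unused buffer.
    let buf = std::hint::black_box(vec![0u8; 1 << 20]);
    println!("live bytes grew by {}", allocated_bytes().saturating_sub(before));
    drop(buf);
}
```

The default `realloc` and `alloc_zeroed` of `GlobalAlloc` go through `alloc` and `dealloc`, so the counter stays consistent without overriding them.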

@Ma27 , any chance that it happened for you on Nix 2.3.x?

Member

Ma27 commented Jul 31, 2021

> E.g. force memory limit to 4GB and handle ENOMEM with SIGABRT-ing self.

Good idea :)

> any chance that it happened for you on Nix 2.3.x?

Nope, Nix 2.4


fufexan commented Aug 16, 2021

> In case somebody still stumbles upon the issue on a recent master of rnix-lsp: it would be really awesome if you could share a coredump of the process then, because I'd really like to look at the memory itself then.
>
> One (hacky, I guess?) way to do this is to kill rnix-lsp via kill -11 (to "simulate" a SIGSEGV) which usually causes the creation of a coredump.

@Ma27 I have finally managed to catch it before freezing my PC. Here's a screenshot of the coredump as my terminal somehow got messed up and I couldn't copy the text from it.
[Screenshot: rnix-dump]

The circumstances are: I was editing a fetchFromGitHub call that used lib.fakeHash and modifying the sha256 to the proper hash. As soon as I pasted the hash, rnix-lsp (or kak-lsp? not sure) started eating memory and I managed to kill it with doas pkill -11 rnix-lsp which I had prepared beforehand in a terminal nearby.

I could send the file if it helps.

Member

Ma27 commented Aug 16, 2021

> I could send the file if it helps.

Can you still access the core file? That one would be quite helpful! :)


fufexan commented Aug 16, 2021

I'm not sure whether you mean the coredump file (I don't know where it's located), but this is the nix file that caused it to leak. https://gist.github.com/fufexan/61244c3949ec631d6f1483bbbc682b61

Member

Ma27 commented Aug 16, 2021

@fufexan are you on NixOS? In that case coredumpctl gdb should give you the answer (the Storage-property).


fufexan commented Aug 16, 2021

I have found it. https://gofile.io/d/WJElaV

Member

Ma27 commented Aug 16, 2021

@fufexan one last question: do you have a custom overlay for your rnix-lsp? If not, which nixpkgs revision (nixos-version should help) did you build it from?


fufexan commented Aug 16, 2021

@Ma27 I'm using the flake in this repo along with nixpkgs-unstable (21.11.20210729.dd98b10).

Ma27 added a commit to Ma27/rnix-lsp that referenced this issue Aug 31, 2021
The main motivation is to get `crossbeam-channel` to at least 0.5.1 as
this appears to fix a memory leak[1] which is probably the root-cause of
our own memory issues[2].

This also included some changes to the LSP type-system as a few more
changes in the LSP protocol are now incorporated, such as a
clarification that numbers are usually unsigned 32bit rather than
64bit[3].

[1] https://github.com/crossbeam-rs/crossbeam/blob/crossbeam-channel-0.5.1/crossbeam-channel/CHANGELOG.md#version-051
[2] nix-community#33
[3] gluon-lang/lsp-types@f654090
@Ma27 Ma27 mentioned this issue Aug 31, 2021
Member

Ma27 commented Aug 31, 2021

@ony @fufexan there's now #49 which contains a few dependency updates (including a fix for crossbeam which resolves a severe memory leak in their code). Since I still cannot reproduce the bug reliably (although I get bitten by it occasionally), this is only a theory, unfortunately.

In order to find out whether this is an actual fix, it'd be really awesome if you could also test this PR :)


fufexan commented Sep 1, 2021

@Ma27 tried building it, but it failed. Here's the build log https://gist.github.com/fufexan/f82d75bae6cfcfb95f984bb92b07ebc4.

Member

Ma27 commented Sep 1, 2021

You'll need at least cargo+rustc 1.52 (should be available on NixOS 21.05).


fufexan commented Sep 1, 2021

I'm on unstable from a week ago, I think it should work even with that.

Member

Ma27 commented Sep 1, 2021

Your log says [naersk] cargo_version (read): 1.47.0, though, and I'm not sure why. Locally, it appears to use the correct versions for the build.


fufexan commented Sep 1, 2021

Manually overriding rnix-lsp's naersk input with an updated one made it work.


fufexan commented Sep 2, 2021

@Ma27 it seems to leak again, managed to freeze my computer while I was writing inherits.

Member

Ma27 commented Sep 2, 2021

Oh damn it! Anyway, thanks a lot for testing!

The thing is, I hit that issue twice in the last two months, so reproducing is rather hard on my side (I tried to re-do my steps, but without any success, so I'd say that it's not related to certain files or patterns).

You use kakoune, right? Do you have a Nix expression for your setup that I can build locally? Just a wild guess, but perhaps it's easier to trigger with your setup, but idk.

Anyway, I'm afraid I'll have to leave it for a few days and get back to it with a fresh mind; perhaps I'll come up with something else then.


fufexan commented Sep 2, 2021

I use Kakoune, Neovim and Emacs 😄
So far I've managed to trigger the leak with kak and nvim, as I haven't used emacs extensively.
My configs are here.

Ma27 added a commit to nix-community/rnix-parser that referenced this issue Sep 19, 2021
…ions

First of all, big thanks to @fufexan who helped me to reliably reproduce
this.

Originally discovered in `rnix-lsp`[1], but I confirmed that
`nixpkgs-fmt` is also affected.

Basically, when having an expression such as

    let
      inherit

the parser would wait for a `TOKEN_SEMICOLON` indefinitely. The actual
problem however is that `self.parse_val()` always detects the SAME
syntax-error, i.e. "unexpected EOF". This will be written indefinitely
into `self.errors`. However, `errors` is of type `Vec<ParseError>` and a
vector in Rust grows in an amortized fashion[2] which means that if an
entry is pushed and the vector exceeds the currently allocated size, it
will be ~doubled (though the exact growth-factor isn't constant).
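
The amortized growth described above can be observed with a small experiment. This is a sketch, not part of the rnix-parser change: `capacity_steps` is a hypothetical helper, and the exact growth strategy is an implementation detail of the standard library. Each reallocation enlarges the capacity by a multiple, while the memory in use stays proportional to the number of entries pushed, so a loop that never stops pushing grows without bound.

```rust
/// Push `n` dummy entries and record the capacity after every reallocation.
fn capacity_steps(n: usize) -> Vec<usize> {
    let mut errors: Vec<u64> = Vec::new();
    let mut steps = Vec::new();
    let mut last = errors.capacity();
    for i in 0..n {
        errors.push(i as u64);
        if errors.capacity() != last {
            // The vector ran out of room and moved to a larger buffer.
            last = errors.capacity();
            steps.push(last);
        }
    }
    steps
}

fn main() {
    let steps = capacity_steps(1_000_000);
    println!("{} reallocations for 1,000,000 pushes: {:?}", steps.len(), steps);
}
```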

This essentially means that the buffer is growing exponentially pretty fast
and - according to KDE heaptrack - my system allocated ~9.5GB after 20s
while running some tests.

I added an exit-condition to the loop traversing through
`inherit`-subexpressions to avoid that. Checking for an "unexpected EOF"
is actually sufficient here:

* There's either a `;` later in the expression causing the loop to
  terminate and causing an actual "unexpected token" error then.

* Otherwise, `parse_val` will go through the tokens until a matching
  semicolon is found (which is not the case) and then reach the end of
  the file. In that case, `unexpected EOF` is returned by `parse_val`.

[1] nix-community/rnix-lsp#33
[2] https://www.cs.cornell.edu/courses/cs3110/2011sp/Lectures/lec20-amortized/amortized.htm
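
The exit condition described in the commit message can be illustrated with a toy parser. This is a simplified sketch and not the actual rnix-parser code: `Token`, `ParseError`, `Parser`, `next_token` and `parse_inherit` are stand-ins invented for the example, and the `inherit` keyword itself is assumed to be consumed already. The point is that hitting the end of input records the error once and leaves the loop instead of recording it forever.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Semicolon,
}

#[derive(Debug, Clone, PartialEq)]
enum ParseError {
    UnexpectedEof,
}

struct Parser {
    tokens: Vec<Token>,
    pos: usize,
    errors: Vec<ParseError>,
}

impl Parser {
    fn new(tokens: Vec<Token>) -> Self {
        Parser { tokens, pos: 0, errors: Vec::new() }
    }

    /// Return the next token and advance, or `None` at the end of input.
    fn next_token(&mut self) -> Option<Token> {
        let tok = self.tokens.get(self.pos).cloned();
        if tok.is_some() {
            self.pos += 1;
        }
        tok
    }

    /// Collect the names following `inherit` up to the closing `;`.
    fn parse_inherit(&mut self) -> Vec<String> {
        let mut names = Vec::new();
        loop {
            match self.next_token() {
                Some(Token::Semicolon) => break,
                Some(Token::Ident(name)) => names.push(name),
                None => {
                    self.errors.push(ParseError::UnexpectedEof);
                    // Without this break the same error would be pushed forever.
                    break;
                }
            }
        }
        names
    }
}

fn main() {
    // Token stream for `inherit a b` with the `;` missing.
    let mut parser = Parser::new(vec![Token::Ident("a".into()), Token::Ident("b".into())]);
    let names = parser.parse_inherit();
    println!("names: {:?}, errors: {:?}", names, parser.errors);
}
```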
Member

Ma27 commented Sep 19, 2021

-> https://github.com/nix-community/rnix-parser/pull/31/files

Big thanks to @fufexan for helping me to reproduce this! Feel free to remind me at the next in-person NixCon (or something similar) that I owe you a beverage of your choice :)

Ma27 added a commit to Ma27/rnix-lsp that referenced this issue Sep 23, 2021
This release incorporates a fix for the severe memleaks we're
experiencing.

Fixes nix-community#33
Ma27 added a commit to Ma27/rnix-lsp that referenced this issue Sep 23, 2021
* Write a dedication to jD91mZM2
* Say thank you to fufexan who helped us to reproduce and investigate
  the memory leak from nix-community#33.
@Ma27 Ma27 mentioned this issue Sep 23, 2021
@Ma27 Ma27 closed this as completed in #54 Sep 23, 2021