Segmentation fault when encountering a TemplateHaskell splice #800

Closed
istathar opened this issue Feb 20, 2020 · 48 comments
Labels
can-workaround, component: ghcide, type: bug

Comments

@istathar

We've had ghcide crashing, and the editor trying to reload it leads to a tight loop that the editor can't escape. Trying the ghcide binary from the command-line in test mode leads to a segfault:

Step 6/6: Type checking the files
Segmentation fault (core dumped)

This is triggered when using TemplateHaskell. We have a splice fromPackage :: Q Exp which returns some metadata from the project's .cabal file:

version :: Version
version = $(fromPackage)

Removing this definition from the source file (in this case RenderMain.hs lines 13-14) allows the ghcide binary to complete its test run:

Step 6/6: Type checking the files

Completed (1 file worked, 0 files failed)
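For readers unfamiliar with the pattern, here is a minimal sketch of what a splice of this kind can look like. It is not the real fromPackage: the module name, versionFromCabal, findField and the naive line-based parsing are all hypothetical, and it assumes only base and template-haskell.

```haskell
module PackageInfo (versionFromCabal, findField) where

import Data.Char (isSpace, toLower)
import Data.List (dropWhileEnd, isPrefixOf)
import Language.Haskell.TH (Exp, Q, litE, runIO, stringL)

-- | Find the value of a top-level "name: value" field in the text of a
-- .cabal file. Deliberately naive: no sections, no continuation lines.
findField :: String -> String -> Maybe String
findField name contents =
    case [v | l <- lines contents, Just v <- [match l]] of
        (v : _) -> Just v
        [] -> Nothing
  where
    key = map toLower name ++ ":"
    match l
        | key `isPrefixOf` map toLower l = Just (trim (drop (length key) l))
        | otherwise = Nothing
    trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Hypothetical stand-in for fromPackage: read the given .cabal file
-- while the splice runs and produce its version field as a String literal.
versionFromCabal :: FilePath -> Q Exp
versionFromCabal path = do
    contents <- runIO (readFile path)
    case findField "version" contents of
        Just v -> litE (stringL v)
        Nothing -> fail ("no version field in " ++ path)
```

Because of the Template Haskell stage restriction, the splice has to be used from a different module than the one defining it, e.g. `version = $(versionFromCabal "hello.cabal")`. A relative path like that is resolved against the working directory of whatever process runs the splice, which may differ between a command-line build and an IDE.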

I had earlier speculated that QuasiQuotes were to blame, but the file checks ok with those [quote| ... |] blocks still in the file.

Other files in the project load and check fine. So long as you avoid having the problematic file open in an editor tab, you can avoid the crash. Not ideal, but at least a workaround.

@ndmitchell
Collaborator

Another workaround is to use #ifdef __GHCIDE__ around the problematic bit.

@istathar
Author

Fencing the splice with a preprocessor directive (and enabling CPP) indeed works to prevent ghcide from crashing.

version :: Version
#ifdef __GHCIDE__
version = "0"
#else
version = $(fromPackage)
#endif
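For anyone wanting to try the same fence, a self-contained sketch of a whole module might look like the following. The module name and the stand-in splice are hypothetical, and version is a plain String here rather than our Version type; the point is only that both the CPP and TemplateHaskell extensions need to be enabled.

```haskell
{-# LANGUAGE CPP #-}
{-# LANGUAGE TemplateHaskell #-}

module VersionInfo (version) where

import Language.Haskell.TH (litE, stringL)

-- | Application version. When ghcide loads this module it sees only the
-- placeholder branch, so the splice is never run inside the IDE process.
version :: String
#ifdef __GHCIDE__
version = "0"
#else
-- Stand-in splice; a real project would call its own Q Exp here.
version = $(litE (stringL "0.1.1"))
#endif
```

The cost of this approach is that the IDE type checks a slightly different program than the one the build compiles, so the placeholder branch has to be kept in step with the real type by hand.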

Do you want a backtrace from when it segfaults? I've not had much success getting GDB to play nice with GHC but I'll give it a try if you want.

@ndmitchell
Collaborator

If it's easy then sure, but I'm not sure if anyone is likely to look at it. TH related things aren't our strong spot.

@mpickering
Contributor

Is this reproducible @afcowie ?

@istathar
Author

istathar commented Feb 21, 2020

@mpickering yes, very reproducible. I only have one Template Haskell splice in my collection of projects—only time I've ever written TH code—but I hazard a guess any Template Haskell splice would trigger this.

(this is to say I have three separate projects doing this in one file each and ghcide segfaults with all of them)

@mpickering
Contributor

Template Haskell does generally work so I think it is probably something specific to do with this splice.
What platform are you running on? What GHC version?

@mpickering
Contributor

mpickering commented Feb 21, 2020

fromPackage looks excessively dodgy. Perhaps the issue is that it is calling error which manifests as a segfault somehow. If you modify it to call fail does that improve the situation?
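To make the suggestion concrete, here is a rough sketch of the two styles. The names (SpliceFailure, requireField, requireFieldUnsafe, demo) are made up for illustration and are not taken from the code under discussion.

```haskell
module SpliceFailure (requireField, requireFieldUnsafe, demo) where

import Language.Haskell.TH (Exp, Q, litE, runQ, stringL)

-- | Reports a missing value with 'fail', which GHC turns into an ordinary
-- compile-time error at the splice site.
requireField :: String -> Maybe String -> Q Exp
requireField name Nothing = fail ("missing field: " ++ name)
requireField _ (Just v) = litE (stringL v)

-- | Same logic, but signalling the problem with 'error'. GHC normally
-- reports this as an exception raised while running compile-time code.
requireFieldUnsafe :: String -> Maybe String -> Q Exp
requireFieldUnsafe name Nothing = error ("missing field: " ++ name)
requireFieldUnsafe _ (Just v) = litE (stringL v)

-- | Run the happy path in IO, outside of any splice, to inspect the
-- expression that would be generated.
demo :: IO Exp
demo = runQ (requireField "version" (Just "0.1.1"))
```

Under a normal GHC both failure styles should end in a compile error rather than a crash, so this is mainly a way to rule the error path in or out.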

istathar referenced this issue in istathar/technique Feb 21, 2020
For fascinating reasons the **ghcide** Language Server Protocol server
crashes when faced with a Template Haskell splice. Guard the code
snippet which gets the application version from the package metadata
with a CPP directive so we can open this source file in VS Code without
it killing the IDE.

Upstream issue is https://github.com/digital-asset/ghcide/issues/444
@istathar
Author

@mpickering I'm sorry you find our code excessively dodgy. We've avoided TemplateHaskell for years; it seems to cause more problems than it's worth, but it worked for us in this case. I'll try your suggestion.

@istathar
Author

@mpickering No, changing to fail doesn't change the outcome; still segfaults. I'll poke around in there to see if there's something else going on though.

@mpickering
Contributor

@afcowie Can you produce a standalone repo with a hie.yaml file which reproduces this and I will investigate.

istathar referenced this issue in istathar/hello Feb 22, 2020
This builds against `lts-15.0` with GHC 8.8.2; unfortunately **ghcide**
segfaults when checking it for errors. Demonstrates
https://github.com/digital-asset/ghcide/issues/444 as requested by
@mpickering whose help we hugely appreciate.
@istathar
Author

@mpickering done. If you clone https://github.com/afcowie/hello/tree/ghcide-issue-444 you will get an essentially bare repository with a single program that demonstrates the problem. stack build should succeed; stack exec -- hello and/or stack run should result in "Hello World" on your console:

$ stack exec -- hello
Hello World
$

however

$ ghcide src/HelloMain.hs
Step 6/6: Type checking the files
Segmentation fault (core dumped)
$

I should note that this is with GHC 8.8.2, so you'll need a version of ghcide built against that, as I understand it. The problem was also exhibited with earlier versions of GHC. If you're using Stack but GHC 8.6.5, say, you can use:

resolver: lts-14.27

extra-deps:
  - core-text-0.2.3.3
  - core-data-0.2.1.5
  - core-program-0.2.4.2

and get the same demonstration.

@mpickering
Contributor

The segfault is coming from calling length in Data.Text.Short. Here is a more minimal example.

https://gist.github.com/d335ac940a97a574891c947578636fa1

@mpickering
Contributor

The ByteArray# and length look correct to me so something is going wrong with the call to c_text_short_length.

@mpickering
Contributor

To further deepen the mystery: if I add text-short as a local package then it works; if it comes from the global cabal store then it doesn't. I'm going to stop investigating this bug now!

@istathar
Author

@hvr This is extraordinary debugging work by @mpickering. If you look at the Haskell source for HelloMain.hs in Matt's Gist he worked down to a very minimal example. Do you think you might have any insight about what's happening here? Amazing bug, really.

@hvr
Member

hvr commented Mar 28, 2020

@mpickering @afcowie I tried to reproduce this by cloning the gist (fixing up the .cabal file) and cabal run-ing with a couple GHC versions (on Ubuntu 18.04, with GHCs from my PPA) but wasn't able to reproduce yet; what environment did the segfaults occur on?

PS: Does this bug-repro really need ghcide to reproduce -- i.e. it can't be reproduced without it?

@mpickering
Contributor

@hvr I couldn't reproduce it with a normal GHC executable, only when using ghcide, which is why I didn't raise an upstream bug report. I tried a few things like normal GHC, ghci, ghci with -fobject-code and so on but couldn't get the segfault to trigger.

Btw, I never said this in the issue but the segfault only happens if the string is at least 16 bytes long. Strings under 16 bytes seem to work.

@hvr
Member

hvr commented Mar 29, 2020

@mpickering well, I tried with ghcide too and still wasn't able to reproduce:

$ cabal build -w ghc-8.8.3
Resolving dependencies...
Build profile: -w ghc-8.8.3 -O1
In order, the following will be built (use -v for more details):
 - hello-0.1.1 (exe:hello) (first run)
Configuring executable 'hello' for hello-0.1.1..
Preprocessing executable 'hello' for hello-0.1.1..
Building executable 'hello' for hello-0.1.1..
[1 of 2] Compiling Main             ( src/HelloMain.hs, /home/hvr/Haskell/bugs/text-short-mpickering/dist-newstyle/build/x86_64-linux/ghc-8.8.3/hello-0.1.1/x/hello/build/hello/hello-tmp/Main.o )
[2 of 2] Compiling Paths_hello      ( /home/hvr/Haskell/bugs/text-short-mpickering/dist-newstyle/build/x86_64-linux/ghc-8.8.3/hello-0.1.1/x/hello/build/hello/autogen/Paths_hello.hs, /home/hvr/Haskell/bugs/text-short-mpickering/dist-newstyle/build/x86_64-linux/ghc-8.8.3/hello-0.1.1/x/hello/build/hello/hello-tmp/Paths_hello.o )
Linking /home/hvr/Haskell/bugs/text-short-mpickering/dist-newstyle/build/x86_64-linux/ghc-8.8.3/hello-0.1.1/x/hello/build/hello/hello ...
$ PATH=/opt/ghc/8.8.3/bin:$PATH ghcide src/HelloMain.hs 
ghcide version: 0.1.0 (GHC: 8.8.3) (PATH: /stuff/dot-cabal/store/ghc-8.8.3/ghcide-0.1.0-e016e4e555362fde9aa6e177d286bd58be2d80403925ffef6459d40d32cb0a0e/bin/ghcide)
Ghcide setup tester in /home/hvr/Haskell/bugs/text-short-mpickering.
Report bugs at https://github.com/digital-asset/ghcide/issues…

Step 1/6: Finding files to test in /home/hvr/Haskell/bugs/text-short-mpickering
Found 1 files

Step 2/6: Looking for hie.yaml files that control setup
Found 1 cradle

Step 3/6, Cradle 1/1: Loading /home/hvr/Haskell/bugs/text-short-mpickering/hie.yaml

Step 4/6, Cradle 1/1: Loading GHC Session
> Resolving dependencies...
> Build profile: -w ghc-8.8.3 -O1
> In order, the following will be built (use -v for more details):
>  - hello-0.1.1 (exe:hello) (configuration changed)
> Configuring executable 'hello' for hello-0.1.1..
> Preprocessing executable 'hello' for hello-0.1.1..

Step 5/6: Initializing the IDE

Step 6/6: Type checking the files

Completed (1 file worked, 0 files failed)

@istathar
Author

Bummer. Thanks for trying @hvr. I just tried with Matt's minimal example and it still segfaults the same way.

@mpickering
Contributor

@afcowie @hvr I am using NixOS; what about you, Andrew? Perhaps I should try with a different C compiler?

@mpickering
Contributor

I just tried the example again with 8.8.1 and it worked. Are you building a recent commit from master, @afcowie?

@istathar
Author

@mpickering Fedora 31 Linux on an Intel i9; GHC 8.8.3 as specified by the lts-15.5 Stackage snapshot (and brought in by stack) which brings in text-short 0.1.3. ghcide was master as at 27 March.

@domenkozar
Contributor

@afcowie if you're using ghcide-nix, try uncommenting https://github.com/cachix/ghcide-nix/blob/master/nix/default.nix#L19 and check if that fixes it.

@istathar
Author

istathar commented Apr 3, 2020

I have had some odd behaviours with -dynamic in the past - I had to stop [trying to] use that. Perhaps something built that way is lurking in my caches. I will blow away my ~/.stack and rebuild fresh.

@istathar
Author

istathar commented Apr 4, 2020

> I will blow away my ~/.stack and rebuild fresh.

So I did a full from scratch re-installation of all the Haskell infrastructure, and unfortunately the problem endures.

@jneira
Member

jneira commented Oct 5, 2020

Maybe this one is fixed by haskell/ghcide#836? @istathar could you give a try, please?

@istathar
Author

istathar commented Oct 5, 2020

@jneira alas. I just tried it with ghcide from 'master', and I'm afraid to report that when I remove the #ifdef workaround it once again segfaults:

> Configuring GHCi with the following packages: publish
> /home/andrew/src/aesiniath/publish/.stack-work/install/x86_64-linux-tinfo6/bdcec458283ae25b661c440111c3bb5c26aa34a6c34f4902d47dcef25a0df0b9/8.8.4/pkgdb:/home/andrew/.stack/snapshots/x86_64-linux-tinfo6/bdcec458283ae25b661c440111c3bb5c26aa34a6c34f4902d47dcef25a0df0b9/8.8.4/pkgdb:/home/andrew/.stack/programs/x86_64-linux/ghc-tinfo6-8.8.4/lib/ghc-8.8.4/package.conf.d
[INFO] Using interface files cache dir: /home/andrew/.cache/ghcide/main-f9d60a41ddf9df52f88d70bf88584bd1798823c3
[INFO] Making new HscEnv[main]
Segmentation fault (core dumped)

@DylanSp

DylanSp commented Nov 7, 2020

I'm also getting this error with a Stack project on Linux (no Nix), initialized with stack new todomvc-backend servant; the problematic line is $(deriveJSON defaultOptions ''User). My comment here has the logs from running haskell-language-server against my code.

@FinleyMcIlwaine
Collaborator

I'm also running into this with a cabal project on MacOS. The specific thing that's causing it for me is a typed TH quotation. Using GHC 8.10.2 and HLS 0.9.0. I can provide more information if it may help lead to a fix!

@mpickering
Contributor

@FinleyMcIlwaine That is probably unrelated to this problem. Please make a new ticket.

@anmolitor

I'm running into a segfault problem as well using postgresql-typed's template haskell. Tried with GHC 8.8.4, 8.10.2 and 8.10.4.
#1297 (comment)
Intellij Haskell does not seem to have problems with this, probably because it isn't ghcide-based?

@jneira
Member

jneira commented Apr 19, 2021

> Intellij Haskell does not seem to have problems with this, probably because it isn't ghcide-based?

Yeah, it does not use ghcide. Comparing with IntelliJ could be an investigation path, thanks.

@NSilv

NSilv commented Jun 14, 2021

Just chiming in to say I get segfaults as well, only with template haskell on (specifically, quasiquotes from the path library). Although in my case, it only happens inside a nix-shell. (GHC 8.10.4, on WSL2+Fedora)

@franleplant

I hit this same problem, and the only way I could make it work was to run HLS inside nix-shell. In particular, I am developing Plutus smart contracts, so I used the nix-shell config for Plutus; find it here

@ZabelTech

Hi, I got the same problem with HLS 1.3.0.0 and GHC 8.8.4.
HLS 1.1.0.0 and GHC 8.10.4 works fine.

@vst

vst commented Sep 23, 2021

+1

HLS: 1.4.0.0
GHC: 8.10.7
Stackage LTS: lts-18.10

This one is related to file-embed library usage. Plain vanilla usage works:

$(makeRelativeToProject "<relative-path-to-file>" >>= embedStringFile)

... but this helper causes issues (I guess at the call-site):

includeFile :: FilePath -> Q Exp
includeFile = makeRelativeToProject >=> embedStringFile

This was a quick hack to mitigate #481. So I am not sure if it is separate from #481, but thought I should report.

Also, in the past, I experienced issues with quasiquotes from the path library as reported earlier. I did not check with the recent versions, though.

@DylanSp

DylanSp commented Oct 1, 2021

I'm running into what I suspect might be this problem when trying to use HLS and VS Code on Wasp's Haskell component. The symptoms I see in VS Code are a crash loop, similar to #1297, with no errors reported from the server in the VS Code output.

When I run HLS 1.4.0 from an isolated install (installed with ghcup install hls --isolate ~/ghcup-isolated/hls, ran from the waspc directory with ~/ghcup-isolated/hls/haskell-language-server-8.10.4 .), I get a segfault, which is why I think this issue might be the problem. I tried inspecting the core dump with gdb, but backtracing didn't show any source info. If there's any other analysis I can run on the core dump, let me know, it's not my area of expertise.

I tried to replicate by building from source; I cloned this repo, and checked out the 1.4.0 tag. Building with Stack with or without debug symbols, with or without optimization, I wasn't able to replicate the issue; running the built executable against waspc ran to completion successfully every time. I tried:

  • stack build
  • stack build --ghc-options="-O2"
  • stack build --no-strip --ghc-options="-g"
  • stack build --no-strip --ghc-options="-g -O2"

I haven't tried building from source with Cabal yet. I don't know if there are any other options/flags I should use to replicate what's used for the binary fetched by ghcup.

@jneira
Member

jneira commented Oct 1, 2021

@DylanSp many thanks for the detailed update and for trying to reproduce the problem.
It seems building from source fixes the issue, so I guess the way the needed libraries get linked locally is behind it. What is your OS?
It would be great to confirm that building from source is a workaround for this.
Nowadays ghcup lets you build HLS from source (see ghcup compile hls --help), in case you find it useful.

@DylanSp

DylanSp commented Oct 1, 2021

@jneira My OS is Linux Mint 18, based on Ubuntu 16.04, running on x64. It's entirely possible that something's weird with linking with local libraries; I had errors building an unrelated Haskell project due to GHC 8.10.7/base 4.14.3.0 wanting a newer glibc than what my OS provides. I'll take a stab at using ghcup to compile from source and see if I can replicate the issue that way.

@DylanSp

DylanSp commented Oct 1, 2021

Tried building HLS from source with ghcup compile. That required installing GHC 8.10.4 first with ghcup install ghc 8.10.4; that failed due to

/lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.27' not found (required by libraries/base/dist-install/build/libHSbase-4.14.1.0-ghc8.10.4.so)

Will try using a Docker container to build against a more recent libc.

@jneira
Member

jneira commented Oct 1, 2021

The prebuilt binary is built on Ubuntu 18, so it seems this could be related. Other issues about crashes with TH are not reproduced when building HLS from source, so I think they are related:

@DylanSp

DylanSp commented Oct 3, 2021

I was able to reproduce the core dump when installing the prebuilt HLS binary in an Ubuntu 18.04 container. I tried building HLS from source in an 18.04 container as well; that ran successfully to completion. This C assertion failure fired after HLS finished analyzing everything:

[...]
Completed (179 files worked, 3 files failed)
haskell-language-server-8.10.4: allocatestack.c:384: advise_stack_range: Assertion `freesize < size' failed.

Aborted (core dumped)

https://gist.github.com/DylanSp/aea8643a19e883a19f974e652aada853 can reproduce the error.

$ docker build -t hls_prebuilt:18.04 .
$ docker run -it hls_prebuilt:18.04 bash
root@[container ID]:/wasp/wasp/waspc# stack build
root@[container ID]:/wasp/wasp/waspc# /hls_isolated/haskell-language-server 8.10.4 .

@DylanSp

DylanSp commented Oct 3, 2021

After further investigation, I was able to get the same assertion failure with HLS built from source, in both Ubuntu 18.04 and 20.04 environments. The problem seems to be related to the prebuilt HLS binary using static linking; the problem didn't occur when HLS was built without the --enable-executable-static flag. Furthermore, when building with that flag enabled, the linker reported several warnings about needing "at runtime the shared libraries from the glibc version used for linking" for various functions. Dockerfiles for reproduction and the full text of the linker warnings are in https://gist.github.com/DylanSp/19a3b076bbf53320cbae5573790cd50e.

@jneira
Member

jneira commented Oct 4, 2021

@DylanSp many thanks for the analysis and reproduction. So we can state that building from source with dynamic linking seems to work around the issue (as with the other ones). If any of the reporters can reproduce it with that setup, please update the issue, thanks!

@DylanSp

DylanSp commented Oct 4, 2021

@jneira No problem; glad I can help by suggesting a workaround, and I hope with my info you can track down the root cause.

@jneira
Member

jneira commented Nov 26, 2021

This issue was really useful for making progress on diagnosing Template Haskell related problems; thanks very much to all for the effort put into investigations and reproductions.
I think we can close it, as we have newer issues with more up-to-date info about the possible fix and the dynamic linking workaround (see #1431 and #2000).

@jneira jneira closed this as completed Nov 26, 2021
@istathar
Author

istathar commented Jan 19, 2022

I hate to say it but I think we need to re-open this (cc @jneira).

I'm hitting the segfault again, with haskell-language-server-1.5.1-linux-8.10.7. The command-line build (in our case stack build) succeeds; HLS fails, unfortunately:

$ ~/.config/Code/User/globalStorage/haskell.haskell/haskell-language-server-1.5.1-linux-8.10.7 
haskell-language-server version: 1.5.1.0 (GHC: 8.10.7) (PATH: /home/andrew/.config/Code/User/globalStorage/haskell.haskell/haskell-language-server-1.5.1-linux-8.10.7) (GIT hash: 745ef26f406dbdd5e4a538585f8519af9f1ccb09)
 ghcide setup tester in...

Step 1/4: Finding files to test...

...

2022-01-19 12:44:41.728446358 [ThreadId 252] INFO hls:	Making new HscEnv[main,main,main]
Segmentation fault (core dumped)

  • The CPP workaround above to #ifdef out the TH splice gets the tester and the IDE past the failure!

  • Using an explicit hie.yaml or the implicit one had no impact.

  • Things were working fine without this workaround for months if not longer, so something has changed that has caused this to regress a wee touch.

@jneira jneira reopened this Jan 19, 2022
@jneira
Member

jneira commented Jan 31, 2022

I am going to close this issue, as all compiler crashes seem to have the same root cause:

If any of you think this issue should not be grouped with the generic one, feel free to reopen it (with a brief explanation if possible).
Thanks all!
