The case of the mysterious segfault loop #2314

Closed
GavinRay97 opened this issue Oct 29, 2021 · 17 comments
Labels
type: bug Something isn't right: doesn't work as intended, documentation is missing/outdated, etc..

Comments

@GavinRay97

GavinRay97 commented Oct 29, 2021

I do not know anything about/write Haskell, but I have been trying to make the tooling and experience better for contributors and folks on our team.

Part of our codebase is in Haskell. I wrote a development Dockerfile that reproducibly creates an environment with the deps needed for our Haskell app -- but I am unable to get HLS to function properly in it ☹️

Probably user error, but taking an informal poll shows:

  • Approximately 30% of our Haskell devs are unable to get HLS working
  • 100% of devs who responded said they use or would like to use HLS, if it worked
  • Trend appears to be that those who have issues are running Linux. macOS users don't report trouble.

What happens is that it builds, and then gets stuck in a segfault loop.

To make this easy to reproduce, I've containerized everything -- you should be able to open it in your browser or use a VS Code Devcontainer locally to get an identical environment to the one that is broken.

Your environment

Output of haskell-language-server --probe-tools or haskell-language-server-wrapper --probe-tools:

@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ~/.ghcup/bin/haskell-language-server-wrapper --probe-tools
haskell-language-server version: 1.4.0.0 (GHC: 8.10.4) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-wrapper-1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Tool versions found on the $PATH
cabal:          3.6.2.0
stack:          Not found
ghc:            8.10.2

Which OS do you use:

Which lsp-client do you use:

  • VS Code

Describe your project (alternative: link to the project):


Steps to reproduce

Without VS Code or Codespaces at all

  • Use the Dockerfile at .devcontainer/Dockerfile
  • Set up a bind mount over /graphql-engine, and run the Docker image (either manually or with Compose)

In a browser

  1. Go here: https://github.com/GavinRay97/graphql-engine
  2. Press "New Codespace", as in image below

image

In VS Code locally/offline

In VS Code locally, but connected to a remote Codespace ("Thin Client")

  • Install the Codespaces extension in VS Code
  • On the navigation panel thing, click the Remote Explorer icon (circled in red), then from the dropdown at the top (circled in red) select Codespaces, and either press + to create a new one or click to connect to an existing:
    • image

In a text editor like vim, emacs, etc., connected to a remote Codespace ("Thin Client")

Expected behaviour

It doesn't segfault, or instead of segfaulting it prints helpful debug info before dying (I have tried turning verbose logging on, no dice 🙁)

Actual behaviour

It starts, segfaults at random (no pattern), and restarts itself, repeating the loop.

@pepeiborra
Collaborator

Is HLS running inside the container or locally?

@GavinRay97
Author

Hey, thanks for the response =D

HLS is running from inside the container -- everything has been kept in-container to be reproducible. The hope being once it works once, then it works forever 🤞 And everyone wanting a working setup can use that image or Devcontainer or Codespace.

@jneira added the type: bug and type: setup labels on Oct 30, 2021
@pepeiborra
Collaborator

If the environment is reproducible, then I have no idea why HLS would segfault for only ~30% of the users.

@GavinRay97
Author

GavinRay97 commented Oct 30, 2021

Ahh -- maybe some miscommunication on my end there, sorry.

Traditionally, everyone has set up the project locally.

We don't have very extensive docs on how to do this. There are a lot of implicit apt libs needed, and specific versions, plus specific versions of GHC, etc.

So the onboarding/setup process for contributors and devs can be somewhat painful.

The Dev container I've posted here is an attempt to help that -- nobody is using it yet, though there is interest around it.

I am unable to get HLS working inside of the Dev container.

And I figured that starting from a reproducible container environment would make it much easier to talk about/debug this, since everyone can be on the same page 🙂

If it's possible to get HLS working in this/a container setup, then everyone will have 100% HLS success rate that wishes to use it 🎉 🥳

@GavinRay97
Author

GavinRay97 commented Oct 30, 2021

Also feel free to tell me "sorry, you're on your own"/"not my problem", I wouldn't take it personally.
I just felt like I had to at least give reaching out a shot, you know? 😅

@pepeiborra
Collaborator

I don't have time to go through the contributing notes. Can you explain what is preventing HLS from working in the Dev container? Have you tried cabal install haskell-language-server?

@GavinRay97
Author

GavinRay97 commented Oct 30, 2021

I don't have time to go through the contributing notes.

No worries, that was more just an attempt to point out the motivation behind the container

Have you tried cabal install haskell-language-server?

I have haskell-language-server in ~/.ghcup/bin (not sure if this has the same effect? I know NOTHING about Haskell or its ecosystem/tooling):

@GavinRay97 ➜ /workspaces/graphql-engine (master) $ ls ~/.ghcup/bin
cabal          ghci-8.10.2     haddock-8.10.2                        haskell-language-server-8.10.5        haskell-language-server-8.6.4~1.4.0  haskell-language-server-9.0.1          hpc            runghc-8.10
cabal-3.6.2.0  ghc-pkg         haskell-language-server-8.10.2        haskell-language-server-8.10.5~1.4.0  haskell-language-server-8.6.5        haskell-language-server-9.0.1~1.4.0    hpc-8.10       runghc-8.10.2
ghc            ghc-pkg-8.10    haskell-language-server-8.10.2~1.4.0  haskell-language-server-8.10.6        haskell-language-server-8.6.5~1.4.0  haskell-language-server-wrapper        hpc-8.10.2     runhaskell
ghc-8.10       ghc-pkg-8.10.2  haskell-language-server-8.10.3        haskell-language-server-8.10.6~1.4.0  haskell-language-server-8.8.3        haskell-language-server-wrapper-1.4.0  hsc2hs         runhaskell-8.10
ghc-8.10.2     ghcup           haskell-language-server-8.10.3~1.4.0  haskell-language-server-8.10.7        haskell-language-server-8.8.3~1.4.0  hp2ps                                  hsc2hs-8.10    runhaskell-8.10.2
ghci           haddock         haskell-language-server-8.10.4        haskell-language-server-8.10.7~1.4.0  haskell-language-server-8.8.4        hp2ps-8.10                             hsc2hs-8.10.2
ghci-8.10      haddock-8.10    haskell-language-server-8.10.4~1.4.0  haskell-language-server-8.6.4         haskell-language-server-8.8.4~1.4.0  hp2ps-8.10.2                           runghc
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ~/.ghcup/bin/haskell-language-server-wrapper --probe-tools
haskell-language-server version: 1.4.0.0 (GHC: 8.10.4) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-wrapper-1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Tool versions found on the $PATH
cabal:          3.6.2.0
stack:          Not found
ghc:            8.10.2

Can you explain what is preventing HLS from working in the Dev container?

Sure:

  • HLS starts, segfaults at random (no pattern), and restarts itself, repeating the loop.

I have run with --debug and maximum verbosity, there's no apparent pattern or specific file.

When it segfaults, no errors/warnings are printed beforehand. It just terminates.
Let me collect some logfiles for both the LSP in VS Code and the HLS binary, will upload here.

Here are comments from teammates, one mentions something about building it from source with some flags fixing it for him:

image

@GavinRay97
Author

GavinRay97 commented Oct 30, 2021

Okay, I have collected a lot of logs, and also noticed some behavior:

  • HLS for VS Code does not use the project GHC + ghcup version. There is GHC-8.6.5 installed in /bin, and it chooses to use that one. Even if manually setting the haskell-language-server binary in settings -- it does not change it:
    • Running haskell-language-server-wrapper detects it properly. I think there might be some disconnect between what the VS Code extension is doing, and what haskell-language-server-wrapper is doing:
    • Consulting the cradle to get project GHC version...
      Project GHC version: 8.10.2
      haskell-language-server exe candidates: ["haskell-language-server-8.10.2","haskell-language-server"]
      Launching haskell-language-server exe at:/home/codespace/.ghcup/bin/haskell-language-server-8.10.2
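The wrapper log above suggests how the server binary is chosen: it derives candidate names from the project GHC version and launches the first one it finds. A minimal Python sketch of that lookup follows. It is an illustration only, not the wrapper's actual implementation (which is written in Haskell), and the function names `exe_candidates` and `pick_server` are made up for this example.

```python
import shutil
from typing import List, Optional


def exe_candidates(ghc_version: str) -> List[str]:
    """Candidate binary names, most specific first (mirrors the list in the log)."""
    return [f"haskell-language-server-{ghc_version}", "haskell-language-server"]


def pick_server(ghc_version: str, path_var: Optional[str] = None) -> Optional[str]:
    """Return the first candidate found on the given PATH, or None.

    path_var=None means "use the PATH of the current process".
    """
    for name in exe_candidates(ghc_version):
        found = shutil.which(name, path=path_var)
        if found is not None:
            return found
    return None


print(exe_candidates("8.10.2"))
print(pick_server("8.10.2"))  # None unless HLS is on this machine's PATH
```

The point of the sketch is that the result depends on two inputs, the detected GHC version and the PATH being searched, so a wrong answer for either one leads to a different binary being launched.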
// .vscode/settings.json
{
    "haskell.logFile": "/workspaces/graphql-engine/haskell-vscode-logs.txt",
    "haskell.trace.client": "debug",
    "haskell.trace.server": "messages",
    "haskell.serverExecutablePath": "~/.ghcup/bin/haskell-language-server-8.10.2"
}

Here is relevant output from first startup of VS Code HLS in the container:

  • Notice: The GHC version for the project or file: 8.6.5
  • What is strange is that if I run the haskell-language-server-wrapper-1.4.0-linux --project-ghc-version command myself, it even reports 8.10.2
    • @GavinRay97/workspaces/graphql-engine (master ✗) $ /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version
      No 'hie.yaml' found. Try to discover the project type!
      8.10.2
  • If you change the haskell.serverExecutablePath, it won't take effect unless you install/re-install the extension
    • Even then, it appears to be assuming that GHC is installed in /usr/lib/ghc:

      • setInitialDynFlags: Can't parse "/usr/lib/ghc/platformConstants"
      • @GavinRay97/workspaces/graphql-engine (master ✗) $ ls ~/.ghcup/ghc/8.10.2/lib/ghc-8.10.2 | grep platform
        platformConstants
    • 8.10.2 log:
      [client][INFO] Trying to find the server executable in: ~/.ghcup/bin/haskell-language-server-8.10.2
      [client][INFO] Location after path variables subsitution: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2
      [client][INFO] Activating the language server in the workspace folder: /workspaces/graphql-engine
      [client][INFO] run command: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2 --lsp -d -l /workspaces/graphql-engine/haskell-vscode-logs.txt
      [client][INFO] debug command: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2 --lsp -d -l /workspaces/graphql-engine/haskell-vscode-logs.txt
      [client][INFO] document selector patten: /workspaces/graphql-engine/**/*
      [client][INFO] Starting language server
      haskell-language-server version: 1.4.0.0 (GHC: 8.10.2) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2~1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
      Starting (haskell-language-server)LSP server...
        with arguments: GhcideArguments {argsCommand = LSP, argsCwd = Nothing, argsShakeProfiling = Nothing, argsTesting = False, argsExamplePlugin = False, argsDebugOn = True, argsLogFile = Just "/workspaces/graphql-engine/haskell-vscode-logs.txt", argsThreads = 0, argsProjectGhcVersion = False}
        with plugins: [PluginId "pragmas",PluginId "floskell",PluginId "fourmolu",PluginId "tactics",PluginId "ormolu",PluginId "stylish-haskell",PluginId "retrie",PluginId "brittany",PluginId "callHierarchy",PluginId "class",PluginId "haddockComments",PluginId "eval",PluginId "importLens",PluginId "refineImports",PluginId "moduleName",PluginId "hlint",PluginId "splice",PluginId "ghcide-hover-and-symbols",PluginId "ghcide-code-actions-imports-exports",PluginId "ghcide-code-actions-type-signatures",PluginId "ghcide-code-actions-bindings",PluginId "ghcide-code-actions-fill-holes",PluginId "ghcide-completions",PluginId "ghcide-type-lenses",PluginId "ghcide-core"]
        in directory: /workspaces/graphql-engine
       Starting LSP server...
      If you are seeing this in a terminal, you probably should have run WITHOUT the --lsp option!
      Started LSP server in 0.00s
      setInitialDynFlags cradle: Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}
      setInitialDynFlags: Can't parse "/usr/lib/ghc/platformConstants"
      Output from setting up the cradle Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}
[client][INFO] Searching for server executables haskell-language-server-wrapper,haskell-language-server in $PATH
[client][INFO] Downloading haskell-language-server
[client][INFO] Fetching the latest release from GitHub or from cache
[client][INFO] The latest release is 1.4.0
[client][INFO] Figure out the ghc version to use or advertise an installation link for missing components
[client][INFO] Working out the project GHC version. This might take a while...
[client][INFO] Executing '/home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version' in cwd '/workspaces/graphql-engine' to get the project or file ghc version
[client][INFO] Execution of '/home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version' terminated with code 0
[client][INFO] The GHC version for the project or file: 8.6.5
[client][INFO] Search for binary haskell-language-server-Linux-8.6.5 in release assests
[client][INFO] Downloading haskell-language-server 1.4.0 for GHC 8.6.5
[client][INFO] Activating the language server in the workspace folder: /workspaces/graphql-engine
[client][INFO] run command: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5 --lsp -d -l ~/haskell-vscode-logs
[client][INFO] debug command: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5 --lsp -d -l ~/haskell-vscode-logs
[client][INFO] document selector patten: /workspaces/graphql-engine/**/*
[client][INFO] Starting language server
haskell-language-server version: 1.4.0.0 (GHC: 8.6.5) (PATH: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Couldnt open log file ~/haskell-vscode-logs; falling back to stderr loggingStarting (haskell-language-server)LSP server...
  with arguments: GhcideArguments {argsCommand = LSP, argsCwd = Nothing, argsShakeProfiling = Nothing, argsTesting = False, argsExamplePlugin = False, argsDebugOn = True, argsLogFile = Just "~/haskell-vscode-logs", argsThreads = 0, argsProjectGhcVersion = False}
  with plugins: [PluginId "pragmas",PluginId "floskell",PluginId "fourmolu",PluginId "tactics",PluginId "ormolu",PluginId "stylish-haskell",PluginId "retrie",PluginId "brittany",PluginId "callHierarchy",PluginId "class",PluginId "haddockComments",PluginId "eval",PluginId "importLens",PluginId "refineImports",PluginId "moduleName",PluginId "hlint",PluginId "splice",PluginId "ghcide-hover-and-symbols",PluginId "ghcide-code-actions-imports-exports",PluginId "ghcide-code-actions-type-signatures",PluginId "ghcide-code-actions-bindings",PluginId "ghcide-code-actions-fill-holes",PluginId "ghcide-completions",PluginId "ghcide-type-lenses",PluginId "ghcide-core"]
  in directory: /workspaces/graphql-engine
 Starting LSP server...
If you are seeing this in a terminal, you probably should have run WITHOUT the --lsp option!
Started LSP server in 0.00s
setInitialDynFlags cradle: Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}
@GavinRay97/workspaces/graphql-engine (master ✗) $ ls /bin | grep ghc
ghc
ghc-8.6.5
ghci
ghci-8.6.5
ghc-pkg
ghc-pkg-8.6.5
haddock-ghc-8.6.5
runghc
runghc-8.6.5
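A side note on the `Couldnt open log file ~/haskell-vscode-logs` line in the startup log above. One plausible explanation, which is an assumption and was not confirmed in this thread, is that `~` is normally expanded by the shell, so a process started without a shell receives the tilde literally. A small Python sketch of the difference (the helper name `resolve_log_path` is hypothetical):

```python
import os


def resolve_log_path(raw: str) -> str:
    """Expand a leading "~" and return an absolute path, as a shell would."""
    return os.path.abspath(os.path.expanduser(raw))


raw = "~/haskell-vscode-logs"

# Passed through unchanged, "~" is just a directory name relative to the
# current working directory, which usually does not exist.
print("literal parent dir exists:", os.path.isdir(os.path.dirname(raw)))

# After expansion the path points into the user's home directory.
print("expanded path:", resolve_log_path(raw))
```

If that is what happened here, an absolute log path such as the one in the settings.json above sidesteps the question.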

Here is the output of both HLS with --debug, and the LSP startup.
(Important bit here seems to be this as the last line, but I'm not sure)

Output to stdout from running haskell-language-server-wrapper --debug --logfile <file> .:

@GavinRay97/workspaces/graphql-engine (master ✗) $ haskell-language-server-wrapper --debug --logfile ./haskell-language-server-logfile-debug-enabled.txt .
No 'hie.yaml' found. Try to discover the project type!
Run entered for haskell-language-server-wrapper(haskell-language-server-wrapper) Version 1.4.0.0, Git revision 253547816ee216c53ee7dacc0ad3cac43e863d30 (dirty) x86_64 ghc-8.10.4
Current directory: /workspaces/graphql-engine
Operating system: linux
Arguments: ["--debug","--logfile","./haskell-language-server-logfile-debug-enabled.txt","."]
Cradle directory: /workspaces/graphql-engine
Cradle type: Cabal

Tool versions found on the $PATH
cabal:          3.6.2.0
stack:          Not found
ghc:            8.10.2


Consulting the cradle to get project GHC version...
Project GHC version: 8.10.2
haskell-language-server exe candidates: ["haskell-language-server-8.10.2","haskell-language-server"]
Launching haskell-language-server exe at:/home/codespace/.ghcup/bin/haskell-language-server-8.10.2
haskell-language-server version: 1.4.0.0 (GHC: 8.10.2) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2~1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
 ghcide setup tester in /workspaces/graphql-engine.
Report bugs at https://github.com/haskell/haskell-language-server/issues

Step 1/4: Finding files to test in /workspaces/graphql-engine
Found 391 files

Step 2/4: Looking for hie.yaml files that control setup
Found 1 cradle
  ()

Step 3/4: Initializing the IDE

Step 4/4: Type checking the files
Output from setting up the cradle Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Cabal}

<THE ABOVE REPEATED>

COMMON symbol, size 96 name batch_point_buffer allocated at 0x419ea000
haskell-language-server-wrapper: callProcess: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2 "--debug" "--logfile" "./haskell-language-server-logfile-debug-enabled.txt" "." (exit -11): failed
COMMON symbol, size 96 name batch_point_buffer allocated at 0x419ea000
haskell-language-server-wrapper: callProcess: /home/codespace/.ghcup/bin/haskell-language-server-8.10.2 "--debug" "--logfile" "./haskell-language-server-logfile-debug-enabled.txt" "." (exit -11): failed
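For readers unfamiliar with the `(exit -11)` in the wrapper output above: on POSIX systems, Haskell's process library reports a child that was killed by a signal as a negative exit code, and signal 11 is SIGSEGV. That matches the `Segmentation fault (core dumped)` message shown later in this comment. Python's subprocess module uses the same convention, so a small sketch can decode it (the helper `describe_exit` is a made-up name):

```python
import signal


def describe_exit(returncode: int) -> str:
    """Turn a subprocess-style return code into a readable description."""
    if returncode < 0:
        # Negative values mean "terminated by signal number -returncode".
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


print(describe_exit(-11))
print(describe_exit(0))
```

So the wrapper itself is not failing; it is reporting that the child haskell-language-server process died from a segmentation fault.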

Also I get this, which I think again might be related to something about the VSC extension giving preference to /bin instead of the project GHC version? 🤔

image


image

@GavinRay97/workspaces/graphql-engine (master ✗) $ cabal --version
cabal-install version 3.6.2.0
compiled using version 3.6.2.0 of the Cabal library 

@GavinRay97/workspaces/graphql-engine (master ✗) $ file /home/codespace/.ghcup/bin/cabal
/home/codespace/.ghcup/bin/cabal: symbolic link to cabal-3.6.2.0

@GavinRay97/workspaces/graphql-engine (master ✗) $ file /home/codespace/.ghcup/bin/cabal-3.6.2.0
/home/codespace/.ghcup/bin/cabal-3.6.2.0: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

@GavinRay97/workspaces/graphql-engine (master ✗) $ whoami
codespace

@GavinRay97/workspaces/graphql-engine (master ✗) $ stat /home/codespace/.ghcup/bin/cabal-3.6.2.0
  File: /home/codespace/.ghcup/bin/cabal-3.6.2.0
  Size: 31840440        Blocks: 62192      IO Block: 4096   regular file
Device: 32h/50d Inode: 1310957     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: ( 1000/codespace)   Gid: ( 1000/codespace)
Access: 2021-10-30 15:29:55.000000000 +0000
Modify: 2021-10-30 15:29:55.000000000 +0000
Change: 2021-10-30 15:30:43.062521383 +0000
 Birth: -

setInitialDynFlags cradle: Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}

Couldnt load cradle for libdir: (CradleError {cradleErrorDependencies = [], cradleErrorExitCode = ExitSuccess, cradleErrorStderr = ["Couldn't execute ghc --print-libdir"]},"/workspaces/graphql-engine",Nothing,Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default})
@GavinRay97/workspaces/graphql-engine (master ✗) $ ghc --print-libdir
/home/codespace/.ghcup/ghc/8.10.2/lib/ghc-8.10.2

Example segfault when running haskell-language-server-wrapper:

File:     /workspaces/graphql-engine/server/src-lib/Hasura/Backends/Postgres/Connection.hs
Hidden:   no
Range:    382:26-382:54
Source:   hlint
Severity: DsInfo
Message:  Redundant bracket
          Found: (object ["from_env" .= var])
          Why not: object ["from_env" .= var]
File:     /workspaces/graphql-engine/server/src-lib/Hasura/Backends/Postgres/Connection.hs
Hidden:   no
Range:    408:61-408:85
Source:   hlint
Severity: DsInfo
Message:  Redundant bracket
          Found: f <$> (pgcSslPassword pgCerts)
          Why not: f <$> pgcSslPassword pgCerts
Segmentation fault (core dumped)

@jneira
Member

jneira commented Nov 3, 2021

Thanks for the detailed bug report. To help myself understand the issue, there are two problems:

  • The vscode extension is not picking the correct hls version automatically
  • Once you get it to pick the correct one, installed by ghcup and suited to ghc-8.10.2, it crashes randomly

About the first one: it seems to me that the environment for the VS Code extension and the environment in the shell are not the same. Maybe that is because they use different profile files to set the PATH. The CLI usually uses .bashrc, and the graphical environment where VS Code is launched may be using /etc/profile. So I would double-check that, and source .bashrc in /etc/profile if that is the problem. The extension just runs hls-wrapper --project-ghc-version, as you did in the CLI, but the execution within the extension seems to return the default system GHC, without taking ghcup into account. That leads me to think ghcup is not in the PATH for the VS Code GUI. The fact that cabal is not being found points the same way.

The problem seems to be the one reported here: #236

I am gonna add debug statements about the env vars, especially the PATH, to the VS Code extension, to help trace these kinds of issues.
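To illustrate the PATH theory above, here is a minimal, self-contained Python sketch. It builds two throwaway directories standing in for /bin and ~/.ghcup/bin (with empty fake `ghc` and `cabal` scripts, all hypothetical) and shows that the same lookup gives different answers depending on the PATH a process inherits. It does not reproduce what the extension does internally, and it assumes a POSIX system.

```python
import os
import shutil
import stat
import tempfile


def make_fake_tool(directory: str, name: str) -> str:
    """Create an empty executable script so that a PATH lookup can find it."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(tool: str, path_var: str):
    """Return the full path `tool` resolves to under the given PATH, or None."""
    return shutil.which(tool, path=path_var)


with tempfile.TemporaryDirectory() as root:
    system_bin = os.path.join(root, "bin")          # stand-in for /bin
    ghcup_bin = os.path.join(root, "ghcup", "bin")  # stand-in for ~/.ghcup/bin
    os.makedirs(system_bin)
    os.makedirs(ghcup_bin)
    make_fake_tool(system_bin, "ghc")    # the "system" GHC
    make_fake_tool(ghcup_bin, "ghc")     # the "project" GHC
    make_fake_tool(ghcup_bin, "cabal")   # cabal only exists under ghcup

    # PATH as an interactive shell might see it: ghcup first, then the system.
    shell_path = os.pathsep.join([ghcup_bin, system_bin])
    # PATH as a GUI-launched process might see it: no ghcup entry at all.
    editor_path = system_bin

    shell_ghc = resolve("ghc", shell_path)
    editor_ghc = resolve("ghc", editor_path)
    editor_cabal = resolve("cabal", editor_path)

print("ghc from the shell PATH:   ", shell_ghc)
print("ghc from the editor PATH:  ", editor_ghc)
print("cabal from the editor PATH:", editor_cabal)
```

With the shorter PATH, `ghc` resolves to the system copy and `cabal` is not found at all, which is the same pair of symptoms reported in this issue.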

About the second one: it is unfortunate that HLS crashes with no further info, and that is something we have to fix. But I would try disabling all plugins, especially hlint, as I see a lot of warnings emitted by that plugin. Other problematic plugins could be eval and tactics. Then, if it works without any plugin enabled, I would re-enable them one by one until you find the offending one.

The full config to disable all plugins is here: #2151 (comment)
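The "disable everything, then re-enable one by one" procedure can be sketched as a small search loop. This is a hedged illustration in Python: `crashes` is a hypothetical callback standing in for a manual test session (start HLS with the given plugins enabled and watch for a segfault), and the lambda used below only simulates a culprit so the sketch can run. It does not claim that any particular plugin is at fault.

```python
from typing import Callable, List, Optional, Set

# A few plugin ids taken from the startup log in this issue.
PLUGINS: List[str] = ["pragmas", "tactics", "retrie", "eval", "hlint", "splice"]


def find_offending_plugin(
    plugins: List[str],
    crashes: Callable[[Set[str]], bool],
) -> Optional[str]:
    """Enable plugins one at a time; return the first one that triggers a crash.

    Returns None if no plugin triggers a crash. Raises if the server already
    crashes with every plugin disabled, since plugins are then not the cause.
    """
    enabled: Set[str] = set()
    if crashes(enabled):
        raise RuntimeError("crashes with all plugins disabled")
    for plugin in plugins:
        enabled.add(plugin)
        if crashes(enabled):
            return plugin
    return None


# Simulated check: pretend (for illustration only) that "eval" is the culprit.
simulated_crash = lambda enabled: "eval" in enabled
print(find_offending_plugin(PLUGINS, simulated_crash))
```

Because the segfaults in this report look random, each real check would need to run long enough to be reasonably sure that a quiet session is not just luck.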

@jneira
Member

jneira commented Nov 3, 2021

Approximately 30% of our Haskell devs are unable to get HLS working

I am curious about that: are the problems of those devs, who (I suppose) are not using Docker, related to the problems you are getting with Docker? Do they get random crashes as well? That would be a sign that the project itself has some characteristic that triggers the bug.

@jneira
Member

jneira commented Nov 3, 2021

Also, the use of Template Haskell is usually the cause of segfaults. Have you identified whether HLS crashes when loading modules that use it (or modules that depend on them)?

@GavinRay97
Author

GavinRay97 commented Nov 7, 2021

Apologies for the delayed response @jneira!

Thanks for the reply -- and you're absolutely right about it being two separate issues.
The VS Code ENV/PATH thing makes a lot of sense, since this doesn't happen when running the wrapper binary directly.

But I would try disabling all plugins, especially hlint, as I see a lot of warnings emitted by that plugin. Other problematic plugins could be eval and tactics. Then, if it works without any plugin enabled, I would re-enable them one by one until you find the offending one.

The full config to disable all plugins is here: #2151 (comment)

Got it -- I should have thought to try with plugins disabled (I noticed quite a number of them are enabled when the log starts) so that's a good idea. Can go through this systematically, disabling all, and seeing if your suspected extensions cause the crash after some time

Also, the use of Template Haskell is usually the cause of segfaults. Have you identified whether HLS crashes when loading modules that use it (or modules that depend on them)?

Somewhat embarrassingly, I do not know enough about Haskell to be able to give you a great answer to this.
I know more about setting up Haskell build tooling and dev environments than I do the language! 😅

Am pretty sure we DO use Template Haskell, have heard it mentioned before. I could tag some of my colleagues here as well if it would be helpful.

Quick search reveals (at least these):

-- | This module defines all basic Template Haskell functions we use in the rest
-- of this folder, to generate code that deals with all possible known
-- backends.
--
-- Those are all "normal" Haskell functions in the @Q@ monad: they deal with
-- values that represent Haskell code. Those functions are used, in other
-- modules, within Template Haskell splices.
-- | A singleton-like GADT that associates a tag to each backend.
-- It is generated with Template Haskell for each 'Backend'. Its
-- declaration results in the following type:
--
--   data BackendTag (b :: BackendType) where
--     PostgresVanillaTag :: BackendTag ('Postgres 'Vanilla)
--     PostgresCitusTag   :: BackendTag ('Postgres 'Citus)
--     MSSQLTag           :: BackendTag 'MSSQL
--     ...
$( let name = mkName "BackendTag"

And anecdotally, I believe HLS would segfault more often in areas related to our DB backend + SQL gen stuff. So that would line up with what you're saying.

I am curious about that: are the problems of those devs, who (I suppose) are not using Docker, related to the problems you are getting with Docker? Do they get random crashes as well? That would be a sign that the project itself has some characteristic that triggers the bug.

Yes, none of them use Docker-based environments AFAIK. The majority are on Linux, with some on MacBooks.
The segfaults seem to be an issue primarily for the devs on Linux.

image

The distribution of the Haskell engineers OS-wise is something like:
(Ref: https://user-images.githubusercontent.com/26604994/139504577-28289dd5-d3c7-4fe6-bfd4-3789848c9408.png)

OS               Percentage
Ubuntu/Debian    33%
macOS            22%
Arch             16%
NixOS or Other   28%

What do you make of this line/what does this "mean"?

image

Not sure if it should impact anything, but we link/use a decent number of C libraries during the build.

Something like:

        libpq-dev libssl-dev postgresql-client-${postgres_ver}
        postgresql-client-common
        unixodbc-dev freetds-dev
        default-libmysqlclient-dev libpcre3-dev libkrb5-dev 

@jneira
Member

jneira commented Nov 7, 2021

What do you make of this line/what does this "mean"?

It is referring to a workaround for Template Haskell problems, which consists of building a haskell-language-server binary from source instead of using a prebuilt binary. The build should use the -dynamic option for ghc, the Haskell compiler. There are several ways to do it, but you can consult this one: #1431 (comment)
The problem is not directly related to the C libraries used.

Am pretty sure we DO use Template Haskell, have heard it mentioned before. I could tag some of my colleagues here as well if it would be helpful.

Template Haskell is a way to add "macros" to the language, to write code that generates code at compile time. I would bet that is the direct cause of the segfaults in your environment. So using a custom HLS executable built with -dynamic might help.

@o1lo01ol1o

o1lo01ol1o commented Nov 7, 2021 via email

@GavinRay97
Author

GavinRay97 commented Nov 7, 2021

It is referring to a workaround for Template Haskell problems, which consists of building a haskell-language-server binary from source instead of using a prebuilt binary. The build should use the -dynamic option for ghc, the Haskell compiler. There are several ways to do it, but you can consult this one: #1431 (comment)

Ahh okay, understood -- thank you! I will build a Linux AMD64 binary with that flag following the comments in the issue and see if that makes a difference, in addition to systematically working through enabled plugins.

Template Haskell is a way to add "macros" to the language, to write code that generates code at compile time. I would bet that is the direct cause of the segfaults in your environment. So using a custom HLS executable built with -dynamic might help.

Brilliant! Well, that's a great lead to follow.
Any ideas from an implementor's point of view why HLS might be struggling with it -- or is that the Ten Million Dollar question we're all asking? 😅


@o1lo01ol1o Would seem like this -dynamic thing is certainly worth a shot then.

(And, fwiw, simply using the latest HLS from nixpkgs-2105 on my Darwin machine fixed my template Haskell crashes. I don’t know what your infra is like, and doubt the suggestion of “use nix” is super helpful, but if there’s a project amenable to Haskell.nix or similar, it might help differential diagnostics. And if it works, I believe you can also build a docker container from that derivation fairly painlessly.)

I don't know much ABOUT Nix, but I am a fan in theory of Nix/Guix (Guix seems easier syntactically, Nix lang is a bit hard to follow IMO)

But a quick google leads to this:

And it turns out a colleague has also written this, which I found during the same google:

I think many folks on our team already use Nix, so it may be something worth investigating

@jneira
Member

jneira commented Nov 26, 2021

Could we say this is related to Template Haskell as well?

@jneira
Member

jneira commented Jan 31, 2022

I am gonna close this issue, as all compiler crashes seem to have the same root cause:

If any of you think the issue should not be included generically feel free to reopen it (with a brief explanation if possible)
Thanks all!

@jneira closed this as completed on Jan 31, 2022