The case of the mysterious segfault loop #2314
Is HLS running inside the container or locally? |
Hey, thanks for the response =D HLS is running from inside the container -- everything has been kept in-container to be reproducible. The hope being once it works once, then it works forever 🤞 And everyone wanting a working setup can use that image or Devcontainer or Codespace. |
If the environment is reproducible, then I have no idea why HLS would segfault for only ~30% of the users. |
Ahh -- maybe some miscommunication on my end there, sorry. Traditionally, everyone has set up the project locally. We don't have very extensive docs on how to do this. There are a lot of implicit So the onboarding/setup process for contributors and devs can be somewhat painful. The Dev container I've posted here is an attempt to help that -- nobody is using it yet, though there is interest around it. I am unable to get HLS working inside of the Dev container. And I figured that starting from a reproducible container environment would make it much easier to talk about/debug this, since everyone can be on the same page 🙂 If it's possible to get HLS working in this/a container setup, then everyone will have 100% HLS success rate that wishes to use it 🎉 🥳 |
Also feel free to tell me |
I don't have time to go through the contributing notes. Can you explain what is preventing HLS from working in the Dev container? Have you tried |
No worries, that was more just an attempt to point out the motivation behind the container
I have
@GavinRay97 ➜ /workspaces/graphql-engine (master) $ ls ~/.ghcup/bin
cabal ghci-8.10.2 haddock-8.10.2 haskell-language-server-8.10.5 haskell-language-server-8.6.4~1.4.0 haskell-language-server-9.0.1 hpc runghc-8.10
cabal-3.6.2.0 ghc-pkg haskell-language-server-8.10.2 haskell-language-server-8.10.5~1.4.0 haskell-language-server-8.6.5 haskell-language-server-9.0.1~1.4.0 hpc-8.10 runghc-8.10.2
ghc ghc-pkg-8.10 haskell-language-server-8.10.2~1.4.0 haskell-language-server-8.10.6 haskell-language-server-8.6.5~1.4.0 haskell-language-server-wrapper hpc-8.10.2 runhaskell
ghc-8.10 ghc-pkg-8.10.2 haskell-language-server-8.10.3 haskell-language-server-8.10.6~1.4.0 haskell-language-server-8.8.3 haskell-language-server-wrapper-1.4.0 hsc2hs runhaskell-8.10
ghc-8.10.2 ghcup haskell-language-server-8.10.3~1.4.0 haskell-language-server-8.10.7 haskell-language-server-8.8.3~1.4.0 hp2ps hsc2hs-8.10 runhaskell-8.10.2
ghci haddock haskell-language-server-8.10.4 haskell-language-server-8.10.7~1.4.0 haskell-language-server-8.8.4 hp2ps-8.10 hsc2hs-8.10.2
ghci-8.10 haddock-8.10 haskell-language-server-8.10.4~1.4.0 haskell-language-server-8.6.4 haskell-language-server-8.8.4~1.4.0 hp2ps-8.10.2 runghc
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ~/.ghcup/bin/haskell-language-server-wrapper --probe-tools
haskell-language-server version: 1.4.0.0 (GHC: 8.10.4) (PATH: /home/codespace/.ghcup/bin/haskell-language-server-wrapper-1.4.0) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Tool versions found on the $PATH
cabal: 3.6.2.0
stack: Not found
ghc: 8.10.2
Sure:
I have run with When it segfaults, no errors/warnings are printed beforehand. It just terminates. Here are comments from teammates, one mentions something about building it from source with some flags fixing it for him: |
Okay, I have collected a lot of logs, and also noticed some behavior:
// .vscode/settings.json
{
  "haskell.logFile": "/workspaces/graphql-engine/haskell-vscode-logs.txt",
  "haskell.trace.client": "debug",
  "haskell.trace.server": "messages",
  "haskell.serverExecutablePath": "~/.ghcup/bin/haskell-language-server-8.10.2"
}
Here is relevant output from first startup of VS Code HLS in the container:
[client][INFO] Searching for server executables haskell-language-server-wrapper,haskell-language-server in $PATH
[client][INFO] Downloading haskell-language-server
[client][INFO] Fetching the latest release from GitHub or from cache
[client][INFO] The latest release is 1.4.0
[client][INFO] Figure out the ghc version to use or advertise an installation link for missing components
[client][INFO] Working out the project GHC version. This might take a while...
[client][INFO] Executing '/home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version' in cwd '/workspaces/graphql-engine' to get the project or file ghc version
[client][INFO] Execution of '/home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-wrapper-1.4.0-linux --project-ghc-version' terminated with code 0
[client][INFO] The GHC version for the project or file: 8.6.5
[client][INFO] Search for binary haskell-language-server-Linux-8.6.5 in release assests
[client][INFO] Downloading haskell-language-server 1.4.0 for GHC 8.6.5
[client][INFO] Activating the language server in the workspace folder: /workspaces/graphql-engine
[client][INFO] run command: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5 --lsp -d -l ~/haskell-vscode-logs
[client][INFO] debug command: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5 --lsp -d -l ~/haskell-vscode-logs
[client][INFO] document selector patten: /workspaces/graphql-engine/**/*
[client][INFO] Starting language server
haskell-language-server version: 1.4.0.0 (GHC: 8.6.5) (PATH: /home/codespace/.vscode-remote/data/User/globalStorage/haskell.haskell/haskell-language-server-1.4.0-linux-8.6.5) (GIT hash: 253547816ee216c53ee7dacc0ad3cac43e863d30)
Couldnt open log file ~/haskell-vscode-logs; falling back to stderr loggingStarting (haskell-language-server)LSP server...
with arguments: GhcideArguments {argsCommand = LSP, argsCwd = Nothing, argsShakeProfiling = Nothing, argsTesting = False, argsExamplePlugin = False, argsDebugOn = True, argsLogFile = Just "~/haskell-vscode-logs", argsThreads = 0, argsProjectGhcVersion = False}
with plugins: [PluginId "pragmas",PluginId "floskell",PluginId "fourmolu",PluginId "tactics",PluginId "ormolu",PluginId "stylish-haskell",PluginId "retrie",PluginId "brittany",PluginId "callHierarchy",PluginId "class",PluginId "haddockComments",PluginId "eval",PluginId "importLens",PluginId "refineImports",PluginId "moduleName",PluginId "hlint",PluginId "splice",PluginId "ghcide-hover-and-symbols",PluginId "ghcide-code-actions-imports-exports",PluginId "ghcide-code-actions-type-signatures",PluginId "ghcide-code-actions-bindings",PluginId "ghcide-code-actions-fill-holes",PluginId "ghcide-completions",PluginId "ghcide-type-lenses",PluginId "ghcide-core"]
in directory: /workspaces/graphql-engine
Starting LSP server...
If you are seeing this in a terminal, you probably should have run WITHOUT the --lsp option!
Started LSP server in 0.00s
setInitialDynFlags cradle: Cradle {cradleRootDir = "/workspaces/graphql-engine", cradleOptsProg = CradleAction: Default}
@GavinRay97 ➜ /workspaces/graphql-engine (master ✗) $ ls /bin | grep ghc
ghc
ghc-8.6.5
ghci
ghci-8.6.5
ghc-pkg
ghc-pkg-8.6.5
haddock-ghc-8.6.5
runghc
runghc-8.6.5
Here is the output of both HLS with Output to
|
Thanks for the detailed bug report. To help myself understand the issue, there are two problems:
About the first one: it seems to me that the environment of the VS Code extension and the environment of the shell are not the same. Maybe they use different profile files to set the PATH. The CLI usually uses .bashrc, and the graphical environment where VS Code is launched may be using /etc/profile. So I would double-check that, and source .bashrc in /etc/profile if that is the problem. The extension just runs The problem seems to be the one reported here: #236 I am going to add debug statements about the env vars, especially the PATH, to the VS Code extension, to help trace this kind of issue.
About the second one: it is unfortunate that HLS crashes with no further info, and that is something we have to fix. But I would try disabling all plugins, especially hlint, as I see a lot of warnings emitted by that plugin. Other problematic plugins could be eval and tactics. Then, if it works without any plugin enabled, I would re-enable them one by one until you find the offending one. The full config to disable all plugins is here: #2151 (comment) |
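As a rough illustration of the PATH point, the sketch below resolves a tool name against two different search paths and prints what each one would pick up. The helper name resolve_in_path and the "minimal" PATH value are assumptions made for this example, not something the extension itself uses. The idea is only to show how a shell with ~/.ghcup/bin on its PATH and a process without it can end up with different ghc binaries (8.10.2 in the shell versus 8.6.5 in the extension log).

```shell
#!/usr/bin/env bash
# Print the first executable called NAME found in the colon-separated
# search path SEARCH_PATH, mimicking how a process resolves a bare name.
# Empty PATH components (which mean "current directory") are skipped.
resolve_in_path() {
  local name="$1" search_path="$2" dir
  local IFS=':'
  for dir in $search_path; do
    if [ -n "$dir" ] && [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}

# Hypothetical comparison: the PATH of this shell versus a minimal PATH
# such as a non-interactive launcher might hand to the extension.
shell_ghc="$(resolve_in_path ghc "$PATH" || echo 'not found')"
minimal_ghc="$(resolve_in_path ghc "/usr/local/bin:/usr/bin:/bin" || echo 'not found')"
echo "ghc via shell PATH:   $shell_ghc"
echo "ghc via minimal PATH: $minimal_ghc"
```

If the two lines differ, the extension and the terminal are likely not seeing the same toolchain, which would match the first problem described in this comment.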
I am curious about that: are the problems of those devs not using Docker (I suppose) related to the problems you are getting with Docker? Do they get random crashes as well? That would be a signal that the project itself has some characteristic which triggers the bug |
Also, the use of Template Haskell is usually the cause of segfaults. Have you identified whether HLS crashes when loading modules that use it (or modules that depend on them)? |
Apologies for the delayed response @jneira! Thanks for the reply -- and you're absolutely right about it being two separate issues.
Got it -- I should have thought to try with plugins disabled (I noticed quite a number of them are enabled when the log starts) so that's a good idea. Can go through this systematically, disabling all, and seeing if your suspected extensions cause the crash after some time
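One way to go through the plugins systematically is to generate the settings entries rather than typing them by hand. The sketch below assumes the per-plugin VS Code setting is named haskell.plugin.<id>.globalOn and that the ids match the ones in the "with plugins:" line of the server log; disable_plugins_json is a hypothetical helper name. The config linked in #2151 is still the reference for the full list.

```shell
#!/usr/bin/env bash
# Emit a JSON object with one '"haskell.plugin.<id>.globalOn": false'
# entry per plugin id given as an argument.
# Assumption: this setting naming applies to the installed extension version.
disable_plugins_json() {
  local first=1 id
  printf '{\n'
  for id in "$@"; do
    # Separate entries with a comma, but not before the first one.
    [ "$first" -eq 1 ] || printf ',\n'
    printf '  "haskell.plugin.%s.globalOn": false' "$id"
    first=0
  done
  printf '\n}\n'
}

# The three plugins named as suspects in this thread.
disable_plugins_json hlint eval tactics
```

The printed entries can be merged into .vscode/settings.json, and then removed one at a time to find the plugin that triggers the crash.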
Somewhat embarrassingly, I do not know enough about Haskell to be able to give you a great answer to this. Am pretty sure we DO use Template Haskell, have heard it mentioned before. I could tag some of my colleagues here as well if it would be helpful. Quick search reveals (at least these):
-- | This module defines all basic Template Haskell functions we use in the rest
-- of this folder, to generate code that deals with all possible known
-- backends.
--
-- Those are all "normal" Haskell functions in the @Q@ monad: they deal with
-- values that represent Haskell code. Those functions are used, in other
-- modules, within Template Haskell splices.

-- | A singleton-like GADT that associates a tag to each backend.
-- It is generated with Template Haskell for each 'Backend'. Its
-- declaration results in the following type:
--
-- data BackendTag (b :: BackendType) where
-- PostgresVanillaTag :: BackendTag ('Postgres 'Vanilla)
-- PostgresCitusTag :: BackendTag ('Postgres 'Citus)
-- MSSQLTag :: BackendTag 'MSSQL
-- ...
$( let name = mkName "BackendTag"
And anecdotally, I believe HLS would segfault more often in areas related to our DB backend + SQL gen stuff. So that would line up with what you're saying.
Yes, none of them use Docker-based environments AFAIK. The majority are on Linux, with some on Macbooks. The distribution of the Haskell engineers OS-wise is something like:
What do you make of this line/what does this "mean"?
Not sure if it should impact anything, but we link/use a decent number of C libraries during the build. Something like:
libpq-dev libssl-dev postgresql-client-${postgres_ver}
postgresql-client-common
unixodbc-dev freetds-dev
default-libmysqlclient-dev libpcre3-dev libkrb5-dev |
It refers to a workaround for Template Haskell problems, which consists of building the haskell-language-server binary from source instead of using a prebuilt binary. The build should use the option
Template Haskell is a way to add "macros" to the language, to write code that generates code at compile time. I would bet it is the direct cause of the segfaults in your environment. So use a custom HLS executable built with |
The workaround using -dynamic is at least used to address a problem with the Darwin linker and template haskell builds on Catalina. It may be used for other issues as well, but that has been my experience.
(And, fwiw, simply using the latest HLS from nixpkgs-2105 on my Darwin machine fixed my template Haskell crashes. I don’t know what your infra is like, and doubt the suggestion of “use nix” is super helpful, but if there’s a project amenable to Haskell.nix or similar, it might help differential diagnostics. And if it works, I believe you can also build a docker container from that derivation fairly painlessly.)
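For reference, a dynamically linked build could look roughly like the command printed below. This is a sketch under assumptions: it supposes the working directory is a checkout of the haskell-language-server repository, and that cabal's --enable-executable-dynamic flag is an acceptable way to get the -dynamic linking mentioned in this comment; the flags recommended in the linked issue may differ. hls_dynamic_build_cmd is a hypothetical helper that only prints the command and does not run it.

```shell
#!/usr/bin/env bash
# Compose (but do not execute) a cabal invocation that installs the HLS
# executable with dynamic linking for the given GHC version.
# Assumption: a compiler named ghc-<version> is available on the PATH.
hls_dynamic_build_cmd() {
  local ghc_version="$1"
  printf 'cabal install exe:haskell-language-server -w ghc-%s --enable-executable-dynamic\n' "$ghc_version"
}

# Dry run for the GHC version reported by --probe-tools in the shell.
hls_dynamic_build_cmd 8.10.2
```

The resulting binary would then be the one to point haskell.serverExecutablePath at, instead of a prebuilt download.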
…On Nov 7, 2021, at 8:28 PM, Gavin Ray ***@***.***> wrote:
Apologies for the delayed response @jneira!
Thanks for the reply -- and you're absolutely right about it being two separate issues.
The VS Code ENV/PATH thing makes a lot of sense, since this doesn't happen when running the wrapper binary directly.
But i would try to disable all plugins, specially hlint, as i see lot of warnings emitted by that plugin. Other problematic plugins could be eval and tactics. Then, if it works without any plugin enabled, i would enable them until you get the offending one.
The full config to disable all plugins is here: #2151 (comment)
Got it -- I should have thought to try with plugins disabled (I noticed quite a number of them are enabled when the log starts) so that's a good idea. Can go through this systematically, disabling all, and seeing if your suspected extensions cause the crash after some time
Also the use of template haskell usually is the cause of segfaults, have you identified if HLS crashes when loading modules using it (or with dependant modules using it)
Somewhat embarassingly, I do not know enough about Haskell to be able to give you a great answer to this.
I know more about setting up Haskell build tooling and dev environments than I do the language! 😅
Am pretty sure we DO use Template Haskell, have heard it mentioned before. I could tag some of my colleagues here as well if it would be helpful.
Quick search reveals (at least these):
https://github.com/hasura/graphql-engine/blob/11a454c2d69bb05c3471be0d04d2282cc93a557e/server/src-lib/Hasura/SQL/TH.hs#L1-L7
-- | This module defines all basic Template Haskell functions we use in the rest
-- of this folder, to generate code that deals with all possible known
-- backends.
--
-- Those are all "normal" Haskell functions in the @q@ monad: they deal with
-- values that represent Haskell code. Those functions are used, in other
-- modules, within Template Haskell splices.
https://github.com/hasura/graphql-engine/blob/11a454c2d69bb05c3471be0d04d2282cc93a557e/server/src-lib/Hasura/SQL/Tag.hs#L16-L25
-- | A singleton-like GADT that associates a tag to each backend.
-- It is generated with Template Haskell for each 'Backend'. Its
-- declaration results in the following type:
--
-- data BackendTag (b :: BackendType) where
-- PostgresVanillaTag :: BackendTag ('Postgres 'Vanilla)
-- PostgresCitusTag :: BackendTag ('Postgres 'Citus)
-- MSSQLTag :: BackendTag 'MSSQL
-- ...
$( let name = mkName "BackendTag"
And anecdotally, I believe HLS would segfault more often in areas related to our DB backend + SQL gen stuff. So that would line up with what you're saying.
I am curious about that: the problems of those devs without using docker (i suppose) are related with the problems you are getting using docker? do they get random crashes as well? It would be a signal the project itself could have some charateristic which triggers the bug
Yes, none of them use Docker-based environments AFAIK. The majority are on Linux, with some on Macbooks.
The segfaults seem to be an issue primarily for the devs on Linux.
The distribution of the Haskell engineers OS-wise is something like:
(Ref: https://user-images.githubusercontent.com/26604994/139504577-28289dd5-d3c7-4fe6-bfd4-3789848c9408.png)
OS               Percentage
Ubuntu/Debian    33%
MacOS            22%
Arch             16%
NixOS or Other   28%
What do you make of this line/what does this "mean"?
|
Ahh okay, understood -- thank you! I will build a Linux AMD64 binary with that flag following the comments in the issue and see if that makes a difference, in addition to systematically working through enabled plugins.
Brilliant! Well, that's a great lead to follow. @o1lo01ol1o Would seem like this
I don't know much ABOUT Nix, but I am a fan in theory of Nix/Guix (Guix seems easier syntactically, Nix lang is a bit hard to follow IMO) But a quick google leads to this: And it turns out a colleague has also written this, which I found during the same google:
I think many folks on our team already use Nix, so it may be something worth investigating |
could we state this would be related with template haskell as well? |
I am gonna close this issue as all compiler crashes seem to have the same root cause:
If any of you think the issue should not be included generically feel free to reopen it (with a brief explanation if possible) |
I do not know anything about/write Haskell, but I have been trying to make the tooling and experience better for contributors and folks on our team.
Part of our codebase is in Haskell. I wrote a development Dockerfile that reproducibly creates an environment with the needed deps for our Haskell app -- but I am unable to get HLS to function properly in it ☹️
Probably user error, but taking an informal poll shows:
What happens is that it builds, and then gets stuck in a segfault loop.
To make this easy to reproduce, I've containerized everything -- you should be able to open it in your browser or use a VS Code Devcontainer locally to get an identical environment to the one that is broken.
Your environment
Output of haskell-language-server --probe-tools or haskell-language-server-wrapper --probe-tools:
Which OS do you use:
Which lsp-client do you use:
Describe your project (alternative: link to the project):
Steps to reproduce
vim, emacs, etc, connected to remote Codespace ("Thin client")
Without VS Code or Codespaces at all
.devcontainer/Dockerfile /graphql-engine, and run the Docker image (either manually or with Compose)
In a browser
In VS Code locally/offline
In VS Code locally, but connected to a remote Codespace ("Thin Client")
Codespaces extension in VS Code
Codespaces, and either press + to create a new one or click to connect to an existing:
In a text editor like vim, emacs, etc, connected to remote Codespace ("Thin client"):
ssh command to connect to the codespace and then run it through your editor of choice
Expected behaviour
It doesn't segfault, or instead of segfaulting it prints helpful debug info before dying (I have tried turning verbose logging on, no dice 🙁)
Actual behaviour
It starts, segfaults at random (no pattern), and restarts itself, repeating the loop.
Include debug information
Execute in the root of your project the command haskell-language-server --debug . and paste the logs here:
Debug output:
Paste the logs from the lsp-client, e.g. for VS Code
LSP logs: