building amazonka-ec2 with stack using ~7GB #549

Closed
juhp opened this issue Oct 7, 2019 · 9 comments

@juhp

juhp commented Oct 7, 2019

We are trying to build amazonka-ec2-1.6.1 with ghc-8.8.1 on Linux for Stackage Nightly using stack-2.1.3 and ghc is using a lot of memory: about 6.8GB, which may be causing problems.
Not sure if anything can be done, but thought I would report it here.

@LeifW
Contributor

LeifW commented Oct 7, 2019

Stack builds packages in parallel by default I believe - try turning that off so only one package is built at a time?

@juhp
Author

juhp commented Oct 7, 2019

    [280 of 280] Compiling Network.AWS.EC2
    gcc: error: .stack-work/dist/x86_64-linux/Cabal-3.0.0.0/build/Network/AWS/EC2/Types/Product.dyn_o: No such file or directory
    `gcc' failed in phase `Linker'. (Exit code: 1)

@juhp
Author

juhp commented Oct 7, 2019

and another time:

--  While building package amazonka-ec2-1.6.1 using:
      /var/stackage/.stack/setup-exe-cache/x86_64-linux/Cabal-simple_mPHDZzAJ_3.0.0.0_ghc-8.8.1 --builddir=.stack-work/dist/x86_64-linux/Cabal-3.0.0.0 build lib:amazonka-ec2 test:amazonka-ec2-test --ghc-options ""
    Process exited with code: ExitFailure (-9) (THIS MAY INDICATE OUT OF MEMORY)
    Logs have been written to: /var/stackage/work/unpack-dir/.stack-work/logs/amazonka-ec2-1.6.1.log

    Preprocessing library for amazonka-ec2-1.6.1..
    Building library for amazonka-ec2-1.6.1..
    [  3 of 280] Compiling Network.AWS.EC2.Types.Product

juhp added a commit to commercialhaskell/stackage that referenced this issue Oct 7, 2019
@brendanhay
Owner

This is fairly well-trodden territory by now - there's little I can do in the short term to affect how much memory varying versions of GHC use when compiling the libraries. There is discussion around splitting the graph of generated types into modules based on their SCCs or some other boundary, to alleviate the fact that Product.hs is too large for GHC to compile with reasonable memory.

@bgamari

bgamari commented Jan 12, 2021

There was a GHC ticket raised about this recently, and we will try to improve things in the long term. However, given that the current situation seems barely tenable, there are a few things that you might consider doing in the short term as well:

  1. Disable call-arity analysis (-fno-call-arity): As noted in the ticket cited above, the Product (and to a lesser extent the Sum) modules tend to hit a pathological case in this analysis. Disabling the analysis slightly reduces residency and reduces compilation time in my measurements.
  2. Disable optimisation entirely in the generated modules (e.g. perhaps via {-# OPTIONS_GHC -O0 #-}). I would not expect any of this code to be a computational bottleneck in realistic applications; there is little sense in spending cycles optimising it.
  3. Break up larger modules, using hs-boot files to break cycles (avoiding the need for any SCC cleverness). Simply move each type (and its instances and lenses) into its own module, producing an hs-boot file for each with a simple data declaration. Ideally each module would then only need to SOURCE import the hs-boot files for the types that it depends upon (e.g. has fields of). However, if that is too hard to arrange, you could go one step simpler: define a single Types module which SOURCE imports and reexports each of the types. This can then be imported by each of the types' modules.
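
As a rough illustration of suggestions (1) and (2), a per-module pragma might look like the sketch below. The `InstanceSummary` record and its fields are hypothetical stand-ins for a generated type, not names taken from amazonka; only the flags themselves come from the discussion above.

```haskell
{-# OPTIONS_GHC -O0 #-}
-- Suggestion (2): -O0 turns off optimisation for this module only.
-- For suggestion (1), a module that keeps optimisation on could instead use
--   {-# OPTIONS_GHC -fno-call-arity #-}
-- to disable just the call-arity analysis.
module Main (main) where

-- Hypothetical stand-in for one of the large generated record types.
data InstanceSummary = InstanceSummary
  { instanceId :: String
  , coreCount  :: Int
  } deriving (Eq, Show, Read)

main :: IO ()
main = print (InstanceSummary "i-0123" 2)
```

Because the pragma lives in the generated source file, it applies regardless of which build tool or project-level flags the user has chosen.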

I will say that even with -O0 this module is disappointingly slow. I do hope that we can improve this in 9.2. This package is clearly a great stress test for GHC. Nevertheless, I do think that it could be made significantly less painful with a bit of restructuring (suggestion (3) above); compiling a single 10kLoC module with hundreds of derived instances is simply asking for trouble. If that is too much work I would probably instead try suggestion (2), as this code really doesn't seem performance-critical.

@rossabaker
Collaborator

The large Types modules are split as of #591, which substantially improved our compile time. I have not analyzed memory improvements.

@bgamari

bgamari commented Jan 13, 2021

> The large Types modules are split as of #591, which substantially improved our compile time. I have not analyzed memory improvements.

Right. I have been building amazonka master in my GHC measurements. However, even Product and Sum are very large given how many non-trivial instances they derive. One module per type may seem excessive, but I suspect it will work quite well, especially given that this would provide plenty of module-level parallelism for GHC's parallel up-sweep to exploit.
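
A minimal sketch of the one-module-per-type layout with an hs-boot file breaking a cycle might look like the following. It is three separate files shown in one listing, and the module names (`Types.Instance`, `Types.Volume`) and fields are hypothetical, chosen only to show two mutually referencing types; instances and lenses are left out for brevity.

```haskell
-- file: Types/Instance.hs-boot
-- Abstract declaration only: enough for other modules to mention the type.
module Types.Instance where

data Instance

-- file: Types/Volume.hs
-- Volume refers to Instance, so it imports the hs-boot interface via SOURCE.
module Types.Volume where

import {-# SOURCE #-} Types.Instance (Instance)

data Volume = Volume
  { volumeId   :: String
  , attachedTo :: Maybe Instance
  }

-- file: Types/Instance.hs
-- The real definition; it can import Types.Volume normally because the
-- cycle has already been broken by the SOURCE import above.
module Types.Instance where

import Types.Volume (Volume)

data Instance = Instance
  { instanceId :: String
  , volumes    :: [Volume]
  }
```

Only one direction of each cycle needs a SOURCE import, so a generator could presumably emit an hs-boot file for every type and use SOURCE imports uniformly, rather than computing which edges actually form cycles.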

@endgame
Collaborator

endgame commented Sep 27, 2021

@bgamari Is this still bad, now that the pattern synonym stuff is in develop?

@brendanhay
Owner

The pattern synonyms change itself has little to no impact on compilation performance - it's the quadratic Core code size, detailed in Edsko's "Avoiding quadratic core code size with large records", that we've always been subject to.

We are already generating one-type-per-module on main, which was introduced in #591 - this should be sufficient to ameliorate this particular issue. Further work/improvements to be followed up via #717
