Design a 'Hooks' build type to replace 'Custom' #9292

Closed
mpickering opened this issue Sep 29, 2023 · 81 comments

Comments

@mpickering
Collaborator

mpickering commented Sep 29, 2023

This ticket exists to track the design of a new 'Hooks' build type. The 'Hooks' build type is the successor to the 'Custom' build type.

The Custom build type allows the build process to be completely modified, for example, a user can provide whatever buildHook they want, and so higher-level tools such as cabal-install and stack have to treat packages with Custom build type as black boxes.

In practice, no one uses the full power of Custom to completely replace a phase: custom Setup.hs scripts augment rather than replace the build phases. The hooks interface we are designing will contain pre/post hooks for specific phases which support this augmentation in a specific way.
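To give a rough idea of the shape we have in mind, here is a hypothetical sketch; the field names and types below are placeholders for illustration only, not the actual interface under design:

```haskell
-- Hypothetical sketch only: the real SetupHooks interface is still being
-- designed, so these names and types are placeholders, not the actual API.
module SetupHooksSketch where

-- Stand-ins for whatever configuration and build information the final
-- design passes to each hook.
data ConfigInputs  = ConfigInputs
data ConfigOutputs = ConfigOutputs
data BuildInputs   = BuildInputs

data SetupHooks = SetupHooks
  { preConfHook   :: Maybe (ConfigInputs -> IO ConfigOutputs)
    -- ^ run before configuring, e.g. to detect system libraries
  , postConfHook  :: Maybe (ConfigOutputs -> IO ())
    -- ^ run after configuring, e.g. to record extra metadata
  , preBuildHook  :: Maybe (BuildInputs -> IO ())
    -- ^ run before building, e.g. to generate source files
  , postBuildHook :: Maybe (BuildInputs -> IO ())
    -- ^ run after building, e.g. to post-process artefacts
  }

-- A package that needs no custom logic would use the empty record.
noSetupHooks :: SetupHooks
noSetupHooks = SetupHooks Nothing Nothing Nothing Nothing
```

The point of the record-of-optional-hooks shape is that the build tool, not the package, stays in control of the overall phases; a package can only augment them.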

The Hooks build type will be designed to subsume all known uses of custom Setup.hs scripts.

Subsequently we hope to also fix a number of open tickets relating to the UserHooks interface by considering them in our design of the new SetupHooks interface.

@andreabedini
Collaborator

Is this a proposal, or is there some work in progress?

I wonder if this is worth doing at all. From my POV custom setups should just disappear and be forgotten. If there are use-cases of custom setups that Cabal cannot yet express, we should rather work in the direction of supporting those use-cases.

Note that adding a new build-type does not simplify building any package that currently uses a custom setup; and packages using a custom setup would be better off moving to a simple setup (or any other declarative setup, if there were any other).

My 2c.

@angerman
Collaborator

angerman commented Oct 2, 2023

How will build-type: Hook work in the cross compilation setting?

@mpickering
Collaborator Author

Is this a proposal, or is there some work in progress?

I wonder if this is worth doing at all. From my POV custom setups should just disappear and be forgotten. If there are use-cases of custom setups that Cabal cannot yet express, we should rather work in the direction of supporting those use-cases.

Note that adding a new build-type does not simplify building any package that currently uses a custom setup; and packages using a custom setup would be better off moving to a simple setup (or any other declarative setup, if there were any other).

My 2c.

There is work in progress, we are writing a design document which can be shared for review once it is finished.

Part of the work is to identify if there are some common features of current Setup.hs which can or should be made into declarative features.

As long as the declarative feature involves executing some arbitrary Haskell executable, I don't see that it has a significant benefit over a Setup.hs script.

@mpickering
Collaborator Author

How will build-type: Hook work in the cross compilation setting?

In the same way as build-type: Custom is broken with cross-compilers, so is build-type: Hooks. This is orthogonal to this work, as cabal-install does not understand cross-compilation at all.

@angerman
Collaborator

angerman commented Oct 2, 2023

Build-type custom is broken in cabal-install. Not strictly with Setup.hs. So we can assume that build-type: Hook will work with setup.hs as well?

@hasufell
Member

hasufell commented Oct 2, 2023

This seems large and significant enough to create a HF tech proposal to get input from a wider range of experts?

I know it's hard to source community opinions and I'd rather not ask on discourse. Hence maybe involving HF is worthwhile?

@mpickering
Collaborator Author

Build-type custom is broken in cabal-install. Not strictly with Setup.hs. So we can assume that build-type: Hook will work with setup.hs as well?

How is Custom broken in cabal-install? Ticket?

@angerman
Collaborator

angerman commented Oct 2, 2023

Custom is broken in cabal-install for cross compilation because cabal-install doesn't know about multiple compilers. For cross-compilation pipelines that don't use cabal-install, Custom does work (because it's effectively just Setup.hs).

@mpickering
Collaborator Author

mpickering commented Oct 2, 2023

For cross-compilation pipelines that don't use cabal-install, Custom does work (because it's effectively just Setup.hs).

What does that mean? If you are not using cabal-install can't you compile the ./Setup.hs with the compiler which targets the host and then execute ./Setup configure, ./Setup build?

@angerman
Collaborator

angerman commented Oct 2, 2023

You can use the bootstrap compiler for your cross compiler. And most of the time you want to build your cross compiler from the same source as your bootstrap compiler. So both are the same version. This often gives a good enough approximation.

Some others use the cross compiler to build the Setup.hs and evaluate it in a cross context (qemu, wine, ...).

The second approach could potentially work for cabal-install, if we had a setup-wrapper; depending on the target, it might not be the preferred approach though.

I am not saying custom setup.hs blackboxes are great. I'd much prefer we didn't have them. I am pointing out that we have practical approaches to deal with them in the cross compilation setting.

Hence my question if this new build-type will make the current situation better or worse.

@mpickering
Collaborator Author

Are you saying that there are two options:

  • Build the Setup.hs with a host compiler, but that has to be the same version as the cross compiler (why the same version?)
  • Build the Setup.hs with a cross-compiler and then run the executable in an emulator.

It seems that you are suggesting you have to run the Setup.hs script in the cross context; I don't understand why you have to do that.

As imagined, the new build-type won't make things better or worse, just the same as before (for cross-compilation).

@angerman
Collaborator

angerman commented Oct 2, 2023

Yes. Those are the two options that I've seen being used. Why the same version? Because you effectively want your cross compiler to be a stage3 compiler for sanity and behaviour reasons.

Option (a) is somewhat dishonest to the platform configure is run on (after all we don't really have this specified anywhere but the assumption occasionally is that the configure phase runs on the same host as the final build product).
Option (b) is more honest to the platform, but brings with it the need for a full target toolchain, and tooling available to execute in the target context.

So I'll take it that the Hooks build-type will have the same drawbacks as the current Custom build type, and none that make cross compilation harder. That was the statement I was after.

@andreabedini
Collaborator

andreabedini commented Oct 5, 2023

As long as the declarative feature involves executing some arbitrary Haskell executable, I don't see that it has a significant benefit over a Setup.hs script.

Maybe my brain needs more coffee but I do see a difference: apart from custom setups, we don't currently run arbitrary Haskell executables during the build process. Is that right?

What makes custom setups undesirable is not (only) the security concern in running arbitrary code during build but the fact that they can change the build process as we go. They cannot change the plan but they can change how we call GHC and (if I understand correctly) this is what causes so much trouble to other tools like HLS (see the huge thread in #7489).

(Just following my thoughts now, no idea where I will end up :P)

In terms of use cases, we need to draw a line in the sand and decide what is and is not cabal's responsibility. There is no way cabal itself can support all possible ways to build code. It is not a tool as general as nix, bazel or even cmake or meson.

I might even say pkgconfig-depends was a strategic mistake, since it leads to DevEx issues we cannot solve inside cabal-install. One example for all: pkg-config packages don't have a global namespace, not even unique names; this means the name listed in pkgconfig-depends does not always help, and fixing this is harder than specifying build flags and paths for a given package at project or system level.

You could say that supporting setup hooks is the way to solve these kinds of problems but I think I still disagree. cabal is not something users and developers can use to avoid having to configure their system toolchains, packages or settings.

I am -1 on extending the setup hooks mechanism.

@andreabedini
Collaborator

andreabedini commented Oct 5, 2023

Maybe my brain needs more coffee but I do see a difference: apart from custom setups, we don't currently run arbitrary Haskell executables during the build process. Is that right?

@angerman correctly points at TemplateHaskell but I think this does not undermine what I am trying to express here. TemplateHaskell does not affect the build process; at least not at cabal's level and excluding some perversions like runIO (callCommand "apt install libxyz") (correction: this would not affect the build since planning is already done, maybe doing some IO that influences subsequent custom setups? custom setups seem to be the root of all evil :P).

@dcoutts
Contributor

dcoutts commented Oct 5, 2023

@angerman it'd be good to have input on which hooks ought to run where, i.e. host vs target and the information flow between them.

@angerman
Collaborator

angerman commented Oct 6, 2023

@dcoutts which hooks are we talking about? The existing ones?

It largely depends on what you are actually doing with them.

  • if you end up calling out to other processes, or read files, ... you most of the time want this to be on the build machine. (Git-rev, include file, ...)

  • if you want to inspect properties about the target (detect word size, build and run some code, ...)

We have no semantics around this yet. Having two distinctly named options for each hook and the relevant documentation might help to start having those semantics by encoding them in the respective hook names.

Even having them does not guarantee correct use, but it at least provides the foundation. Right now we have nothing.

@andreabedini changed the title from "Implement the 'Hooks' build type, a successor to 'Custom' build type" to "Design a 'Hooks' build type to replace 'Custom' in most scenarios" on Oct 6, 2023
@andreabedini
Collaborator

@mpickering I allowed myself to edit the title and the description. I understand the design is still private and it has to be discussed in the open before we can track its implementation.

@dcoutts
Contributor

dcoutts commented Oct 6, 2023

@angerman not the existing ones. Similar idea but starting from a clean slate. The goal in a spec for this build type is to define the semantics of all the hooks, both what info is passed in and out of each hook, but also when (and where) build systems are expected to call the hooks. This spec would be the agreement between packages and build systems for this build type. And since we'll never get it all, or get it all right first time, we want a design that is easier to evolve than the existing UserHooks which are essentially frozen in time (from about 10 years ago) because any changes are backwards incompatible.

@gbaz
Collaborator

gbaz commented Nov 2, 2023

I confess I keep coming back to this comment much earlier from @mpickering

As long as the declarative feature involves executing some arbitrary Haskell executable, I don't see that it has a significant benefit over a Setup.hs script.

I feel we're still in that situation here. The new hooks type proposed is really just a refinement of the Custom approach with a new hooks design. I would like to see some purely declarative way to replace all uses of custom scripts. The changes proposed don't seem to improve the architecture for any users other than custom script developers themselves, who get better ergonomics out of it. This does not seem a sufficient improvement to warrant this work, as we can get those ergonomic changes purely in userland by letting existing custom setups depend on an improved library interface.

The work in gathering requirements has been and continues to be valuable -- but I would much prefer a proposal that allowed purely declarative ways to replace the need to compile and run arbitrary haskell code.

@gbaz
Collaborator

gbaz commented Nov 2, 2023

More to the point, the high-level motivation says explicitly:

  • As much as possible, we should encourage package authors to replace uses of the Custom build-type with declarative features.
  • However, sometimes it is not feasible to remove the need for custom build logic entirely. Thus we plan to introduce a successor to the Custom build-type, which will allow a package author to augment (but not replace) build phases.

The proposal does not advance the first point at all -- in terms of extending cabal with further declarative features. It only introduces this new build-type. And the arguments for this new build-type, outside of ergonomics, are not clear to me.

Two concrete issues raised are HLS and multi-repl. How does this proposal concretely improve the situation for HLS? Further, how does this proposal concretely improve the situation regarding multi-repl?

@Ericson2314
Collaborator

OK now that I read the actual proposal (thanks again for the link)

I don't think I understand this comment, could you please suggest how you would modify the proposal in order to achieve this whilst also supporting all the identified use cases?

I looked over the approach; yeah, it's alright. If the interface can use a severely slimmed-down version of Cabal-syntax that would be an improvement. However, per my point the build hook interface still seems very coarse-grained.

I am very sympathetic to what @gbaz is saying too. In an ideal world, a better "Safe Haskell" could be our small declarative language, but we don't have that today --- we only have Big Haskell not Small Haskell. Sad, but true. So an arbitrary Haskell program is unfortunately still not very nice.

Two concrete issues raised are HLS and multi-repl. How does this proposal concretely improve the situation for HLS? Further, how does this proposal concretely improve the situation regarding multi-repl?

Yes, I too think it is productive to put aside the design for a moment and home in on the desiderata. Here's a nice situation:

  • I am developing two packages at once with GHCi / HIE, and so using multi-repl.
  • Both packages use happy and alex files, but I don't want to use any built-in hack rules for .x and .y
  • GHCi/HIE uses inotify and similar to track writes to files
  • When I change my .x and .y files minimal rebuilds happen pushing forward through a dependency graph

I think this is what you want, @mpickering, and it is what I want too. (Including Nix with inotify :).) I think in your phase 2 we have a better fine-grained build hook, and HIE hot-loads these library modules (important: it's not main) to augment its build graph with it. Let's spell that out and then work backwards to a phase 1.

My hunch is that buildinfo files are serialized and isomorphic to configure hooks, and something like ninja files is serialized and isomorphic to "fine-grained phase 2 build hooks". So you can think of this as "if this is the internal interpreter way, what is the iserv way?".

@dcoutts
Contributor

dcoutts commented Nov 3, 2023

This has become a long discussion, so I'll pick out a few issues...

@Ericson2314 says:

What I really want to do is per module builds, and similarly fine-grained derivations.

And that's exactly one of the things that this design change enables, which is currently not easy or not possible.

and related, @andreabedini asks:

If this is the feature you want to build, let's talk about this instead!

  • How would per-module building work for each of the existing build-types?
  • What would be the issues with e.g. custom or configure build-types?
  • Which parts of the code would have to change?

Currently, with custom setup packages, in principle it is impossible to do per-module builds (especially across packages). That is simply because it is the Setup.hs that does the details of the build, and so it gets to choose how to build, and whether it uses ghc --make, or make or whatever it wants to do. All you can do is invoke ./Setup build and it will do whatever it wants.

And the problem is that this is infectious.

Maybe the problem, not mentioned in this discussion so far, is that the simple build-type is actually implemented on top of the custom build-type. In the sense that

defaultMain = getArgs >>= defaultMainHelper simpleUserHooks

Because cabal-install has to support Custom build types, it pretty much has to treat all packages via that same interface. Doing anything else would require two radically different code paths, which would be a maintenance headache. And we all know that Cabal has a maintenance burden, which we want to make better, not worse.

Yes, that ./Setup CLI is the lowest common denominator interface.

Whereas by changing the design to reverse who provides the build system (builder rather than author), we can actually change that lowest common denominator and have build tools use just one code path where they provide their own build system across all packages, and can thus implement features like per-module build graphs.

@dcoutts
Contributor

dcoutts commented Nov 3, 2023

@gbaz let me try and convince you... :-)

I feel we're still in that situation here. The new hooks type proposed is really just a refinement of the Custom approach with a new hooks design.

I think of it as being a radical change. It changes who (in principle) provides the build system. Is it the package author, or is it the package builder? Classically Cabal says it's the package author, but this design (plus deprecating and removing Custom) changes that so that it is the package builder that provides the build system.

The actual amount of code we need to change to swizzle this architectural decision around isn't all that much and perhaps looks like a refinement. There are certainly similarities with the old UserHooks.

I would like to see some purely declarative way to replace all uses of custom scripts. The changes proposed don't seem to improve the architecture for any users other than custom script developers themselves, who get better ergonomics out of it. This does not seem a sufficient improvement to warrant this work, as we can get those ergonomic changes purely in userland by letting existing custom setups depend on an improved library interface.

Ah, no, it makes a big improvement. It's not about the authors of custom scripts at all. For them it's mostly just churn. By having the package builder provide the build system, rather than the package author, suddenly we can do cross-cutting features in a simple and direct way: by building them into the one build system that we (as the builder) choose to use.

Right now, lots of cross-cutting features are in principle impossible, and in practice partial and horrible. Multi-repl is a good example. In principle this is impossible. It's impossible in the sense that it cannot possibly work with all Cabal-spec compliant packages. And that's because multi-repl just can't be implemented in terms of the CLI that Cabal specifies, because the CLI is limited to things like building the whole package, but multi-repl relies on gathering all the info for multiple components in multiple packages at once and passing that all to GHC. Yes, multi-repl does work in practice, but with limitations, and it is inherently hacky and complicated. It sort-of works because in practice custom Setup scripts use the Cabal lib, and we have control over that and can add features in there. But that results in arbitrary limitations like not being able to work with packages with custom setups that rely on older versions of the Cabal lib.

If it's the package builder that provides the build system then that hackiness falls away. The cross-cutting features can just be implemented directly in one place, without having to jump through hoops.

I would like to see some purely declarative way to replace all uses of custom scripts.

What benefit would that bring us? I would argue that that would bring us precisely this benefit of allowing a single build system to work across all packages, and thus allow non-hacky cross cutting build system features.

This design with hooks is doing the same thing, but accepting the inevitability that we can never quite get to 100% with declarative features covering every possible use case.

We've been trying to add declarative features to cover more use cases for well over a decade. It's a good thing to do, because it's generally easier to use. But I think we must accept that we cannot hit 100%. It's a never-ending treadmill. But we still want to be able to eventually get away from Custom setup scripts and be able to have a single build system across all packages, with all of the benefits and opportunities that that brings. That's what this design does.

@gbaz
Collaborator

gbaz commented Nov 4, 2023

That's helpful. Can we nail down specifically what benefit this provides to which tools?

It seems in particular like it makes certain information available to tools for interacting with Haskell packages -- cabal's multi-repl support, and also support for parallel builds. And the way it does this is that it allows those tools to see a full inventory of all the "real" modules in a build, including those that may have been generated in the course of running a Setup. For multi-repl it would probably suffice if we just made sure that custom setups made that list available as a product when run, no? Would that also suffice for parallel builds?

Additionally, let me push on the three different types of hooks proposed -- configure hooks, build hooks, and install hooks. Do we really need all three? It seems to me that code-generation can be done as part of configure hooks, and maybe makes most sense there. This means of course that the files aren't generated on a per-component basis, but that isn't so bad. So what use cases are there at all (if any) for prebuild hooks? As for postbuild hooks, could they not be substituted by simply providing a command line to pass the built executable after it is built?

As for copy/install hooks: it seems to me that if we have a specific directory where all generated files are copied from -- or perhaps even an enumerated list of them in the cabal file, that these can be made declarative as well...

I really feel that if we inventory the specific use cases we have, and put some more thought and elbow grease into it, we'll see that there are declarative features that would subsume a lot of what we're still considering "necessitating hooks".

@dcoutts
Contributor

dcoutts commented Nov 4, 2023

@gbaz in the first instance, let me refer you to the various main docs:

https://well-typed.com/blog/2023/10/sovereign-tech-fund-invests-in-cabal/#motivation
https://github.com/well-typed/hooks-build-type/blob/main/design.md
https://github.com/well-typed/hooks-build-type/blob/main/survey.md

I should note that for the funded part of the project, we are not promising to get to adapting downstream tools to take advantage of the architectural change, we're just aiming to get the architectural change prepared, plus surveying existing packages and doing test migrations to the new build type.

That said, the class of things that this enables or makes simpler, is any build system feature that is cross-package, and it applies to anything acting as a build tool / build system.

So what does that include, certainly for tools it includes:

  • cabal-install
  • stack
  • hls
  • potentially other build/packaging tools like nix Haskell packages etc.

As for examples of cross-package build system features, there are lots, including the ones we listed in the blog post:

  • building components individually (rather than whole package)
  • multi-component repl
  • cross-package parallel builds

and off the top of my head:

  • loading the repl for individual modules/files
  • more accurate file change detection
  • parallel builds using ghc single-shot mode rather than ghc --make
  • adding better support for tools like hlint, haddock, doctests, hpc, profiling, hoogle etc etc

Yes, we can improve support for tools like those in the existing architecture but it's more clunky and harder to maintain. In the current architecture it involves adding support into the Cabal lib, exposing new CLI commands or flags, and then adding support for that CLI in cabal-install or other tools, and having to deal with packages that are using old Cabal versions for their setup and thus do not have those features available. Whereas with the new architecture where the builder (e.g. cabal-install) provides the build system, one adds the feature once, doesn't have to go via a CLI, and doesn't have to deal with version-conditional feature availability. It becomes just calling new library functions.

And thanks for the more detailed comments on the hooks.

@gbaz
Collaborator

gbaz commented Nov 4, 2023

Thanks for the links -- I'd read them all a few times before except the survey, which I hadn't seen. It seems to me, from both the use-cases for the hooks build type for tools that you've enumerated and the package uses in the survey, that the key thing really is the configure hooks -- and in particular that the deficiency in the existing design is that there is no way for the "driving" program (cabal, etc.) to get access to the result of applying hooks to the LocalBuildConfig or the result of applying hooks to the component.

I see how it's possible that we could then modify cabal to provide a less hacky multi-component repl or better parallelism, so that all makes sense to me now, and generally how this all fits into the v2-build / project-based model more cleanly.

That said, I want to continue to push back on the need for build hooks or copy/install hooks. And in fact, I want to question if we need postConf hooks in configuration at all, or only preConf hooks.

I think it would be much cleaner if we could avoid the need for as many hooks as possible, to get the "surface area" of what hooks we provide sufficiently small, which would make even further pushes to make things more declarative in the long term yet more tractable.

@Ericson2314
Collaborator

@gbaz

It seems to me that code-generation can be done as part of configure hooks, and maybe makes most sense there.

But it wouldn't be nice to reconfigure every time a happy file was updated right?

That said, I do agree the hooks need reworking --- "pre" and "post", "copy", etc. seem like holdovers from the current way of doing things, and not really declarative.

I still don't see how a build hook would be used to turn a .y file into a .hs file in a nice way, without reimplementing inotify/caching logic in the hook which is exactly what we don't want.

@gbaz
Collaborator

gbaz commented Nov 6, 2023

But it wouldn't be nice to reconfigure every time a happy file was updated right?

That's a fair point, sure. But again, I think that we really should have a declarative way to extend the baked-in build-tools set beyond just happy, alex and the like. (Speaking of which, the new design doesn't seem to have a replacement for hookedPreProcessors unless I missed it? But also I don't think anyone ever really extended this because it's quite messy and also doesn't have control over the order of the pipeline...)

@Ericson2314
Collaborator

Ericson2314 commented Nov 6, 2023

Yeah I think I would want something like

lib:my-component:module:Foo.Bar : src:blah/foo/bar.y
    ${happy:exe:happy} -o ${output} ${input[0]}

The file being produced is my-package:lib:my-library:module:Foo.Bar, and the tool being run is happy:exe:happy, not specified by a literal path but more abstractly, to allow cabal-install to choose locations.
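As a hedged Haskell rendering of the same idea (all the type and constructor names below are hypothetical, not existing Cabal API), such a rule could be plain data that a build tool can inspect:

```haskell
-- Hypothetical sketch: a rule that names its inputs, outputs and tool
-- abstractly, so the build tool (not the package) decides concrete paths.
module RuleSketch where

-- | An abstract reference to a component's module or a source file.
data Target
  = ModuleOutput String String   -- ^ component name and module name
  | SourceInput  FilePath        -- ^ a file in the package source tree

-- | A tool referenced by package/executable name rather than by path.
data Tool = Tool { toolPackage :: String, toolExe :: String }

-- | One declarative rule: inputs, outputs, and a command template.
data Rule = Rule
  { ruleInputs  :: [Target]
  , ruleOutputs :: [Target]
  , ruleTool    :: Tool
  , ruleArgs    :: [String]      -- ^ placeholders resolved by the builder
  }

-- The happy example from above, expressed as data rather than as IO code.
happyRule :: Rule
happyRule = Rule
  { ruleInputs  = [SourceInput "blah/foo/bar.y"]
  , ruleOutputs = [ModuleOutput "lib:my-library" "Foo.Bar"]
  , ruleTool    = Tool "happy" "happy"
  , ruleArgs    = ["-o", "${output}", "${input[0]}"]
  }
```

Because the rule is just data, the builder can decide where outputs live, cache them, and rebuild only when the declared inputs change.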

@andreabedini
Collaborator

@mpickering

I am struggling a bit to understand your position exactly and feel like we are just talking about technical points rather than actually addressing the general principles and feelings involved.

Yes, I understand now how the technical points I have raised are less relevant. The discussion started from a design for a new build-type, and only during the discussion have the following new facts emerged:

  1. The actual goal is to implement cross-cutting features (a term I can finally associate a meaning to). This is only briefly mentioned in the proposal, with no discussion or explanation.
  2. That the proposal is a first step towards a radical change in cabal architecture (as @dcoutts puts it).

I see the importance of these two topics, which I am keen to see discussed, but you have to admit it took some work to identify them. As I already said, I believe that implementing the proposed build-type to address the use-cases in the survey is not worth it; but I have no shame in revisiting this cost-benefit calculation as more information becomes available. I cannot read people's minds, after all.
In light of all this, yes, my comments around code-generation and configure-style setups are irrelevant technical points.

Perhaps it is useful if I write down some simple statements so we can discuss whether we agree on them or not:

Yes, thank you.

We believe that it should be possible to augment the Cabal build process in unanticipated but controlled ways to cover parts of the build process which were not imagined or supported by the Cabal maintainers.

I agree; provided we agree on the definition of "controlled" :-) You can see in the other comments and below that we don't yet have a common understanding of what this means.

It is better to design a more general mechanism for augmenting the process rather than many specific knobs, so that integrating the hooks into Cabal and other build-systems is more straightforward and general.

I agree that a general mechanism for augmenting the Cabal build system is preferable to many point-wise solutions. Whether hooks are the right mechanism is still up for discussion.

We must begin with the assumption that any existing usage of Setup.hs which merely augments the build process is justified and valid. Any proposed new scheme should support Setup.hs scripts implemented in this manner.

What you might consider "merely augmenting the build process" is vague but I have come to agree that the use-cases identified in the survey are (in general) valid. The difference between haskell-gi's use-case and unix is a technical one that I am happy to set aside.

Since we are not talking about a mere build-type anymore, let's talk about the general principles [1].


The general principle I want to bring to attention is the need for a well-defined interface.

I think you are misunderstanding the ultimate goal here, which is to not go via the Cabal Setup.hs interface when building Simple/Configure/Hooks build-type packages. We want to instead refactor the Cabal library so the relevant functions can be called directly from cabal-install without the indirection of going via the ./Setup.hs build interface.

We should not mix up the interface between different components with its implementation. Saying "the relevant functions can be called directly" does not tell me what the relevant functions are. It does not tell me what the interface is.

Indeed my first thought was "but we already do this!". After all cabal-install does call directly into Cabal for the Simple, Configure and Make build-types. This is done by buildTypeAction in SetupWrapper.hs. For Custom we don't have a main function to call because it is defined by the user and has to be compiled first.

But, of course, this is not the point you are trying to make because the "relevant function" that we call is basically main, so we are using the CLI interface in some sense [2].

So you are envisioning a new architecture where:

  1. the package author can augment the build-system with hooks but without being in control of the main function.
  2. cabal-install or other build-systems can call the user-provided hooks rather than calling main. (I would prefer if we did not use the term "build-system" for both Cabal and cabal-install but 🤷).

Am I understanding this correctly?

You say:

It seems that you are just proposing what is already the case for Custom setup scripts! This is exactly what we want to move away from.

Yes, indeed without that other interface (point 2 above) it would be the same. I also had in mind that a 3rd-party "build-type provider" could offer an interface for cabal-install (or stack) to call into; which seems to be what you were also thinking.

Let me ask you. Does your WIP branch only implement point 1 (the SetupHooks) or does it also implement point 2 (the cabal-install interface)? When I checked, it only implemented 1; which is why I suggested you can implement it as a new library to use with build-type Custom.

@dcoutts

Classically Cabal says it's the package author, but this design (plus deprecating and removing Custom) changes that so that it is the package builder that provides the build system.

Am I to understand that the goal is now also to deprecate and remove the Make build-type? That can do what it wants too. How about the Configure build-type?

Currently, with custom setup packages, in principle it is impossible to do per-module builds (especially across packages). That is simply because it is the Setup.hs that does the details of the build, and so it gets to choose how to build, and whether it uses ghc --make, or make or whatever it wants to do. All you can do is invoke ./Setup build and it will do whatever it wants.

I am not sold on this. If we assume that the vast majority of uses of the Custom build-type go through defaultMainWithHooks, then exactly what is impossible to do in any case?

The proposal and associated survey identify the most common use-cases to address. On one side I hear is "the Custom build-type makes things impossible to do in principle", on the other I hear "the SetupHooks build-type addresses these most common use-cases".

So I ask: can per-module building be implemented for any of the identified use-cases by reworking defaultMainWithHooks? That would automatically apply also to build-type Simple, is that correct? (Because any assumption we are able to make above would also apply to build-type Simple.)

Doing anything else would require two radically different code paths, which would be a maintenance headache.

Two different entry-points do not necessarily require two radically different code paths. I think refactoring defaultMain to be independent from defaultMainWithHooks would teach us a lot about the problem domain.


Now that we are on the same page about the what, we can have a productive discussion about the how. I am glad to see we have already started discussing what this new interface is going to look like.

Here are some questions and observations:

  • Does the proposal include moving the implementation of the Simple build-type from defaultMainWithHooks to defaultMainWithSetupHooks?

  • Will defaultMain be able to do per-module builds (within the same package but perhaps across components)? It would be great to see a proof-of-concept of this (which could be implemented as a separate library to use with build-type Custom).

  • The design of a new interface between Cabal and other build-system is a big discussion topic with long term consequences; both in terms of its functionality and of its implementation. It would be beneficial to keep the design as modular as possible so it could evolve in the future.

  • I would like to see this as an opportunity to better delineate the separate responsibilities of cabal-install and Cabal, and not to couple them further. E.g. the user-provided SetupHooks are going to be a Cabal concept. How is cabal-install going to use them? Through Cabal? Directly? As it stands cabal-install does not know anything about LocalBuildInfo (another Cabal concept) so it will have to pass them back into Cabal.

  • To be good citizens, tools like nix, bazel, meson, buck1/2, etc. need to have access to a similar level of functionality as stack and cabal-install itself. (We can assume that running Haskell code is ok given they all have a way to run ./Setup.hs). So far Cabal and cabal-install lack an interface to exchange data with other tools, which forces them into unpleasant workarounds.

  • You must have noticed how we have different ideas of what "controlled" above means. We should look at the experience of other build-systems here. I appreciate the approach that meson takes here, thank you @eli-schwartz. I wonder how it relates to Starlark from Bazel and Buck2. The reference to "automagic dependencies" fits perfectly since in Haskell we only have source-based distribution methods.

  • Related to the point above but perhaps more on the technical side, @Ericson2314 brings up the very important point of controlling how the hooks can probe the external world. This is vital to reproducibility and accurate tracking of dependencies (even outside the Haskell domain). It is fine for cabal to not care too much about tracking system-level dependencies but it is not fine to make this impossible.

My take is that you don't want to let the hooks do arbitrary IO directly. Rather we should offer a broad range of facts the hooks can query, similarly to what meson does. These facts can be quite comprehensive though. This could be done with a simple Haskell eDSL. Note that if the interface between Cabal and cabal-install is akin to a build-graph, there is a degree of dynamic information that can be directly encoded into it.
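As a rough sketch of what such an eDSL could look like (every name below is hypothetical, not an existing Cabal or meson API): a hook would be a value in a restricted query monad rather than arbitrary IO, so the builder can see, cache and replay every probe of the outside world.

```haskell
-- Hypothetical sketch of a restricted "facts" eDSL: hooks may only query a
-- fixed vocabulary of facts, so the builder can record and reproduce probes.
{-# LANGUAGE GADTs #-}
module FactsSketch where

-- | The only questions a hook is allowed to ask about the outside world.
data Fact a where
  ProgramPath    :: String -> Fact (Maybe FilePath)  -- e.g. "pkg-config"
  PkgConfigFlags :: String -> Fact (Maybe [String])  -- flags for a C library
  TargetWordSize :: Fact Int

-- | A hook computation: pure logic plus fact queries, but no arbitrary IO.
data HookM a where
  Pure  :: a -> HookM a
  Query :: Fact b -> (b -> HookM a) -> HookM a

query :: Fact a -> HookM a
query f = Query f Pure

instance Functor HookM where
  fmap f (Pure a)    = Pure (f a)
  fmap f (Query q k) = Query q (fmap f . k)

instance Applicative HookM where
  pure = Pure
  Pure f    <*> m = fmap f m
  Query q k <*> m = Query q (\b -> k b <*> m)

instance Monad HookM where
  Pure a    >>= f = f a
  Query q k >>= f = Query q (\b -> k b >>= f)

-- Example hook: compute extra GHC options from queried facts only.
extraGhcOptions :: HookM [String]
extraGhcOptions = do
  ws    <- query TargetWordSize
  flags <- query (PkgConfigFlags "gtk4")
  pure (("-DWORD_SIZE=" ++ show ws) : maybe [] id flags)
```

The builder interprets HookM however it likes (running probes, reading a cache, or refusing a query), which is exactly the kind of control arbitrary IO gives up.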

Footnotes

  1. Since you mention feelings: I feel I was not being told the full story. Just below you say I might be misunderstanding the ultimate goal. Of course I am! It was not discussed or explained anywhere. I can see a glimpse of it written between the few lines in the "future work" section after pages of details on the implementation of the new hooks. I understand why that section had left me confused earlier. We can leave this aside though; no hard feelings.

  2. You might think I am stating the obvious, but these are facts worth saying if we want to be sure we are on the same page and this discussion is accessible to everyone.

@mpickering
Collaborator Author

@Ericson2314

Having a more declarative way to define build rules and integrate them into Cabal does sound like a nice idea; however, the build system in Cabal (and cabal-install for that matter) is not implemented in that way, so it seems like it would be a massive refactoring job to get to that point. The way things are designed now fits better into the current Cabal architecture without attempting to reinvent everything.

@Ericson2314
Collaborator

Ericson2314 commented Nov 9, 2023

@mpickering Thanks for being clear, but I do strongly disagree with that reasoning.

  1. Cabal may not be able to do nice things with such hooks today, but retrofitting them into the current implementation is not hard. One can simply topo-sort all the rules and then execute them once in a conventional "pre build" manner (see the sketch after this list).

  2. Even if it was hard, we should still do it anyway. This is a once-in-a-generation re-evaluation of the way Haskell projects should be packaged, and the user cost of rewriting all the Setup.hs scripts (as is desired, if not required) is quite high. We should not do some half-measure reform and then ask users to rewrite everything again in ~2 years' time. We should do it right, and do it in a way which is forwards compatible with all the implementation improvements we want to do later.
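A minimal sketch of that retrofit, assuming a hypothetical Rule type (nothing here is existing Cabal API): rules are topologically sorted by their declared inputs and outputs, then run once as a single coarse pre-build step.

```haskell
-- Hypothetical sketch: run declarative fine-grained rules in one pass by
-- topologically sorting them on their declared outputs/inputs.
module TopoSortRules where

import Data.Graph (graphFromEdges, topSort)
import Data.List (nub)

-- | A fine-grained rule with declared file inputs/outputs and an action.
data Rule = Rule
  { ruleName    :: String
  , ruleInputs  :: [FilePath]
  , ruleOutputs :: [FilePath]
  , ruleAction  :: IO ()
  }

-- | Execute all rules once, in dependency order, as a coarse pre-build step.
runRulesOnce :: [Rule] -> IO ()
runRulesOnce rules = mapM_ runVertex (reverse (topSort graph))
  where
    -- A rule depends on any rule whose outputs overlap its inputs.
    (graph, fromVertex, _) =
      graphFromEdges
        [ (r, ruleName r, nub [ ruleName r' | r' <- rules
                                            , o  <- ruleOutputs r'
                                            , o `elem` ruleInputs r ])
        | r <- rules ]
    runVertex v = let (r, _, _) = fromVertex v in ruleAction r
```

The point is only that a rule-based interface doesn't require a rule-based implementation on day one; a dumb one-shot executor like this is enough to start.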

This is why I was and am skeptical of the "phase 1, phase 2". That is fine for the implementation, but right now we are designing the interface, a language, and one for many other implementations than just cabal-install (buck, bazel, nix, etc. tooling). Let's get the interface right, and do it without current-implementation-induced myopia.

@mpickering
Collaborator Author

@andreabedini

So you are envisioning a new architecture where:

the package author can augment the build-system with hooks but without being in control of the main function.
cabal-install or other build-systems can call the user-provided hooks rather than calling main. (I would prefer if we did not use the term "build-system" for both Cabal and cabal-install but 🤷).

Am I understanding this correctly?

Yes that's right.

You say:

It seems that you are just proposing what is already the case for Custom setup scripts! This is exactly what we want to move away from.

Yes, indeed without that other interface (point 2 above) it would be the same. I also had in mind that a 3rd-party "build-type provider" could offer an interface for cabal-install (or stack) to call into; which seems to be what you were also thinking.

The point is that cabal-install needs to be able to take control of large parts of the build (for example, building .hs files should always be done in the same way, under the control of cabal-install).

Let me ask you. Does your WIP branch only implement point 1 (the SetupHooks) or does it also implement point 2 (the cabal-install interface)? When I checked, it only implemented 1; which is why I suggested you can implement it as a new library to use with build-type Custom.

No, the first phase of the work is to implement point 1, which will then enable future work to happen to take advantage of the re-architecting. (Whether this work is further funded by STF or undertaken independently.)

The SetupHooks phase could be implemented as a completely separate library but it would be very painful because it would involve copying large parts of the Cabal library. It would also make phase 2 more difficult to implement, because for phase 2 you need guarantees that packages are not expecting to be built via the ./Setup interface, so you would then need to introduce a new build-type and get everyone to switch to it.

@dcoutts

Classically Cabal says it's the package author, but this design (plus deprecating and removing Custom) changes that so that it is the package builder that provides the build system.

Am I to understand that the goal is now also to deprecate and remove the Make build-type? That can do what it wants too. How about the Configure build-type?

Yes the Make build-type also needs to be removed (but as far as we know, no one uses this).

The Configure build-type is fine because it can be implemented in terms of the Hooks build type. (That is implemented on the branch.)

Currently, with custom setup packages, in principle it is impossible to do per-module builds (especially across packages). That is simply because it is the Setup.hs that does the details of the build, and so it gets to choose how to build, and whether it uses ghc --make, or make or whatever it wants to do. All you can do is invoke ./Setup build and it will do whatever it wants.

I am not sold on this. If we assume that the vast majority of uses of the Custom build-type go through defaultMainWithHooks, then exactly what is impossible to do in any case?

I don't understand what the question is here: as soon as anything declares the 'Custom' build-type, you have to assume it can do anything in ./Setup build. It can build the .hi files and .o files in whatever way it wants and cabal-install has no say in the matter. We want to move the decision about how to build .hi and .o files into cabal-install so that it has the ultimate authority over that part of the build plan.

Perhaps the question is: could we do something similar if we assumed that all Setup.hs scripts were implemented in terms of defaultMainWithHooks? The answer is no, because the UserHooks type there allows you to replace the whole configure/build/install etc. phases. So in order for the hooks to run you have to call ./Setup build, etc.
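For concreteness, here is a minimal Setup.hs using the existing defaultMainWithHooks that overrides buildHook wholesale; the hook body is just a placeholder that delegates back to the default, but it illustrates why a build tool cannot assume anything about how such a package builds.

```haskell
-- Setup.hs: with the existing UserHooks interface, a package can replace the
-- whole build phase, so the only thing a builder can do is call ./Setup build.
import Distribution.Simple (defaultMainWithHooks, simpleUserHooks, UserHooks(..))

main :: IO ()
main = defaultMainWithHooks simpleUserHooks
  { buildHook = \pkgDescr localBuildInfo hooks buildFlags ->
      -- A package could do anything here instead of the standard build:
      -- invoke make, call ghc directly, shell out to another tool, etc.
      -- (Placeholder body: delegate back to the default implementation.)
      buildHook simpleUserHooks pkgDescr localBuildInfo hooks buildFlags
  }
```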

The proposal and associated survey identify the most common use-cases to address. On one side I hear is "the Custom build-type makes things impossible to do in principle", on the other I hear "the SetupHooks build-type addresses these most common use-cases".

So I ask: can per-module building be implemented for any of the identified use-cases by reworking defaultMainWithHooks? That would automatically apply also to build-type Simple, is that correct? (Because any assumption we are able to make above would also apply to build-type Simple.)

I don't understand this question. How are you imagining reworking defaultMainWithHooks?

Doing anything else would require two radically different code paths, which would be a maintenance headache.

Two different entry-points do not necessarily require two radically different code paths. I think refactoring defaultMain to be independent from defaultMainWithHooks would teach us a lot about the problem domain.

What do you think that would teach us? The implementations of the functions are very similar; defaultMainWithHooks is just like defaultMain but with a variety of extension points.

@mpickering
Collaborator Author

@andreabedini

Does the proposal include moving the implementation of the Simple build-type from defaultMainWithHooks to defaultMainWithSetupHooks?

This doesn't really matter in phase 1 as cabal-install still interacts via the ./Setup interface.

Will defaultMain be able to do per-module builds (within the same package but perhaps across components)? It would be great to see a proof-of-concept of this (which could be implemented as a separate library to use with build-type Custom).

Sure, you could modify Cabal in order to do per-module builds. You could also implement it yourself with a custom Setup like you suggest. However, we do not want to implement this in Cabal because we want the entire build-plan to be controlled by cabal-install, so that scheduling, progress reporting, etc. are all controlled by the same system.

We really would not like to encourage people at this stage to start using build-type: Custom for this though 😆

The design of a new interface between Cabal and other build-system is a big discussion topic with long term consequences; both in terms of its functionality and of its implementation. It would be beneficial to keep the design as modular as possible so it could evolve in the future.

I would like to see this as an opportunity to better delineate the separate responsibilities of cabal-install and Cabal, and not to couple them further. E.g. the user-provided SetupHooks are going to be a Cabal concept. How is cabal-install going to use them? Through Cabal? Directly? As it stands cabal-install does not know anything about LocalBuildInfo (another Cabal concept) so it will have to pass them back into Cabal.

cabal-install does not know about LocalBuildInfo directly but it knows about configuration indirectly, as it has to construct an appropriate command line to Cabal such that Cabal can construct a LocalBuildInfo which agrees with the configuration chosen by cabal-install. (For example, imagine if LocalBuildInfo concluded that Cabal should use a different version of ghc from the one that cabal-install concluded; disaster would arise.)

So it could be argued that the more principled thing to do is to know about LocalBuildInfo directly and depend directly on the Cabal library, as is proposed in phase 2.

To be good citizens, tools like nix, bazel, meson, buck1/2, etc. need to have access to a similar level of functionality as stack and cabal-install itself. (We can assume that running Haskell code is ok given they all have a way to run ./Setup.hs). So far Cabal and cabal-install lack an interface to exchange data with other tools, which forces them into unpleasant workarounds.

These build systems currently use the ./Setup interface, which they can continue to do with the Hooks build type.

If someone wants to do something more clever then they will need to create a Haskell program which links against Cabal to be able to call the hooks. They will need to populate things like LocalBuildInfo themselves, in the same manner that cabal-install populates it.

The other way these build systems might interact with cabal-install is if cabal-install can be taught to express its build plan using ninja files, which could then be consumed by another build tool and used to perform the build.

You must have noticed how we have different ideas of what "controlled" above means. We should look at the experience of other build-systems here. I appreciate the approach that meson takes here, thank you @eli-schwartz. I wonder how it relates to Starlark from Bazel and Buck2. The reference to "automagic dependencies" fits perfectly since in Haskell we only have source-based distribution methods.

Related to the point above but perhaps more on the technical side, @Ericson2314 brings up the very important point of controlling how the hooks can probe the external world. This is vital to reproducibility and accurate tracking of dependencies (even outside the Haskell domain). It is fine for cabal to not care too much about tracking system-level dependencies but it is not fine to make this impossible.

My take is that you don't want to let the hooks do arbitrary IO directly. Rather we should offer a broad range of facts the hooks can query, similarly to what meson does. These facts can be quite comprehensive though. This could be done with a simple Haskell eDSL. Note that if the interface between Cabal and cabal-install is akin to a build-graph, there is a degree of dynamic information that can be directly encoded into it.

It seems here that you and @Ericson2314 are advocating for a big redesign of how both Cabal and cabal-install are architected. The situation is currently that neither of these tools is engineered in this way, so we have engaged in a design which fits in with how things currently work.

The purpose of the proposal is to design a system which fits into the current architecture of Cabal but allows us to remove the assumption that each package provides its own build system. We haven't set out on a task to design a new architecture for Cabal and cabal-install, and this proposal seems the wrong place to do that.

Perhaps it is worthwhile to have a separate discussion about what the architecture should be? And if the architecture changed, then it could also make sense to introduce another build-type which allowed you to define rules which worked nicely in the new architecture, but it seems a bit of an over-reach to predict that in this proposal, which has simpler goals. It's also possible to refine the Hooks build-type going into the future.

@adamgundry
Member

It's also possible to refine the Hooks build-type going into the future.

I think this is the key point. We want to produce a design where the hooks can evolve to meet future needs. In particular it seems sensible to start with coarse-grained pre/post configure/build hooks (because those closely match what Cabal has now and should be a relatively easy migration for existing users of Setup.hs) and then later add hooks that can produce more fine-grained declarative build rules (allowing packages to gradually migrate from coarse-grained hooks to fine-grained rules).

So rather than "current-implementation-induced myopia" I would frame this more positively as "providing a practically feasible implementation strategy and a gradual migration path".

@Ericson2314
Collaborator

Ericson2314 commented Nov 9, 2023

@adamgundry For the same reasons of invariants and separation of concerns that you all want to get rid of Setup.hs, I want to get rid of coarse-grained hooks. "Adding fine-grained hooks" doesn't achieve good invariants, in the same way adding coarse-grained hooks but keeping Setup.hs doesn't achieve good invariants.

To restate, I am much more interested in raising the floor of what packages cannot do than in raising the ceiling on what they can do.

I think we all agree that while we can incrementally add features, we have to be a lot more intentional and occasional about taking them away.

@eli-schwartz

In general, progress means designing systems, discovering where their limits are, repenting the bad decisions of the past and designing new systems. Such things are, ideally, kept to a minimum. As previously mentioned, when forcing people to do big migrations it's best to keep said migrations to once a decade or actually preferably once a century but we all know how well that works out. ;)

Being forced to redesign the system isn't bad and shouldn't be shunned. But I would like to distinguish between that and choosing to design a system that you know is going to need to be redesigned.

If you're trying to redesign the system and you know broadly where you want to be in ten years' time, every step you take to get there really should be compatible with that future vision. I would consider it a very unwise move to say "X and Y are not super nice and we'd like to deprecate X, let's go add completely new ways of doing Y2 with the eventual goal of deprecating both Y and Y2 and moving on to Z".

This is just forcing people to stop using X now, making them use Y2, and then prompting an eventual second migration from Y2 to Z.

What you should be doing is incrementally making Z possible.

Alternatively, you should be defending the opinion that Y2 is actually great and your vision for ten years' time and therefore you don't see anything problematic in designing it and getting people to migrate to it.

Even if Y2 turns out to not be great in the end and the ecosystem has to move onwards to Z after all, that's... fine. Mistakes happen, hindsight is 20/20, that's exactly why X was created and is now being deprecated, and every ecosystem has a story about that (sometimes many stories). But the point is that mistakes are what happens when people drop the ball on possessing mystical prophetic powers of seeing the future and prophesying the correct decisions to make in advance.

Whatever else you do, please don't build systems which you have planned to fail. If you do build such systems, please let us know so we can publish "transitional systems doomed to fail Considered Harmful" essays warning people to not touch those systems. Thanks! 🙏

@hsenag
Member

hsenag commented Nov 9, 2023

There aren't actually that many projects using custom setup anyway, and speaking for one of them (darcs) I'd be quite happy doing a migration to an incrementally better future even if it means a second migration later on.

Even if we all think Z might be the holy grail, maybe we'll find out that Y2 is good enough and we don't need to invest the effort into ever actually implementing Z. And in the meantime we'll have had the benefits of Y2 because it actually got delivered.

I also haven't been following every last detail of this discussion, but it seems like refactoring to make hooks more visible would anyway be a good starting point for then turning them into something more declarative.

@gbaz
Collaborator

gbaz commented Nov 9, 2023

I think this is the key point. We want to produce a design where the hooks can evolve to meet future needs. In particular it seems sensible to start with coarse-grained pre/post configure/build hooks (because those closely match what Cabal has now and should be a relatively easy migration for existing users of Setup.hs) and then later add hooks that can produce more fine-grained declarative build rules (allowing packages to gradually migrate from coarse-grained hooks to fine-grained rules).

So rather than "current-implementation-induced myopia" I would frame this more positively as "providing a practically feasible implementation strategy and a gradual migration path".

I think this is the heart of the disagreement. One central problem with the current implementation is that it is a local minimum, which the current proposal explains pretty well. There's no small incremental evolution out of it, because it yields all control to the Setup file. My concern is that we not create another local minimum that we cannot evolve out of, and the current proposal I think falls into that category.

Even if we have a setting where instead of a "main" function we just have a "record of hooks", we have the problem that if these are coarse-grained hooks then we're still stuck with the coarse-grained ones indefinitely even if fine-grained ones were introduced. Adding to a record, especially if it is kept abstract, and has a default value populated with explicit Nothing values, is very backwards-compatible. However, deleting from a record is not. So once those coarse-grained hooks exist, we're stuck supporting them indefinitely, just as even with this proposal, we're still stuck supporting the current Custom indefinitely.

I would suggest the more "practically feasible implementation strategy and gradual migration path" would be to introduce the new build type only with very few fine-grained hooks, all of which we've convinced ourselves are no more general than necessary, and all of which, to the extent possible, are configured via datatypes and not functions so that they are statically inspectable. I would argue in the first pass we only need a preConf hook, and perhaps, in a pinch, a preBuild hook.

(In my ideal world these would not even be in code at all, but just given in the cabal file as executables in build-tool-depends that could be run in the preconf and prebuild phases with a fixed input and output structure in terms of build configuration information. However, I'd yield on this if people really don't want it.)

I don't see how we can make preConf much more granular given all the work we want it to do, but I think we could maybe make preBuild significantly more rules-like -- and this would even help in tracking rebuild dependencies.

I'm not intending to suggest a ton more implementation work -- if anything I'm suggesting more careful thought and caution, to avoid doing work which turns out to not be necessary, but which we will get stuck supporting.

@adamgundry
Member

Thanks everyone for your input into this discussion. I understand there have been some concerns raised at the Cabal developers meeting about the way this work is progressing, so I wanted to clarify a few things:

  • The project we originally pitched to the STF was about coming up with a design (and proving that design with a prototype implementation, tests, etc.). We always intended that this design would be discussed and scrutinized by the whole Cabal developer community (and the wider Haskell community via a HF tech proposal) before anything gets merged. We obviously don't expect code to be merged until there is consensus that it is a good design and a sensible step forward.

  • We started this discussion with some rough sketches rather than a fully motivated and fleshed out design, precisely because we wanted to solicit feedback early rather than waiting until everything was clearly defined. I'm sorry if that has led to confusion about the motivation or what the design actually is.

  • We have been working on a more detailed design document, which will eventually become a HF tech proposal RFC. That will hopefully address many of the concerns and questions that have been raised in the discussion (or at least clarify the trade-offs between alternative designs). It'll take a little longer to get that into a suitable state, but we'll share the draft here for further discussion as soon as we can.

@Mikolaj
Member

Mikolaj commented Nov 13, 2023

Quite a bit off-topic, but I wonder if this work subsumes the ideas in this ticket or, the opposite, makes them harder to achieve (or is just totally orthogonal): #7394.

@fgaz
Member

fgaz commented Nov 13, 2023

It's orthogonal, #7394 is about project-level cabal-install hooks

@adamgundry
Member

We have (finally!) opened an RFC with a detailed design as a HF tech proposal: haskellfoundation/tech-proposals#60

We appreciate all the input everyone has already given in this thread, and have tried to consider proposed alternatives and explain our position more clearly. Obviously the final decision is one for the Cabal maintainers, but hopefully having a clearer design document and community discussion via the HF process will help reach consensus.

@sheaf
Collaborator

sheaf commented May 8, 2024

This has been implemented in c5f9933.

@ulysses4ever
Collaborator

Since this ticket has some subsequent tickets listed in the description and has the status of "tracking", I wonder if it should be in the open state for better visibility and to align with how tracking tickets usually operate...
