Migrate from the .cabal format to a widely supported format #7548

Open
Profpatsch opened this issue Aug 15, 2021 · 125 comments

Comments

@Profpatsch

In the wake of the exact-printer initiative, I proposed another approach: why not say goodbye to the .cabal file format and switch to something that is widely supported?

There are a few alternatives; the most important attribute is that they are widely supported in industry.

  • JSON
  • YAML
  • TOML

Note that all of these are (mostly) isomorphic to JSON (scalars, lists, dicts), which is important for easy translation between them (e.g. for config generation purposes).

What would this give the Haskell ecosystem?

  • Editor support: every modern editor like vscode has a way of assigning JSON schema to a file, which gives completion and inline documentation for free everywhere
  • Also, syntax highlighting and auto formatting come for free
  • Cabal doesn’t need to implement its own parser. If JSON is chosen, the parser is even context-free.

What would it give to users?

  • Instant familiarity with the format: you don’t touch cabal files all too often as a user, so you don’t want to learn yet another syntax
  • Templating cabal config with standard tooling (e.g. jq, yj), which is important e.g. in a monorepo context
  • Inline documentation without setup

What are others doing?

Most modern package managers that don’t go the full turing-complete configuration route (e.g. Scala’s sbt, Erlang) usually converge their config on a widely supported syntax.

Examples:

  • npm, yarn (package.json, package-lock.json)
  • Stack (stack.yaml, stack.yaml.lock)
  • hpack (package.yaml)
  • cargo (Cargo.toml)
  • Elm (elm-package.json)
  • Maven (pom.xml)
  • poetry (pyproject.toml, poetry.lock)

Counterexamples:

  • go (go.mod), though flat shasums and go packages have no configuration file
  • pip (requirements.txt), though see poetry above
  • sbt, hex: both use their turing complete parent languages
  • leiningen (project.clj), clojure is a lisp, and sexps are already a data format

I don’t expect cabal would drop support for the cabal file format very soon; rather, it would start out by generating a .cabal file from the .json/.toml/.yaml for consumption by older versions of cabal. Then after a multi-year grace period, the new format would become the standard and projects could drop their autogenerated .cabal files.

@Profpatsch
Author

Note that some people have mentioned dhall as a possible alternative, but using it would destroy most benefits, namely:

  • Familiarity
  • Editor support
  • widely available tooling for ops integration
  • simple parser
  • isomorphic to json

@Profpatsch
Author

However, I expect that there would be a dhall library for generating cabal.json files, which can aid in integrating dhall-based (dev)ops setups with Haskell packages.

@philderbeast
Collaborator

philderbeast commented Aug 15, 2021

Is this cabal.json the plan.json from the cabal docs or is it a .cabal in JSON format?

plan.json (JSON)
A JSON serialization of the computed install plan intended for integrating cabal with external tooling. The cabal-plan package provides a library for parsing plan.json files into a Haskell data structure as well as an example tool showing possible applications.

@ocharles
Contributor

ocharles commented Aug 15, 2021

Note that some people have mentioned dhall as a possible alternative

Also note that this already exists as dhall-to-cabal. While its goal was to generate .cabal files, that's not the only solution. A more integrated solution would be a cabal-install that can actually consume these files. I'm not saying this is the solution, just mentioning this as prior art. I'll step out of the conversation for now and let others share their thoughts, but if anyone wants to talk about Dhall in particular here, I do have thoughts.

@Profpatsch
Author

Is this cabal.json the plan.json from the cabal docs or is it a .cabal in JSON format?

It is the current .cabal file in a not-home-grown syntax

@emilypi
Member

emilypi commented Aug 15, 2021

I proposed another approach: why not say goodbye to the .cabal file format and switch to something that is widely supported?

When I think about it, I don't think these are mutually exclusive tickets: an exact printer gets us a reasonable source representation. This is good for a few reasons: we can derive translational tools from the same representation, since we only need to change the parser and the printer. This frees up efforts to migrate between formats!

There are a few alternatives; the most important attribute is that they are widely supported in industry.

Of the three suggestions, TOML is the most attractive. YAML has too much variable syntax, and JSON is aesthetically (and mechanically) displeasing for me to write as a human. TOML's grammar is minimal and admits a small parser and lexer that are easy to generate and verify (note: toml-parser is a little outdated), which eliminates the need for us to write one from scratch. In fact, maintaining this would be a dream, and would be a boon for tooling, since we can derive an ABNF for our specific flavor quite easily.

Then after a multi-year grace period, the new format would become the standard and projects could drop their autogenerated .cabal files.

👍

@TikhonJelvis

TikhonJelvis commented Aug 15, 2021

As an experiment, I took a library I wrote a few years ago and manually converted its Cabal file to TOML. I like it! The conversion can be totally systematic.

TOML was noticeably nicer to edit—Emacs has a simple built-in TOML mode and I didn't have to worry about indentation/formatting. (I've done Haskell for over a decade now and I'm still not consistent in how I format Cabal files!) Structured commands for navigating and editing the TOML file would be nice; I don't know if something like this already exists, but if it doesn't, adding it to Emacs would be easy. I wouldn't even think of trying something like that for Cabal's custom syntax.

I've used YAML a lot more than TOML in the past. Compared to YAML, I found needing to quote all my strings a bit annoying; on the other hand, TOML was much nicer to pick up and doesn't have weird corner cases to worry about. At work I recently ran into some weird YAML files that used anchors in a way that didn't work in Python—not something that would happen with TOML.

In my dream world we would use an S-expression based syntax (like sexplib) but I know that is not to be :(.

I immediately found that multiline strings were useful. Multiline strings and comments seem like the bare minimum for a human-oriented format; YAML and TOML support both, JSON supports neither.

It's a bit long, but here's the whole file:

cabal-version = "2.2"

[package]
name = "modular-arithmetic"
version = "2.0.0.1"
synopsis = "A type for integers modulo some constant."
description = """
A convenient type for working with integers modulo some constant. It saves you from manually wrapping numeric operations all over the place and prevents a range of simple mistakes. @Integer `Mod` 7@ is the type of integers (mod 7) backed by @Integer@.

We also have some cute syntax for these types like @ℤ/7@ for integers modulo 7.
"""
homepage = "https://github.com/TikhonJelvis/modular-arithmetic"
bug-reports = "https://github.com/TikhonJelvis/modular-arithmetic/issues"
license = "BSD-3-Clause"
license-file = "LICENSE"
author = "Tikhon Jelvis <[email protected]>"
maintainer = "Tikhon Jelvis <[email protected]>"
category = "Math"
build-type = "Simple"
extra-source-files = ["README.md", "CHANGELOG.md"]

[source-repository.head]
type = "git"
location = "git://github.com/TikhonJelvis/modular-arithmetic.git"

[library]
hs-source-dirs = ["src"]
ghc-options = ["-Wall"]
default-language = "Haskell2010"
exposed-modules = [
  "Data.Modular"
]
build-depends = [
  "base >4.9 && <5",
  "typelits-witnesses <0.5"
]

[test-suite.examples]
hs-source-dirs = ["test-suite", "src"]
main-is = "DocTest.hs"
default-language = "Haskell2010"
type = "exitcode-stdio-1.0"
build-depends = [
  "base >4.9 && <5",
  "doctest >= 0.9",
  "typelits-witnesses <0.5"
]

@TikhonJelvis

Another benefit: the format would be naturally extensible. Cabal could provide a section for plugin/tool/etc config, and tools would have no issues parsing values from there. I'm imagining something like this:

[plugin.liquid-haskell]
smt-solver = "z3mem"

My experience has been that providing "extension points" in formats is always useful. We can't figure out everything people want to do with their libraries ahead of time but we can make the format adaptable. If people need something Cabal doesn't support, they can add it while still keeping a single canonical file for library-specific settings.

@gbaz
Collaborator

gbaz commented Aug 15, 2021

For yaml there's also of course hpack. So anyone who wants to write cabal files in either yaml or dhall is welcome to do so. Note that we don't have exactprinters for either of those formats either, as far as I know. As I recall, due to the semantics of yaml, conditional clauses are rather unpleasant there, among a few other issues (and pretty-printing reorders things in unpleasant ways as well). (And also as emily notes, the yaml grammar is rather complicated as is).

Toml does seem promising, but I worry that its support for conditionals or other more complex syntax wouldn't be particularly great either. Translations of some more complex files might be worthwhile, to experiment with this.

Btw, note that cabal is already extensible, via "x-" fields.

In any case, I think the right next step is to get the cabal grammar pinned down and to have an exactprinter for at least the format that we already have and that is widespread.

@philderbeast
Collaborator

philderbeast commented Aug 15, 2021

This is good for a few reasons: we can derive translational tools from the same representation - we only need change the parser and the printer. This frees up efforts to migrate between formats!

I like this and am the maintainer of the translational tool hpack-dhall that can translate:

dhall -> cabal
dhall -> json
dhall -> yaml # the package.yaml format of hpack
dhall -> dhall # with imports resolved

I am a bit wary about each format being capable of doing a faithful representation. For instance, hpack's conditionals can break dhall's typing. This is the trouble @gbaz just mentioned.

@Ericson2314
Collaborator

Ericson2314 commented Aug 15, 2021

I want this to work, but unfortunately I see some issues with all the proposed formats so far. I think TOML / YAML / JSON will never work, but Dhall, while it might not be a good fit today, can be made to work.

TOML / YAML / JSON

If Cabal files were merely data they would be fine, but unfortunately they are code, due to the conditionals, and parameters used in those conditions. This is true of Cargo packages too, and the solution there has been to stuff syntax into strings. Firstly, this largely defeats the point as we still need application-specific parsers (and pretty printers!) to handle those strings.
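To make the "code, not data" point concrete, here is a minimal Haskell sketch of a stanza with conditionals and of what resolving it involves. The types `Cond`, `Stanza` and `Env` and the function `resolve` are hypothetical and far simpler than anything in Cabal the library; they only illustrate that a consumer has to evaluate conditions before it knows the final field values.

```haskell
-- Hypothetical, simplified model of a stanza with conditionals.
-- These names are illustrative and are not Cabal's actual types.

-- | A condition over flags and the target OS.
data Cond
  = Flag String        -- flag(name)
  | OS String          -- os(name)
  | Not Cond
  | And Cond Cond
  | Or Cond Cond
  deriving (Show, Eq)

-- | A stanza body: plain fields, plus nested if/else blocks.
data Stanza = Stanza
  { fields   :: [(String, [String])]
  , branches :: [(Cond, Stanza, Maybe Stanza)]
  } deriving (Show, Eq)

-- | The environment a condition is evaluated against.
data Env = Env
  { envFlags :: [(String, Bool)]
  , envOS    :: String
  }

-- | Evaluate a condition; flags that are not listed count as off.
evalCond :: Env -> Cond -> Bool
evalCond env c = case c of
  Flag f  -> lookup f (envFlags env) == Just True
  OS o    -> o == envOS env
  Not a   -> not (evalCond env a)
  And a b -> evalCond env a && evalCond env b
  Or a b  -> evalCond env a || evalCond env b

-- | Flatten a stanza to plain fields by choosing one side of every branch.
resolve :: Env -> Stanza -> [(String, [String])]
resolve env (Stanza fs bs) = fs ++ concatMap pick bs
  where
    pick (c, thenS, elseS)
      | evalCond env c = resolve env thenS
      | otherwise      = maybe [] (resolve env) elseS

-- | base everywhere, Win32 on Windows, unix elsewhere.
example :: Stanza
example = Stanza
  [("build-depends", ["base"])]
  [ ( OS "windows"
    , Stanza [("build-depends", ["Win32"])] []
    , Just (Stanza [("build-depends", ["unix"])] [])
    )
  ]
```

Loaded in GHCi, `resolve (Env [] "linux") example` takes the `unix` branch, while an `Env` whose `envOS` is `"windows"` takes the `Win32` branch. A plain JSON/TOML/YAML reader sees neither result without an evaluator like this on top.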

But more worryingly, I have reason to believe this has warped the design process of Cargo. See, for example, the back and forth between @djc and me in rust-lang/rfcs#3143, where @djc agrees Cargo has backed itself into a corner, but objects to my further using strings, or trying to encode the information in a more structured but awkward and verbose way. I may have disagreed with @djc on which unpleasant choice to take, but I absolutely do agree that TOML forcing Cargo into this awkward situation is tragic, and no one should have to pick between those unpleasant options in the first place.

Cabal converting from the existing design avoids some of the distortion from TOML's perverse incentives, but I have no doubt the language of Cabal files will continue to evolve, and I don't want "TOML goggles" to mess things up going forward.

Dhall

Dhall is an actual programming language, and therefore squarely fixes the above issues. And to be clear I would really like to endorse Dhall as it is the right sort of way to make these things conform to a standard. There are two quibbles with Dhall as it currently exists however, that I think should be addressed first:

  • imports / IO. As far as I know, Dhall always allows downloading arbitrary stuff, etc. as long as you give it a content address of some sort to make it pure (like fixed output derivations). This is a fine design in general, but I worry about e.g. cabal2nix needing internet access to do its job, which (ironically, given the nix inspiration!) would be a regression and a major pain. If there is a way to force Dhall programs to be more self-contained, that would assuage my concern.

  • abstract interpretation. The Dhall model somewhat assumes that dhall will evaluate a closed term, spitting out a value for the consuming application to deal with. But this doesn't totally reflect how Cabal works. Today, we have automatic flags, which means we need to vary parameters based on results. With a few flags this can be brute forced, but with more flags abstract interpretation is much more efficient. Dhall's strong normalization is good to make such static analysis tractable, but we might also want additional restrictions to make it efficient and also easier for humans to understand.

    Maybe that is overkill for manual flags, but @mpickering's and my presentation on what comes after CPP (https://icfp19.sigplan.org/details/hiw-2019-papers/9/Configuration-but-without-CPP, https://www.youtube.com/watch?v=YupkE1vsZ4o) has gotten me thinking about abstract interpretation more broadly. Eventually we want to tackle the goal of "type safe packaging" i.e. ensuring all valid version solving solutions will in fact compile. It's hard, but not tackling it is anathema to our values, and abstract interpretation of various sorts is key to making it work.
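A small Haskell sketch of the brute-force option mentioned for automatic flags, under loud assumptions: `Deps`, `assignments`, `firstWorking` and `exampleDeps` are hypothetical names, the "solver" is just a membership check against a list of available packages, and nothing here reflects how cabal-install actually searches.

```haskell
import Data.List (find)

-- | All 2^n assignments of the given flags, trying True first for each flag.
assignments :: [String] -> [[(String, Bool)]]
assignments = mapM (\f -> [(f, True), (f, False)])

-- | Hypothetical: the dependency list as a function of the flag assignment.
type Deps = [(String, Bool)] -> [String]

-- | Brute force: the first assignment whose dependencies are all available.
firstWorking :: [String] -> Deps -> [String] -> Maybe [(String, Bool)]
firstWorking flags deps available =
  find (all (`elem` available) . deps) (assignments flags)

-- | Example package: one flag switches between two dependencies.
exampleDeps :: Deps
exampleDeps fs =
  ["base"] ++ (if isOn "use-bytestring" then ["bytestring"] else ["text"])
  where
    isOn f = lookup f fs == Just True
```

The list returned by `assignments` has 2^n entries for n flags, which is why enumeration is only viable for a handful of flags and why an analysis that reasons about the conditions symbolically becomes attractive.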

So yeah, in conclusion I want Dhall to work, but it's important that we be able to restrict ourselves to a sort of "mini Dhall" so we can do this analysis, and we will have to integrate Dhall with Cabal fairly deeply. I'm not sure whether the current Dhall implementation supports such a restricted "mini Dhall", but that can easily be fixed.

@Bodigrim
Collaborator

Then after a multi-year grace period, the new format would become the standard and projects could drop their autogenerated .cabal files.

Remember that Hackage is an append-only repository. It would be utterly disappointing if a future version of Cabal were unable to build an old package just because it no longer parses its very own package format. So I don't think it would be wise to abandon a parser of Cabal files even after a very long grace period. And if we are to retain the parser and all its complexity, then what exactly are we to gain? What about other tooling (e.g., Stack)?

With regards to editor support, why aim for a generic JSON autocompletion? These days we should not settle for anything less than a domain-specific language server, and custom format is not a hindrance for it.

I'm sorry if my tone sounds harsh, but I'm afraid we are chasing an ideal to the detriment of compatibility, as is very customary in the Haskell community.

@djc

djc commented Aug 16, 2021

Maybe Starlark is a decent option if logic is important to this project?

@Mikolaj
Member

Mikolaj commented Aug 16, 2021

Given that the quality standards and popularity standings for configuration languages change every decade, I'd rather focus on a good internal representation, support the old cabal format (and only this format) with a forever guarantee, and let contributors add exact parsers and pretty-printers for whatever format works best for them. We also need a story for keeping in sync many files that contain the same information, or for translation on the fly (e.g., when showing a .cabal file on the Hackage webpage of a package).

@kamoii

kamoii commented Aug 16, 2021

This might be a totally stupid idea, but how about using limited Haskell for configuration?
For example, the configuration is a module that exports one binding named config, which has type Config. Limited to Haskell98, no GHC extensions, no external packages, no IO.
No one is suggesting it, so I am assuming there is an obvious reason this is not a good idea.

@fgaz
Member

fgaz commented Aug 16, 2021

@kamoii the main argument against that is that a Haskell program is not guaranteed to terminate

@ocharles
Contributor

The argument is also to not invent something new as much as possible. We want to leverage existing tooling, syntax highlighting, etc. A limited Haskell only lets us benefit from a fraction of this

@phadej
Collaborator

phadej commented Aug 16, 2021

Two forgotten things in this discussion:

First: JSON / YAML / ... and even Dhall would still need some stringly sublanguages, as @Ericson2314 hints. Consider build-depends or mixins fields.

build-depends: foo (>=0.4.0.0 && <0.4.1) || (>=0.5 && <0.6)
mixins:        foo (Foo.Bar as AnotherFoo.Bar, Foo.Baz as AnotherFoo.Baz)

"build-depends": {
    "foo": {
      "or": [ { "and": [ { ">=": "0.4.0.0" }
                       , { "<": "0.4.1" }
                       ]
              }
            , { "and": [ { ">=": "0.5" }
                       , { "<": "0.6" }
                       ]
              }
            ]
    }
}

(better would be to model version numbers as [0 4 0 0], i.e. an array of integers, though what would [0.0 4.0 0.0 0.0] mean?!)
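For comparison, a minimal Haskell sketch of the same range with versions modelled as lists of integers. The names `Version`, `VersionRange`, `within` and `fooRange` are hypothetical and not Cabal's API, and the sketch assumes versions are ordered lexicographically by component, which is what list comparison gives.

```haskell
-- | A version as a list of components, e.g. 0.4.0.0 is [0,4,0,0].
type Version = [Int]

-- | A structured version range; no strings involved.
data VersionRange
  = AtLeast Version                  -- >= v
  | Below Version                    -- <  v
  | RAnd VersionRange VersionRange   -- &&
  | ROr VersionRange VersionRange    -- ||
  deriving (Show, Eq)

-- | Does a version satisfy a range? Relies on lexicographic list ordering.
within :: Version -> VersionRange -> Bool
within v r = case r of
  AtLeast lo -> v >= lo
  Below hi   -> v < hi
  RAnd a b   -> within v a && within v b
  ROr a b    -> within v a || within v b

-- | foo (>=0.4.0.0 && <0.4.1) || (>=0.5 && <0.6)
fooRange :: VersionRange
fooRange =
  ROr (RAnd (AtLeast [0,4,0,0]) (Below [0,4,1]))
      (RAnd (AtLeast [0,5])     (Below [0,6]))
```

With this encoding `within [0,5,3] fooRange` holds and `within [0,4,1] fooRange` does not. The structure is easy to process mechanically, but it is clearly not something anyone wants to type by hand in JSON.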

I don't even try to model mixins. Dhall would look terrible as well (from dhall-to-cabal README)

in    GitHub-project { owner = "ocharles", repo = "example" }
    ⫽ { version =
          prelude.v "1.0.0"
      , library =
          prelude.unconditional.library
          (   prelude.defaults.MainLibrary
            ⫽ { build-depends =
                  [ { package =
                        "base"
                    , bounds =
                        prelude.majorBoundVersion (prelude.v "4")
                    }
                  ]

There is also license which uses SPDX license expressions,
which is a standard just for that.
NPM embeds them as strings, i.e. there is no benefit from generic JSON
strings helping edit them. (though honestly that field is rarely edited).

EDIT: Also file globs (though I think that was a mistake to add them to .cabal format)

If we use stringly sublanguages (like in @TikhonJelvis' examples) we will
need to explain their syntax anyway.
Nothing changes in comparison with current format.

Writing a tool to automatically edit bounds is still difficult
with stringly build-depends (as difficult as today, I would say).

Second: Performance matters. Solver parses plenty of package descriptions
while figuring out dependencies. Dhall's unbounded computation cost is asking for problems.
Package descriptions in indices should be (close to) normal forms.
Common stanzas make current format not normal, but their substitution is cheap (linear cost).

Currently hackage-tests test suite (cabal run hackage-tests parsec)
reports on my machine:

Reading index from: /cabal/packages/hackage.haskell.org/01-index.tar
151055 files processed
41573 files contained warnings
0 files failed to parse
147.663162 seconds elapsed
0.977546 milliseconds per file

That 1ms per file is a good goal. cabal is used as an interactive tool.

A solution is that cabal sdist would normalise the package description files
before packing a source tarball. That would work, but we would need
to specify the normal form independently.
The normal form would need to be only readable by humans, not necessarily
convenient to write.

That approach would make sense for revisions too, it might be substantially
easier to specify which edits are valid on the normal forms, than on "full"
grammar. The current check is semi-syntactical, which is somewhat limiting.

Another solution is that cabal update would produce a cache
with normalized descriptions. The drawback is that it would take
at least 3 minutes! (Or be too clever and brittle trying to reuse older caches).


If we really want to change the format to something "used elsewhere",
then EDN is actually not that bad (I was taught scheme in school).

:build-depends
  { "foo"
    (|| (&& (>= #(ver 0 4 0 0)) (< #(ver 0 4 1)))
        (&& (>= #(ver 0 5)) (< #(ver 0 6)))
    )
  }

:mixins
  { "foo"
    (as [Foo Bar] [AnotherFoo Bar])
    ; the drawback is that everything is different, if EDN structure is used deeply:
    ; even the module names, as "Foo.Bar" is an expression in a sublanguage for module names,
    ; something general EDN tools are not aware of.
    ...

TL;DR, I challenge JSON, ..., Dhall suggestors to model e.g.

in their favourite "syntax" format. Otherwise this discussion is just
wasting everyone's time by not being concrete.

(IMO simple examples don't tell much, simple stuff is easy).

@jmorag

jmorag commented Aug 16, 2021

Does "unlimited Haskell" as opposed to limited Haskell qualify as not something new? IMO the argument that Haskell is Turing-complete isn't that compelling, as the nix expression language is too. With a cabal file being just some Haskell expression of type Config or a single-file program

import Cabal

main = buildPackage PackageOptions {...} -- dependencies and build configuration here

we get to use all of the existing Haskell tooling and get around the sublanguage issues by representing everything as normal Haskell values, which if I understand correctly cabal today does anyway.

Going further with this train of thought, it seems like any configuration format, JSON, YAML, Dhall, edn, TOML, etc. is basically some level of indirection that gets parsed into a Haskell value at build time, so why not just focus on making a more convenient EDSL for Cabal the library?

@gbaz
Collaborator

gbaz commented Aug 16, 2021

There's a reason we encourage cabal files rather than custom setups -- far easier for external consumption (even with a not fully specified grammar). To get values out of a haskell executable it needs to either emit them (in which case the format it emits in is the actual spec) or you need to build and link into it directly. Either way you're compiling and building a haskell program every time you want to ask "what modules does this package provide." That is not feasible for, e.g., a package store such as hackage.

@jmorag

jmorag commented Aug 16, 2021

The external consumption argument is very compelling. I guess we could have cabal generate a lockfile from a build specification in Haskell and have other tools read that. We already have cabal.project.freeze/stack.yaml.lock so there's precedent, but those files haven't historically been required.

@fgaz
Member

fgaz commented Aug 16, 2021

...but then you get in the same situation as now, it's just that cabal is doing the conversion instead of dhall2cabal/hpack/...
You still have to commit/upload/distribute the redundant (as opposed to freeze files) generated file

@jneira
Member

jneira commented Aug 16, 2021

In this issue there is an interesting discussion about how to handle configuration formats other than the built-in cabal one: #5343

@andreasabel
Member

I see no problem with an outer syntax such as YAML having to be complemented by ad hoc expression syntaxes for certain fields (constraints etc.) that transcend YAML. Having an outer YAML syntax would still allow third-party tools easy access to certain contents of the .cabal file, and nice syntax (that is, the current syntax) for constraints can be parsed from string fields using/adapting the existing cabal parsers.
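As a rough illustration of parsing a constraint out of a string field, here is a Haskell sketch that uses only `Text.ParserCombinators.ReadP` from base. It is a stand-in and not the existing cabal parser: `Range`, `parseRange` and the helper names are hypothetical, and the grammar covers only `>=`, `<`, `&&`, `||` and parentheses.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

type Version = [Int]

data Range
  = AtLeast Version
  | Below Version
  | RAnd Range Range
  | ROr Range Range
  deriving (Show, Eq)

-- | Run a parser, then skip any trailing whitespace.
token :: ReadP a -> ReadP a
token p = p <* skipSpaces

-- | Dotted version number such as 0.4.0.0.
version :: ReadP Version
version = token (sepBy1 (read <$> munch1 isDigit) (char '.'))

-- | A single comparison or a parenthesised range.
atom :: ReadP Range
atom =  (AtLeast <$> (token (string ">=") *> version))
    +++ (Below   <$> (token (string "<")  *> version))
    +++ between (token (char '(')) (token (char ')')) range

-- | && binds tighter than ||; both associate to the left.
term, range :: ReadP Range
term  = chainl1 atom (RAnd <$ token (string "&&"))
range = chainl1 term (ROr  <$ token (string "||"))

-- | Parse a whole string field; Nothing on any leftover or ambiguity.
parseRange :: String -> Maybe Range
parseRange s = case readP_to_S (skipSpaces *> range <* eof) s of
  [(r, "")] -> Just r
  _         -> Nothing
```

A tool would first read the outer YAML with any off-the-shelf parser and then hand the string value of a field such as build-depends to something like `parseRange`. Tools that only need the package name or version never have to touch the sublanguage at all.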

YAML-bombs can be avoided by restricting to a sublanguage of YAML.

The syntax examples in #7548 (comment) look like straw men to me.

@michaelpj
Collaborator

Does "unlimited Haskell" as opposed to limited Haskell qualify as not something new?

I encourage anyone who thinks this is a good idea to think about how much fun Setup.hs is already (hint: extremely unfun). That is: if you need to compile and run a Haskell program to work out what your config is now you need configuration to work out how to compile and run the config program. What compiler options does it use? What libraries does it have access to? What GHC version is it using? etc. And what if the level-2 configuration is also a non-trivial Haskell program? Time for level-3 configuration. Extremely unfun.

@phadej
Collaborator

phadej commented Aug 18, 2021

Having an outer YAML syntax would still allow third-party tools easy access to certain contents of the .cabal file, and nice syntax (that is, the current syntax) for constraints can be parsed from string fields using/adapting the existing cabal parsers.

What's wrong with using Cabal as a library? I had great success with that. You need it anyway for expression parsing.

@andreasabel
Member

andreasabel commented Aug 18, 2021

What's wrong with using Cabal as a library? I had great success with that. You need it anyway for expression parsing.

For the Haskell programmer, there is the obstacle of Cabal being a large package that regularly undergoes changes.
Some third parties might not even use Haskell to write code that extracts information from a .cabal file. YAML parsers are ubiquitous...

Anecdotally, I have just written a small tool (https://github.com/andreasabel/cabal-clean) to partially clean artefacts from dist-newstyle/build, and I originally considered drawing some information (version, tested-with) from the respective .cabal file. But I shied away as there was no light-weight parser for cabal files.

@ocharles
Contributor

ocharles commented May 24, 2023

I'd just like to add to point 2 (syntax highlighting): there is a movement towards using tree-sitter to provide syntax highlighting (for example, all syntax highlighting in Helix uses tree-sitter). There is a WIP tree-sitter grammar for Cabal here, so this work essentially only needs to be done once and you'll cover all editors that can take advantage of tree-sitter.

@TristanCacqueray
Collaborator

As a user, after the initialization, updating the modules list is my main interaction with cabal files, and it's a bit annoying as this makes ghcid reload the whole project. It would be nice if a new format or version could improve this.

@mouse07410
Collaborator

It would be nice if a new format or version could improve this.

I fail to see how or why changing format would improve this. I.e., "fertilizer" in any language smells the same.

@ivanperez-keera
Contributor

If there were no other problems to address in Haskell ecosystem, I'd say - why not, it's your time. Given that there's plenty of existing real problems - I see no need to invent and invest efforts into solving artificial ones. "A change for the sake of a change" is not a good idea.

I think this is a very important point and highly underappreciated. I don't know how many are really bugs, but there are currently 384 issues labeled "bug" in cabal alone.

A change of this magnitude affects everyone, every developer has to spend time updating things on their side. Progress is great, but stability would be greatly appreciated (it makes a lot of economic sense to aim for more stability than what we have atm). Cabal has introduced many breaking changes in the last few years. I don't have to maintain the parser, of course, but, as a Haskell developer, I personally don't have a serious problem with the traditional cabal format.

@hasufell
Member

A change of this magnitude affects everyone, every developer has to spend time updating things on their side.

This depends on how we do the migration. A slightly different proposal:

  1. cabal will support both formats indefinitely (but will not advance the old format)
  2. hackage will accept both formats for, say, 5 years while printing warnings during upload and reject the old format afterwards
  3. cabal will have a subcommand to migrate a package automatically

This wouldn't be very disruptive to the ecosystem. Additionally, we could write an automated migration for all of hackage (this might need tweaks about how revisions work).

The unfortunate thing will be that both formats will exist for quite some time. That could be confusing, even if the new format is the default and enforced on hackage.

@mouse07410
Collaborator

mouse07410 commented May 25, 2023

This depends on how we do the migration

IMHO, your proposal would only postpone the described migration pain, not avoid it.

Also, there's no line of Haskell developers requesting a different .cabal format ("oh if only Cabal would intake YAML instead, how greatly our productivity would increase"). Nobody really needs this... E.g., list one project property that one cannot express in the current .cabal, but would be able to in YAML?

Also, the cost of your proposal would be maintaining two formats for a time long enough to burden the maintainers.

Plus, the inevitable cost of migrating other tools in the toolchain. Since it's all volunteer-maintained, the likelihood of some tools migrating quickly, some slowly, and some not at all, is fairly high. Just look at how packages (or rather their maintainers) fail to update from, e.g., GHC-8 to GHC-9 (causing failures down or up the chain).

In short, I see this as a change that nobody needs, with few if any appreciable benefits, and huge disruptive costs.

@hasufell
Member

Nobody really needs this

I don't know. Nobody really needed ghcup either. You could install GHC anyway. It was just a stepping stone in usability. That's the same here.

Note: I'm not really sure either way and if it's worth it. I'm just trying to challenge arguments.

@angerman
Collaborator

Just look at how packages (or rather their maintainers) fail to update from, e.g., GHC-8 to GHC-9 (causing failures down or up the chain).

I'm all with you on the fact that the change of format is not (yet) well motivated (and weighed against the impact).

But let's not throw maintainers under the bus here. It's not the maintainers that fail! It's the compiler that abruptly fails to accept code that was previously accepted perfectly fine.

@Ericson2314
Collaborator

Replacing Setup.hs with, say, Ninja, would make me a lot happier than giving me a new surface syntax for Cabal files. I would say the biggest problem with Cabal / cabal-install is that it's just too much code and has accumulated too many features.

Insofar as there are opportunity costs to everything, I might rather figure out how we can deprecate and remove a bunch of functionality than spend time on this.

It feels like a case of Wadler's law, to be honest.

@mouse07410
Collaborator

Just look at how packages (or rather their maintainers) fail to update from, e.g., GHC-8 to GHC-9 (causing failures down or up the chain).

But let's not throw maintainers under the bus here. It's not the maintainers that fail! It's the compiler that abruptly fails to accept code that was previously accepted perfectly fine.

I've seen a lot of breaking changes, mainly in packages that make incompatible changes to their API without a care in the world about others that might depend on them. Compound this with typically large dependency trees, and you'll understand how a newcomer feels about the Haskell ecosystem... That's been my biggest gripe with the Haskell ecosystem in general: nowhere else have I seen such an amount of instability. I admit that it has become better in the last couple of years or so. But there's still a lot of room for progress in this area.

@angerman
Collaborator

@mouse07410 yes. And this is a direct result of the compiler breaking existing packages that are perfectly fine.

If you do not upgrade to a new compiler version, you can keep your existing packages just fine.

Now almost every compiler release requires material changes to the package. Of course the maintainer most likely ends up making only the latest release compatible with the new compiler. Making older releases compatible is a work investment that needs justification. And now you are forced to update to that package, by proxy of the new compiler (which you want) no longer accepting the package (which you used).

In any case this is not the correct thread to discuss this. And I think we agree there are more pressing topics than a change of the .cabal format. As others have said, it's an open source project and everyone is free to spend time on what they deem interesting.

@hasufell
Member

And I think we agree there are more pressing topics than a change of the .cabal format

Power users are mostly blind to usability issues. This is the problem.

We're used to dealing with the warts.

I think the current cabal file format is hostile towards new users. The rise of hpack is proof: people don't want to deal with it. But hpack causes more problems.

@AshleyYakeley
Member

If you want usability, use stack. It's one tool to install, does everything for you, including installing any other stuff you need, based specifically on what you put in your stack.yaml. Can also build in a Docker container if you need it, for no extra trouble.

If you want separation of concerns, use ghcup, cabal-install, hpack as separate tools, at the cost of usability.

@hasufell
Member

If you want usability, use stack.

You're on the cabal issue tracker.

Stack is irrelevant to this issue. We're trying to figure out how to improve usability for cabal here.

The hpack workflow has already been sufficiently explained to be worse for usability.

@AshleyYakeley
Member

The hpack workflow has already been sufficiently explained to be worse for usability

I don't think so, it's been explained as complicating the tool design. But it would be better for usability.

Likewise, integrating ghcup's automatic GHC installation would be a big improvement for usability, but a complication from a design perspective.

@AshleyYakeley
Member

You're on the cabal issue tracker.

Changing the .cabal format is an infrastructure thing that affects both the cabal-install tool and stack equally.

@hasufell
Member

You're on the cabal issue tracker.

Changing the .cabal format is an infrastructure thing that affects both the cabal-install tool and stack equally.

You're moving goalposts.

You told people on the cabal issue tracker that they should use stack if they want usability.

At this point I'm not sure if you're trolling.

Yes, it will affect stack as well, but stack doesn't do its own parsing of cabal files. The Cabal API used can stay largely the same. Pantry might need some adjustment, but that won't be hard.

@AshleyYakeley
Member

I'm saying, if you care about cabal-install usability, you should copy features from stack, as some people are asking for (#8605). But this is in tension with your desire to keep the design of cabal-install simple.

You need to pick one: do you want a single tool that does everything (most usable), or do you want cabal-install to be a tool that just does one thing well (simpler design)?

@ivanperez-keera
Contributor

ivanperez-keera commented May 26, 2023

@angerman

And this is a direct result of the compiler breaking existing packages that are perfectly fine.

If you do not upgrade to a new compiler version, you can keep your existing packages just fine.

It tends to be pretty hard to stay behind in practice, for the reasons you indicated after this.

Now almost every compiler release requires material changes to the package.

In any case this is not the correct thread to discuss this.

I think it is. Such local decisions compound and affect the community as a whole. Using this logic, projects would be making decisions about how people are affected by breaking changes considering only their own package.

When you consider that similar thinking is being applied to many other parts of the Haskell ecosystem, then it's easier to see that the instability of it compounds and becomes too much. No single project will be able to completely stop it. It takes all projects together to do so. So it's not just that this conversation is important here, it's that it's important every time it comes up in any project that is core to the community.

As others have said, it's an open source project and everyone is free to spend time on what they deem interesting.

True, although as a community we are all contributing, and we all rely on cabal & ghc to do it. Changes to those packages affect every one of us. Someone being free to spend their time on whatever they want doesn't mean that the change should be accepted in cabal or ghc.

@angerman
Collaborator

@ivanperez-keera yes, that was precisely my point. The failure to keep any semblance of backwards compatibility, and upstream continuously hard-breaking the whole ecosystem, has massive ripple effects. If maintainers were to spend less time dealing with the churn of adapting to new compiler versions that simply reject their code, you could upgrade the compiler and still use the old library perfectly fine. If you decide to upgrade to a newer major version of that library, that is your choice, and not one forced onto you by trying to stay somewhat current with the compiler. Right now a new compiler almost always implies a slew of new dependencies. Why? Because the compiler (with each 6mo release) introduces breaking changes that make old code incompatible. It's insane!

I think we completely agree on this topic. And while I see the relation to breaking the cabal format, I still would like to not derail this thread too much into that direction.

Yes, it's everyone's time and they are free to do what they want with their time, and what hopefully makes them happy. I did not say anything about me supporting such a change in any way. I'm highly skeptical. I will remain open-minded, but I would want a purportedly better format to be better in every dimension, not just to have read/write support in some other non-Haskell language.

I don't think cabal is perfect. And I absolutely hate the conditional logic in it. It's almost as bad as CPP from a flattening-out standpoint. But I will give the cabal format that it's fairly straightforward.

If someone can come up with a better format, that doesn't regress in any significant way, solves lots of warts the cabal format has, I'm happy to lend my support to it; I have not yet seen it though, but that doesn't mean it doesn't exist.

@matthunz

Hey just wanna bring life to this old issue!

I made a little example of how a Cargo style interface could make cabal way easier to use.
https://github.com/matthunz/hoot

We've been talking on reddit https://www.reddit.com/r/haskell/comments/14v3wo7/would_anyone_be_interested_in_hoot_a_cabal/
about how TOML could be helpful

@seanhess

Hey just wanna bring life to this old issue!

I made a little example of how a Cargo style interface could make cabal way easier to use. https://github.com/matthunz/hoot

We've been talking on reddit https://www.reddit.com/r/haskell/comments/14v3wo7/would_anyone_be_interested_in_hoot_a_cabal/ about how TOML could be helpful

I’m just a single person of course, but I’m coming back to Haskell after a few years in Rust and other languages. I strongly disagree with the comments in this issue about this being an unnecessary change. I think I remember one of them being “there isn’t real demand, this is just one person asking for something”.

I don’t think that’s true. I think many of us who would benefit from cabal working similarly to other modern build systems are just less likely to be core Haskell contributors. I’m trying to get involved and make up for that now, but I suspect there are many others like me who would benefit.

@fgaz
Member

fgaz commented Jul 10, 2023

@seanhess I think nobody is questioning that using a "widely supported format" would be useful, that much is clear. The point is that there are two obstacles:

  1. Migration and backward compatibility
  2. The new format has to be sufficiently expressive and performant

Any proposal for a new format would have to address those two concerns, and as far as I can see so far none did.


I don't want to sound condescending, but this ticket is already really long [1], so please read the conversation before commenting, or at least the comments I linked above and/or the ones with most reactions.

Footnotes

  1. 119 comments, the longest ticket in this tracker

@matthunz

Thanks for sharing! I read a little into that issue but those comments really shed a lot of light. I do still think that the advanced cabal examples (such as https://hackage.haskell.org/package/raaz-0.3.0/raaz.cabal) could be solved with something like Rust's feature flags

@BurningWitness

I find myself strongly opposed to the idea of moving to a common format, so here are the counterpoints:

  • Instant familiarity

    There's An INI critique of TOML that argues the precise opposite through the lens of a person without any prior knowledge of the format. Having to learn another syntax is arguably a lesser evil than having to learn another flavor of TOML/YAML.

    An extreme case of this is Kubernetes with its box of DevOps joke.

  • Ease of implementation

    This is only true for small configuration files that fully fit into the spec of the format. For this specific case the library that parses/serializes the format already does everything you need it to do.

    With an extra format you still require a parser, it's just that instead of parsing from a sequence of bytes or a custom grammar you are now bound to a far wider grammar with some tokens outright missing (build-depends was mentioned many times in this discussion). At best this means you're tying yourself to libraries and/or external tools that process the common format, at worst you just signed up for writing said libraries and/or external tools.

  • Syntax highlighting

    As in syntax highlighting for the wider format, it wouldn't magically know the meaning of the keywords Cabal introduced. If the highlighting is going to be partial or incorrect, everything might as well just be one color.

  • Templating

    I don't get the appeal and I assume the reasons for this are pragmatic, so I feel like templating is only brought in when there is no other way to create or alter the file. Note that format changes to date have been very conservative: deprecations are rare and no sections have been removed. Some day Cabal will do a major bump that rewires something big and everyone will have to spend several hours upgrading their meta-tooling.

    The correct answer here would be a separate tool to do this, ideally maintained by Cabal (call it cabal edit), which would allow flexibility in moving and removing certain options in a backwards-compatible manner.

All in all, the gains are speculative and overshadowed by the sheer amount of maintenance a change like this would entail. I fully agree with #7548 (comment): better tooling is both easier to define and far more desirable.
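To make the "Ease of implementation" point above concrete, here is a minimal sketch in Python. It assumes a hypothetical JSON rendering of a `build-depends` field (the key layout is invented for illustration and is not part of any proposal in this thread). The generic JSON parser only gets as far as opaque strings, so a second, hand-written parser is still needed for the version ranges. The range parser is deliberately simplified: it handles only comparisons joined by `&&`.

```python
import json
import re

# Hypothetical JSON rendering of a build-depends field; the key names are
# illustrative and not part of any proposed schema.
doc = '{"build-depends": {"base": ">=4.9 && <5", "text": ">=1.2"}}'

# Step 1: the generic JSON parser yields only opaque strings for the ranges.
deps = json.loads(doc)["build-depends"]

# Step 2: a second parser is still required for the range syntax itself.
# Simplified assumption: only comparisons joined by "&&" are supported
# (no "||", "^>=", or "== x.*" forms).
COMPARISON = re.compile(r"^(>=|<=|==|>|<)\s*(\d+(?:\.\d+)*)$")


def parse_range(text):
    """Parse a string such as '>=4.9 && <5' into (operator, version) pairs."""
    constraints = []
    for part in text.split("&&"):
        match = COMPARISON.match(part.strip())
        if match is None:
            raise ValueError(f"unsupported constraint: {part.strip()!r}")
        op, version = match.groups()
        constraints.append((op, tuple(int(n) for n in version.split("."))))
    return constraints


ranges = {name: parse_range(spec) for name, spec in deps.items()}
```

Under these assumptions, the wider format removes the first parsing step but not the second, which is where the domain-specific grammar lives.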

@vanceism7

TL;DR, I challenge JSON, ..., Dhall suggestors to model e.g.

https://hackage.haskell.org/package/transformers-compat-0.7/transformers-compat.cabal
https://hackage.haskell.org/package/raaz-0.3.0/raaz.cabal

in their favourite "syntax" format. Otherwise this discussion is just wasting everyone's time by not being concrete.

I dunno... Doesn't seem that bad to me ¯\_(ツ)_/¯
https://gist.github.com/vanceism7/9db49c255b5ca1677e11fdf17a699fda

(I'm being super tongue-in-cheek here, I just pasted the cabal file into ChatGPT and had it generate equivalents for me in JSON and YAML. I'm sure there's some difficulty here I'm missing, but the YAML file in particular looks pretty neat in my eyes.)

@rhendric

some difficulty here

All those duplicated if, then, else keys are likely problems for many implementations of either JSON or YAML, for starters.
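As one illustration of the duplicate-key problem, here is a small sketch using Python's standard `json` module. The fragment and its flag names are hypothetical and only mimic the shape of repeated `if` keys. By default the decoder silently keeps the last value for a repeated key, so an earlier conditional would be lost unless the consumer explicitly checks for duplicates.

```python
import json

# Hypothetical fragment: two conditionals modelled as repeated "if" keys.
doc = '{"if": "flag(two)", "if": "flag(three)"}'

# The standard decoder keeps only the last value for a repeated key.
parsed = json.loads(doc)


def reject_duplicates(pairs):
    """object_pairs_hook that raises on repeated keys instead of dropping them."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```

Other JSON and YAML implementations may behave differently (keep the first value, keep the last, or reject the document), which is the portability concern here.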

@gbaz
Collaborator

gbaz commented Oct 18, 2023

Those files won't work for a huge variety of reasons, basically all of which have been discussed at length in this thread. This is why we need people to design protocols through thought and discussion, rather than asking machines to churn out "neat" looking but wrong slop.

@ivanperez-keera
Contributor

ivanperez-keera commented Jan 24, 2024

This issue has stalled for a while. The benefits of switching to a different format are currently severely outweighed by the cost of 1) unavailable features (e.g., syntax highlighting), 2) implementation effort, 3) temporary introduction of complexity in cabal (until all old formats can be removed, which would take years) and other tools that work with cabal files, 4) potential need for additional tooling, 5) the need for new learning by people in the community, 6) adaptation effort in the Haskell ecosystem (packages would have to update), 7) documentation that would have to be put together and maintained for years while the community transitions. Many of these points, and others, were very well captured by #7548 (comment) and #7548 (comment). Cabal has a long history of breaking the interface, and this would simply add to that and create more breakage of packages, in a community that already struggles to put in enough energy to keep packages well maintained.

There are currently 416 bugs open in Cabal's repo (out of 1521 issues total), many going back for many years.

I propose that we simply close this item as "not for now" and focus on stability before moving on to bigger changes that would require a huge investment by the community at large.


General note

I would also like to ask people who propose new changes to take the seat of devil's advocate, and try to think also of good reasons not to do things, as well as the cost that it has for everybody (almost literally: imagine if you had to pay everyone in the community who's going to spend time as a consequence of this change $250/h; how much would that add up to?), and how it makes the ecosystem better as a whole for everyone (beyond their own specific interest). Many of us are also investing a lot of effort into promoting Haskell and getting it adopted in our companies, which would benefit the community at large. Understanding the impact for the client and the user is important to increasing Haskell's adoption.
