Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proof of concept removing dependency on Algebra. #762

Closed
wants to merge 4 commits into from

Conversation

milessabin
Copy link
Member

This is a very crude proof of concept, removing the dependency of Cats on Algebra. This is for discussion and at @non's invitation.

My primary motivation for wanting a discussion around this is that I believe that the current split of type classes between Cats and Algebra, although it has perfectly reasonable explanation in terms of the history and evolution of Spire, Algebird, Algebra and Cats, is difficult to understand and motivate now. I think that in both the short and the long run it will hinder the adoption of Cats.

We are promoting Cats as the entry point into the Typelevel world, and have gone to considerable lengths to make the type classes that we provide consistent, accessible and well documented. Unfortunately several of the type classes we can expect newcomers to turn to first (Eq, Order, Monoid) are not part of Cats, they're part of Algebra. So having been introduced to Cats people are almost immediately bounced over to Algebra where they will find code written in a noticeably different style and surrounded by specialized artefacts like lattices and Heyting algebras which are most likely quite distant from their immediate interests. I don't think that this provides a good initial experience for people new to Cats.

Nb. I fully appreciate the reasons for the stylistic differences, nor do I mean to imply that Algebra isn't consistent and well documented. This isn't a criticism of Algebra, merely an observation that it's goals pull it in a somewhat different direction.

@mpilquist
Copy link
Member

👍 in principle. If cats-core is too big of a dependency for projects that use algebra today, I'd be in favor of a cats-kernel module that contains Eq, (Parial)Order, Show, Semigroup, Monoid, and Group.

@rklaehn
Copy link
Contributor

rklaehn commented Dec 17, 2015

One thing first: does simulacrum support specialization? If not, we can not use simulacrum, since spire and algebird really rely on specialization heavily for performance. I would also prefer a cats-kernel module, because I fear that the more complex parts of cats evolve too quickly to serve as a stable foundation for algebra/spire/algebird

I think Hash is fundamental enough to warrant inclusion into cats-kernel. Then there would be a typeclass-based replacement for everything on java.lang.Object. But maybe that is another discussion.

@dwijnand
Copy link
Contributor

@rklaehn says:

One thing first: does simulacrum support specialization? If not, we can not use simulacrum, since spire and algebird really rely on specialization heavily for performance.

According to http://typelevel.org/blog/2013/10/13/spires-ops-macros.html:

You might wonder how the Ops macros interact with specialization. Fortunately, macros are expanded before the specialization phase. This means you don't need to worry about it! If your type class is specialized, and you invoke the implicit from a specialized (or non-generic) context, the result will be a specialized call.

@ceedubs
Copy link
Contributor

ceedubs commented Dec 17, 2015

Grouping Eq, (Partial)Order, Show, Semigroup, Monoid, and Group into a cats-kernal seems a little arbitrary to me. I fear we are just trying to create the "core" that algebra was supposed to be in the first place. What would people think of a module that included Eq/(Partial)Order, one that included Show, and one that included Semigroup, Monoid, and Group? As far as I know, these modules are orthogonal to each other and this seems like a more meaningful breakdown to me. It might avoid another reorg if yet another library gets thrown into the spire/algebird/algebra/cats mix that people would like to support.

@milessabin
Copy link
Member Author

I think the K and non-K classes should be together ... I didn't mention that above, but them being split between Cats and Algebra is another of my issues with the current situation.

@milessabin
Copy link
Member Author

Apropos the Travis failure, it looks like the state tests aren't covered by the test target, only validateJVM ... fixing now.

@rklaehn
Copy link
Contributor

rklaehn commented Dec 17, 2015

@dwijnand The ops macros are from machinist, and are for providing syntax like infix ops. An approach that e.g. algebird does not use, afaik. I am more concerned about simulacrum, which creates the typeclass boilerplate and about which I really don't know enough about.

@ceedubs I think Eq is so fundamental that it does not really make sense to have a module which does not use it. How would you test your laws without Eq?

If you really want to spilt it, I would have Eq, (Partial)Order, Show and Hash in one module, and then the Semigroup/Monoid/Group hierarchy in another one. But for me personally having everything in one module would not be a big problem. As long as the lattice and ring hierarchies remain outside, this should still be a relatively small library.

And @denisrosset was talking about inserting more classes in typelevel/algebra#138 . E.g. if we wanted to add a Magma, we would have to do it before enforcing binary compatibility.

@dwijnand
Copy link
Contributor

@rklaehn Sorry I was unclear, my link wasn't for machinist. I was just providing evidence that the use of macros and macro annotations don't interfere with specialisation.

So I expect if the typeclasses were annotated with simulacrum's @typeclass as well as @specialized it would be specialised.

@davegurnell
Copy link

From an adoption point of view it would be excellent if Cats provided all the basic type classes in one place. It'll make it much more approachable for beginners.

@denisrosset
Copy link
Contributor

@rklaehn @ceedubs I'm in favor of a very small core package. The criteria for inclusion would be:

1 the name of the typeclass, its scope and the shape of its methods are all widely agreed upon,
2a either the typeclass is of common use in several contexts,
2b or it complements a typeclass that satisfies 2a
3 it provides a good experience for newcomers

With that in mind, I get the list Eq, (Partial)Order, Hash, Show, Semigroup, Monoid, and Group.

In criterion 1, I would include evidence along the lines of:

  • the typeclass is already present in several Scala libraries, eventually with slight naming differences,
  • the typeclass is already part of the standard library of other FP languages,
  • the typeclass corresponds to a standard algebraic structure.

Typeclasses of ambiguous implementation are out, i.e. when there are several correct implementations with different trade-offs.

In criterion 2a, I would ask that the typeclass has applications in several contexts. This would rule out things like lattices, Heyting algebras and so on. Semigroup and Monoid satisfy 2a, as you would use them e.g. in Spire and in a JSON library. IMHO, Group does not satisfy 2a, as I would be used only in a mathematical context, but is included for consistency (2b).

Criterion 3 is more subjective. For example, if the typeclass is consistently one of the first introduced in tutorials.

@denisrosset
Copy link
Contributor

With that in mind, Eq, (Partial)Order, Show, Semigroup, Monoid, and Group seem a good choice. I would also add Hash, if we agree on criterion 1.

The goal would be then to freeze this small core package and obtain unrestricted binary compatibility.

Essentially, changes to this core package should be made in synchronization with major releases of Scala itself (i.e. 2.12, 2.13, ...)

@denisrosset
Copy link
Contributor

@davegurnell : while I understand completely the argument, it will probably run against consistent binary compatibility.

I imagine some of the basic cats typeclasses evolving at a much quicker pace than Eq or Monoid.

@denisrosset
Copy link
Contributor

BTW, the Eq, (Partial)Order, Show typeclasses are the easiest to grasp, coming from an imperative background, while Semigroup and Monoid are perhaps second. This core package would then be an excellent introduction to functional programming, the typeclass pattern, for cats newcomers.

@stew
Copy link
Contributor

stew commented Dec 17, 2015

Eq, Order Show are also things that aren't really related directly to category theory, but are all things which are replacements for poor choices in the std library. It makes me wonder if we should just be starting on the project to create a better stdlib... but that is probably an enormous can of worms

@rklaehn
Copy link
Contributor

rklaehn commented Dec 17, 2015

Things like Show and Hash fit neither into algebra nor into cats. But it would be kind of silly to have a library just for Show. I think having it in cats-kernel or whatever would be fine.

Regarding starting the project to create a better stdlib: I think the whole notion of a large stdlib that handles everything from xml to parser combinators to swing guis is misguided. Also, in a way, cats and algebra are a standard lib. They contain the stuff (typeclasses and laws for them) that is most beneficial for people to agree on.

@erik-stripe
Copy link
Contributor

I would love to hear from as many Cats users (and maintainers) as possible.

  1. How do you find the current split (core type classes in algebra)?
  2. Would you prefer to instead define these type classes in Cats?
  3. How important (if at all) is compatibility with Algebra?
  4. Would you prefer to define cats-kernel, or do you prefer a larger cats-core?

I had been hesitant to consider making this change, but after hearing that a number of folks are unhappy with the current situation I encouraged Miles to open this PR so that we can figure out the best way to move forward and do it. (Thanks @milessabin!)

(EDIT: This is @non from a work-based Github account.)

@TomasMikula
Copy link
Contributor

I think the K and non-K classes should be together ... I didn't mention that above, but them being split between Cats and Algebra is another of my issues with the current situation.

It's a question of where you draw a line. For example, I recently needed a MonoidK[F[_[_]]] (as opposed to MonoidK[F[_]] present in cats). It is special enough that I don't expect it to be in a core library. Is MonoidK[F[_]] common enough to be in the core library?

@julien-truffaut
Copy link
Contributor

I would also prefer to see Eq, Order, Semigroup and so on in cats.

My main concern with the current algebra - cats split is that I cannot justify why one typeclass is in one project or another and I am sure other users will face the same problem.

I am not a big fan of a cats-kernel mainly focus on binary compatibility / freeze. Once we reach 1.0, we will be more careful about binary compatibility for all cats modules. If we want to experiment, we can either do it on a separate project or on a module flagged as experimental.

@milessabin
Copy link
Member Author

I've just pushed a fix for the tut breakage. I'm still seeing a very puzzling issue with Order on 2.10.5 ... see the discussion here ... if anyone has any insight, please let me know.

@milessabin
Copy link
Member Author

The 2.10.5 build issue has been explained by @dwijnand ... in Algebra the -Ywarn-value-discard flag isn't enabled: https://github.com/non/algebra/blob/23bf1748dfc5d5986aafa368f49597eb264b6cef/build.sbt#L27. We can do the same, specifically for 2.10.x builds.

@ceedubs
Copy link
Contributor

ceedubs commented Dec 18, 2015

How do you find the current split (core type classes in algebra)?

I find it slightly cumbersome, but not a huge problem.

If Cats really just wants Order, Eq, Semigroup, and Monoid (which I think is pretty much the case), it seems like overkill to also get something like BoundedSemilattice. However, since the lattice module was pulled out, this doesn't seem like too much of an issue to me.

I think my main concern here is potentially conflicting interests in the two libraries. For example, algebra's Order.compare returns an Int for performance reasons, but I suspect that if Cats had its own it would return and ADT something like LessThan | GreaterThan | EqualTo. Similarly, I think there has been talk of adding instances for Array, while in Catsland I think it has been decided that this sort of thing should probably reside in Alleycats.

Would you prefer to instead define these type classes in Cats?

As someone who has done a lot of work with Cats and not much with Algebra, this is slightly appealing. However, I don't like the idea of Cats and Algebra defining their own separate Order, Monoid etc type classes. These are already defined in a number of libraries; I don't want to worsen the problem!

How important (if at all) is compatibility with Algebra?

Personally at the moment, not that important. Perhaps if I do more number-crunching at work in the future it would be more important to me. And perhaps this isn't the best reason for a technical decision, but if dropping Algebra compatibility would mean losing @non's interest in Cats, then I would really prefer to avoid that path.

Would you prefer to define cats-kernel, or do you prefer a larger cats-core?

I'm neutral here. As mentioned previously, I would even be okay with several micro modules instead of a cats-kernel if it meant some useful sharing between Cats and Algebra/etc. I think if modules can be cleanly separated that's a great thing to do (I think before with things like std and data we just weren't trying to split into orthogonal concerns). If those libraries are okay with bringing in all of cats core, I don't mind a larger cats-core, but to my understanding they (understandably) want a very stable core and thus this probably isn't desired.

@codecov-io
Copy link

Current coverage is 85.56%

Merging #762 into master will decrease coverage by -2.65% as of 24ff68b

@@            master    #762   diff @@
======================================
  Files          166     175     +9
  Stmts         2292    2557   +265
  Branches        74     107    +33
  Methods          0       0       
======================================
+ Hit           2022    2188   +166
  Partial          0       0       
- Missed         270     369    +99

Review entire Coverage Diff as of 24ff68b

Powered by Codecov. Updated on successful CI builds.

@aryairani
Copy link
Contributor

Data point: I was looking for SemigroupLaws in cats, but couldn't find it.

@non
Copy link
Contributor

non commented Jan 4, 2016

It sounds like almost everyone is in favor of this. I'm fine with merging it and then seeing where we stand.

@milessabin
Copy link
Member Author

@non thanks! :-)

Given that this is a fairly big change ... can I get one more 👍 from a committer before going ahead?

@mandubian
Copy link
Contributor

👍 for moving those algebra in cats... It's hard to explain that to people & I've been caught myself at the beginning while using cats and searching for typeclass definition...
Moreover, I'm not sure about the current lifecycle and evolution of spire/algebird compared to cats...
Concerning cats-kernel, why not, it could keep them isolated until we see clearer in cats ecosystem...

@rklaehn
Copy link
Contributor

rklaehn commented Jan 4, 2016

I am OK with this as long as the goal of not having duplicated typeclasses between algebra/cats/spire is retained. The cats-kernel modularization would have to happen ASAP so that algebra can depend on cats (I guess algebra would not want to depend on all of cats)

@denisrosset
Copy link
Contributor

@rklaehn I'm worried about algebra depending on cats-kernel, and not the other way around, as cats is quite opinionated about the structure of syntax extensions (by simulacrum), the usage of type-projector, and probably typeclassic in the future.

As I understand the vision of non/algebra, it was designed to unify the basic type classes used in several Scala projects without requiring further dependencies.

I opened an issue in typelevel/algebra#142 as well.

@larsrh
Copy link
Contributor

larsrh commented Jan 4, 2016

Not a maintainer, but 👍 anyway on the big picture (removing dependency on algebra, with the prospect of reversing the dependency).

@rklaehn
Copy link
Contributor

rklaehn commented Jan 4, 2016

@larsrh I think it is crucial that typelevel scala has a common set of basic typeclasses. So given the strong reactions on the algebra side, this should only be merged once the algebra/spire/algebird side is fully on board with the change. Otherwise we have another split in the FP part of the scala community...

@travisbrown
Copy link
Contributor

I'm not opposed to this change (although I don't think the current arrangement is all that bad), but I would like to hear from some of the Algebird committers about how it would impact the prospects / timeline for adoption of Algebra.

@erik-stripe
Copy link
Contributor

@travisbrown, @rklaehn, and others:

That's the rub. For awhile I had been blocking this proposal because I knew Cats was going to be a large, dynamic project, and I didn't think that it could easily serve as the kind of common algebraic library that Algebird, Spire, and other libraries needed, whereas Algebra was small, relatively unopinionated, and already existing. (Being unopinionated around implicits/syntax was necessary to even be in consideration as an Algebird dependency.)

Given that there is overwhelming desire from Cats maintainers (and users I think) for Cats to define its own type classes here I decided to stop being a grouch and let things move forward.

However, I don't think that Cats' decision here will be binding on Algebra. That is, the Algebra project has its own constraints, users, and goals, and won't necessarily be well-served by creating a Cats dependency right off the bat. This is especially true given the recent de-modularization (which was probably the right thing for Cats itself, but certainly makes an Algebra dependency on Cats a bigger deal).

We had been proposing creating an algebra-kernel project with the bare minimum needed to allow Cats interop. If Cats is able to provide a similar module then it's a lot easier to imagine an Algebra dependency (but I think such a module is currently somewhat unpopular amongst the Cats maintainers).

Cats currently has more momentum and users, so I want us to feel free to do what is best in the context of Cats here, and then let other projects (Algebra, Algebird, Spire, etc.) decide how they want to move forward.

@rklaehn
Copy link
Contributor

rklaehn commented Jan 4, 2016

If cats and algebra/spire have different basic typeclasses (Eq, Order, Monoid etc), it will be impossible to write libraries that can be used from both without a lot of extremely annoying and inefficient adapter code. That would be very bad. Basically it would be strong evidence that in a language such as scala without canonical encoding for typeclasses, you can't build an interoperable ecosystem around typeclasses.

@erik-stripe
Copy link
Contributor

@rklaehn I agree with you, but I think that Cats feeling a need to "own" these basic type classes is already a strong indicator of this position. We had floated a possibility of breaking a subset of algebra out into a super-small library that Algebra and Cats could both depend on but if I recall that wasn't popular with any of the folks who wanted these types in Cats itself.

In some sense I think this discussion parallels one about micro- versus mono-repos.

@aryairani
Copy link
Contributor

@rklaehn How inefficient would the adapter code be?

We already have algebra/cats monoids and scalaz monoids, along with every other typeclass, and I expect I’m not the only person wondering what the best way of supporting both of those two libraries is already. (Side question, how are folks planning support both? scataz?)

@rklaehn
Copy link
Contributor

rklaehn commented Jan 5, 2016

@refried it would be another object allocation whenever a typeclass instance is required. This might not be a big deal for typical cats code ("it's just a constant factor"), but it is a complete deal breaker for spire and algebird code which is very performance sensitive.

But you have to also think about the overhead of having to maintain adapter libraries etc., and the cognitive overhead of the slightly different typeclass encoding schemes. Just think about the SO questions: "no, that is a cats.Monoid, you need an algebra.Monoid for that to work. You have to import spire-cats-compat-2.11.jar. But then you must not import algebra.implicits._, because then you will have lots of ambiguous implicits..." people will rightfully just run away screaming.

The duplication of e.g. scalaz monoids and cats monoids is very unfortunate, but IMHO not that bad because cats and scalaz target the same set of problems, whereas spire/algebird and cats would complement each other very nicely.

At least for me, if there are no common typeclass instances to target, my motivation to spend my free time developing libraries for the scala/typelevel ecosystem will quickly approach zero. And I would also start to have doubts whether I can recommend typelevel/FP scala at work.

@ceedubs
Copy link
Contributor

ceedubs commented Jan 5, 2016

I think we should take heed of @rklaehn's comments. Libraries like fs2 are already dropping scalaz dependencies presumably at least partially to avoid the baggage of the scalaz/cats split. It will become a big hassle to have Yet Another Monoid, and adapting between different versions could have a prohibitive performance (not to mention cognitive) cost. It doesn't make much sense to me to have Algebra and Cats both under the Typelevel umbrella if we don't make them fit together nicely. There are some fair criticisms of the current approach in this thread, but to me they mostly seem like they could be resolved with improved documentation and some consistency changes with respect to laws, etc. Dependencies are hard, but I think both libraries have an eager bunch of committers who can come up with a good solution. Anyone looking at Cats code is already exposed to Simulacrum, Kind-Projector, Machinist, etc, so I don't think that having them also exposed to a small external library dependency is a dealbreaker.

@milessabin
Copy link
Member Author

See also the discussion here: typelevel/algebra#142

@milessabin
Copy link
Member Author

Consensus seems to have broken out ... I'll rework this PR as a PoC of a Cats/Algebra kernel in the Cats repo.

Thanks to everyone for engaging so constructively in an important issue which could have been quite divisive :-)

@milessabin
Copy link
Member Author

Closing this PR in favour of #786.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.