Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add distinct method to NonEmptyList and NonEmptyVector. #1240

Closed
wants to merge 4 commits into from
Closed

Add distinct method to NonEmptyList and NonEmptyVector. #1240

wants to merge 4 commits into from

Conversation

Tvaroh
Copy link
Contributor

@Tvaroh Tvaroh commented Jul 28, 2016

Hello.

Does it make sense to have distinct method on NonEmptyList? If this is something that could be accepted I might work on unit tests for this.

The implementation is based on universal equality to compare values. Using Eq would require a typeclass for getting hash codes and some Set implementation that would use those.

@ceedubs
Copy link
Contributor

ceedubs commented Jul 28, 2016

Thanks for submitting this @Tvaroh! I was thinking about wanting a distinct on NonEmptyList/NonEmptyVector just the other day.

There are a few complications:

  • We've recently added first-class NonEmptyList and NonEmptyVector implementations. Now that those are in place, I think we are likely going to drop OneAnd (see Create cats.data.NonEmptyStream? #1089).
  • As you have brought up, this relies on universal equality. This seems a bit unnatural in Cats, where most things are type class based. However, as you also mentioned, we don't have a hash type class. One alternative would be to take an Order instance and use a tree-based set, but we don't even have one of those.

I'm not quite sure how to go forward. I wonder if anyone else has any thoughts on this?

@Tvaroh
Copy link
Contributor Author

Tvaroh commented Jul 28, 2016

@ceedubs I understand. Writing a balanced ordered tree-based set might be complicated, not sure if it's even in cats' scope.

@adelbertc
Copy link
Contributor

Short of writing our own tree, perhaps we can use scala.collection.immutable.TreeSet under the covers which requires a scala.math.Ordering, which we can get from our own Order ? Just need to be careful that none of the existing instances of Ordering are inconsistent with Order. I'd imagine they're not.

@codecov-io
Copy link

codecov-io commented Jul 28, 2016

Current coverage is 89.86% (diff: 0.00%)

Merging #1240 into master will decrease coverage by 0.27%

@@             master      #1240   diff @@
==========================================
  Files           243        243          
  Lines          3285       3295    +10   
  Methods        3231       3240     +9   
  Messages          0          0          
  Branches         51         52     +1   
==========================================
  Hits           2961       2961          
- Misses          324        334    +10   
  Partials          0          0          

Sunburst

Powered by Codecov. Last update 2d231ca...308dcc9

@Tvaroh Tvaroh changed the title Add OneAnd.distinct method. Add distinct method to NonEmptyList and NonEmptyVector. Jul 28, 2016
@Tvaroh
Copy link
Contributor Author

Tvaroh commented Jul 28, 2016

Gents, I've updated the PR with distinct on NonEmptyList and NonEmptyVector using TreeSet to keep track of duplicates. I tried to do it efficiently, so the implementation uses mutable builders (ListBuffer and Vector.newBuilder) underneath.

Let me know if this is acceptable.

# Conflicts:
#	core/src/main/scala/cats/data/NonEmptyVector.scala
#	tests/src/test/scala/cats/tests/NonEmptyVectorTests.scala
@Tvaroh
Copy link
Contributor Author

Tvaroh commented Jul 28, 2016

Have added tests and resolved conflicts with master. Unfortunately sbt test hangs forever for me, do you run it using some specific options?

@ceedubs
Copy link
Contributor

ceedubs commented Jul 28, 2016

@Tvaroh I usually use testsJVM/test.

@Tvaroh
Copy link
Contributor Author

Tvaroh commented Jul 29, 2016

@ceedubs thanks, it works and tests pass.

@ceedubs
Copy link
Contributor

ceedubs commented Jul 29, 2016

This approach seems reasonable to me. I'd like for a couple others to weigh in on this.

@Tvaroh could you please remove the unrelated formatting changes from your PR? They don't follow the conventions used elsewhere in the code base.

I could see it being useful to have a def distinctBy[B](f: A => B)(implicit O: Order[B]) generalization with distinct being a special case for f being identity. That could also wait for a separate PR though.

@Tvaroh
Copy link
Contributor Author

Tvaroh commented Jul 29, 2016

@ceedubs omg, didn't notice that, blame Intellij. Let me better create a new PR from master to not pollute the git log with unrelated changes.

@Tvaroh
Copy link
Contributor Author

Tvaroh commented Jul 29, 2016

See #1243.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants