Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accelerate some more predicates (like ==) #17

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

andyferris
Copy link
Owner

@andyferris andyferris commented Aug 5, 2020

It's super common to count or search for or filter data using predicates other than isequal or isless, and we should really support things like == (e.g. count(==(0), vector))

The interesting thing is that it's not clear if we even can accelerate something like == in theory, because in theory there are no constraints on the relationship of == and isequal, so having a dictionary of values (where dictionaries use isequal comparisons) doesn't necessarily make things faster. However, ignoring theory, in practice we can do something dirty that works for common types like AbstractFloat.

The appraoch here attempts to be somewhat flexible, allowing one to expand the search space via a new internal other_equal function (e.g. other_equal(0.0) == [-0.0] but other_equal(1.0) == []). We also check for the case that ==(x,x) is false. The approach won't work for Base types like Matrix{Complex{Float64}} since there is a combinatoric explosion in the search space.

Another crappy thing is the way count(==(0), vector) doesn't actually work as advertised here, haha, you need to type count(==(0.0), vector). We also might like to use count(iszero, vector) but the search space for iszero is a bit hard to define (e.g. Complex{<:AbstractFloat} has four distinct values where this is true).

Other predicates like isone are a pain (besides numbers, we have isone("") is true but isone('x') is an error...).

One way of constraining the search space might be to limit to certain element types. But honestly I believe everything should still function as normal on arrays of Any so dispatching on element type can become an antipattern. I suppose it's OK in the case of an acceleration that doesn't impact the result... stronger typing should make things execute faster... hmm...

CC @bkamins this is one of those things that needs to be done to make this package more generally usable - I'm still not convinced of what approach to take.

@andyferris
Copy link
Owner Author

Another approach is to try use reflection to determin when == and isequal agree for the element type, but I don't think this is feasible. Perhaps types should have a trait of has_sane_equality_rules or something...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant