sum(x, 2) is too slow #2325

Closed
lindahua opened this issue Feb 16, 2013 · 4 comments
Labels
performance Must go faster

Comments

@lindahua
Contributor

I put together another example in a gist: https://gist.github.com/lindahua/4967432#file-ju_reduc-jl

Here is the benchmark result on my Mac:

sum(x, 1)      :  elapsed = 0.04232287406921387
fast_sum(x, 1) :  elapsed = 0.03795504570007324
sum(x, 2)      :  elapsed = 0.3422269821166992
fast_sum(x, 2) :  elapsed = 0.06840705871582031

So sum(x, 1) is comparable to my implementation, but sum(x, 2) is about 5x slower. The key problem is that it takes what is perhaps the easier approach, simply reducing each slice separately, instead of the cache-friendly approach that organizes the computation according to the memory layout.
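
To make the contrast concrete, here is a minimal sketch of the two loop orders for a plain `Matrix`. It is not the gist code: the function names `slicewise_sum_dim2` and `colmajor_sum_dim2` are made up for illustration, and it is written in current Julia 1.x syntax, where the `sum(x, 2)` in this thread is spelled `sum(x; dims=2)`.

```julia
# Hypothetical helper: reduce one slice (row) at a time.
# Each inner step jumps m elements ahead in memory, because Julia
# stores matrices column by column.
function slicewise_sum_dim2(x::Matrix{T}) where {T<:Number}
    m, n = size(x)
    out = zeros(T, m, 1)
    for i in 1:m
        s = zero(T)
        for j in 1:n
            s += x[i, j]          # strided access
        end
        out[i, 1] = s
    end
    return out
end

# Hypothetical helper: same result, but the loops follow the memory layout.
# The inner loop walks down a column, so consecutive iterations touch
# adjacent elements of both x and out.
function colmajor_sum_dim2(x::Matrix{T}) where {T<:Number}
    m, n = size(x)
    out = zeros(T, m, 1)
    for j in 1:n
        @inbounds for i in 1:m
            out[i, 1] += x[i, j]  # contiguous access
        end
    end
    return out
end
```

Both functions return an m×1 matrix with element type `T`. How large the gap between them is will depend on the matrix size and the machine.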

Many reductions that can be expressed as the recursive application of a binary operation (such as max, min, and prod) can be implemented in this cache-friendly way. Other functions can benefit from it as well (e.g. mean, var, and std).
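
As a rough sketch of that generalization, the loop can take the binary operation as an argument. The name `colmajor_reduce_dim2` is hypothetical, the code uses current Julia 1.x syntax, and it assumes `op` returns a value that can be stored back into an array of the input's element type.

```julia
# Hypothetical helper: reduce along dimension 2 with an arbitrary binary op,
# visiting the matrix in column-major (memory) order.
function colmajor_reduce_dim2(op, x::Matrix)
    m, n = size(x)
    n > 0 || throw(ArgumentError("need at least one column to reduce"))
    out = x[:, 1:1]               # m×1 copy of the first column as the seed
    for j in 2:n
        @inbounds for i in 1:m
            out[i, 1] = op(out[i, 1], x[i, j])
        end
    end
    return out
end

x = [1 3 5; 2 4 6]
row_max  = colmajor_reduce_dim2(max, x)   # same values as maximum(x; dims=2)
row_prod = colmajor_reduce_dim2(*, x)     # same values as prod(x; dims=2)
```

Seeding with the first column avoids needing an identity element for `op`, at the cost of rejecting matrices with zero columns.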

@ViralBShah
Member

If you have cache-friendly code, it would be great to include it in base.

@ViralBShah
Member

I see the gist.

@lindahua
Contributor Author

The gist code is for illustration only. An actual implementation has to handle more cases (e.g. generic arrays with more than two dimensions).

I will think more about how this can be implemented efficiently for the general case, e.g. something like the following:

X = rand(3, 4, 5, 6)
sum(X, 2)
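
One possible way to cover a case like this, sketched under assumptions and not taken from the gist or from the eventual fix, is to view the array as three-dimensional: everything before the reduced dimension, the reduced dimension itself, and everything after it. The name `colmajor_sum` is hypothetical and the code uses current Julia 1.x syntax (`sum(X; dims=2)` instead of `sum(X, 2)`).

```julia
# Hypothetical helper: sum an N-d Array along dimension d in memory order.
function colmajor_sum(x::Array{T,N}, d::Integer) where {T<:Number,N}
    1 <= d <= N || throw(ArgumentError("d must be between 1 and $N"))
    sz = size(x)
    pre  = reduce(*, sz[1:d-1]; init=1)   # number of elements before dim d
    len  = sz[d]                          # length of the reduced dimension
    post = reduce(*, sz[d+1:N]; init=1)   # number of elements after dim d
    x3   = reshape(x, pre, len, post)     # shares memory with x, no copy
    out3 = zeros(T, pre, 1, post)
    for k in 1:post, j in 1:len
        # Innermost loop runs over the fastest-varying index of x3.
        @inbounds for i in 1:pre
            out3[i, 1, k] += x3[i, j, k]
        end
    end
    outsz = ntuple(i -> i == d ? 1 : sz[i], N)
    return reshape(out3, outsz)
end

X = rand(3, 4, 5, 6)
S = colmajor_sum(X, 2)    # size (3, 1, 5, 6)
```

This only handles dense `Array`s and a single reduced dimension; reducing over several dimensions at once, or over other array types, would need more care.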

@lindahua
Contributor Author

lindahua commented Jan 5, 2014

Closed by #5294.

@lindahua closed this as completed Jan 5, 2014