fix sparse getindex regression #7162

tanmaykm · 2014-06-07T01:09:15Z

This PR aims to fix broken getindex for sparse matrices with index type other than Int and adds some related tests.
ref #7131, #7047

ViralBShah · 2014-06-07T04:04:55Z

Is it possible to avoid having a new binarysearch implementation in the sparse matrix code? Cc @mauro3 @kmsquire

ViralBShah · 2014-06-07T04:06:46Z

We should also update NEWS.md with the sparse indexing improvements.

tanmaykm · 2014-06-07T09:38:03Z

Here are some results from a comparison between binarysearch and searchsorted: https://gist.github.com/tanmaykm/7a1e3d56ff7445f0aa03

We can probably move binarysearch into sort.jl as searchuniquesorted or something similar, though the difference is not that significant for this use case.

mauro3 · 2014-06-07T10:24:18Z

Testing for equality after searchsortedfirst should be almost the same as binarysearch. However, I tested it with @tanmaykm script and it's slowest. No idea why. Have a look at my comment in his gist:
https://gist.github.com/tanmaykm/7a1e3d56ff7445f0aa03

I don't have time to investigate just now though.

ViralBShah · 2014-06-07T13:18:58Z

Cc: @kmsquire

ViralBShah · 2014-06-07T13:24:59Z

We should probably also test for small inputs. It is quite likely that for columns with less than 10 elements, a linear search may be faster.

mauro3 · 2014-06-07T19:49:34Z

@ViralBShah I did a comparison when I wrote the indexing and found no advantage of using linear search. But that was wrong as enumerate seems to kill performance, so here updated:
https://gist.github.com/mauro3/246432bd4f2943e3dfd5

Binary and linear search are about equally fast for a haystack of size 15

For n=1 to 5, linear search takes about 50% of binary search
For n=5 to 10, linear search takes about 75% of binary search

(But this is not what is found in http://schani.wordpress.com/2010/04/30/linear-vs-binary-search/ for C. Where linear search is better up to arrays length 200)

ViralBShah · 2014-06-08T05:16:05Z

It would be nice to bake the linear search into the various searchsorted methods.

ViralBShah · 2014-06-08T05:17:09Z

That blog post is good. I would be curious to see how close we can get, since a bunch of work on SIMD is going on now.

mauro3 · 2014-06-08T21:49:46Z

Made PR with performance tests for sparse indexing: #7177.
It also contains tests for uint32 indices which need to be un-commented once this is fixed.

tanmaykm · 2014-06-09T04:56:07Z

Updated and optimized the tests at https://gist.github.com/tanmaykm/7a1e3d56ff7445f0aa03 further. Now times reported by alll three methods (binarysearch, searchsorted, searchsortedfirst) are very close, though searchsorted beats the other two consistently by a narrow margin.

The bulk of the difference in test result was because of importing Base.Order instead of using Base.Order.Forward. Using Base.Order.Forward I think prevented some loop unrolling optimizations which code_llvm shows. I am not too knowledgeable about interpreting llvm bytecode and would appreciate if someone else also has a look at it.

I think we can use searchsorted instead of binarysearch. I shall update this PR.

mauro3 · 2014-06-09T07:21:03Z

@tanmaykm good detective work but I'm still puzzled why searchsortefirst should be slower. Maybe something to ask on the mailing list.

There are also a searchsorted* functions for ranges, maybe the rangesort in sparsematrix.jl could be replaced as well?

tanmaykm · 2014-06-09T10:55:41Z

Updated.

@mauro3 yes, the slower searchsortedfirst is puzzling. I didn't replace rangesearch as it performs better (https://gist.github.com/tanmaykm/b287d6a8c98f4be3dfe3).

ViralBShah · 2014-06-09T11:05:54Z

Is this good to merge now?

mauro3 · 2014-06-09T11:21:00Z

base/sparse/sparsematrix.jl

-    ind = binarysearch(A.rowval, i0, A.colptr[i1], A.colptr[i1+1]-1)
-    ind > -1 ? A.nzval[ind] : zero(T)
+    r1 = int(A.colptr[i1])
+    r2 = A.colptr[i1+1]-1


Shouldn't this also be converted to int?

r2 is an int because of the subtraction

mauro3 · 2014-06-09T11:36:29Z

Thanks for the clarifications and sorry to bother with two more things!

returning 0 instead of -1 to indicate "not found" might be more natural?
~~I had another look at rangesearch yesterday and found that this is 20% faster (but give me a minute to double check):~~

function rangesearch(haystack::Range, needle)
    di = (needle - first(haystack))/ step(haystack)
    i = ifloor(di)
    (di-i==0 && 1<=i+1<=length(haystack)) ? i+1 : -1
end

mauro3 · 2014-06-09T11:48:40Z

Sorry, no the second suggestion is wrong, ignore it.

kmsquire · 2014-06-09T13:53:35Z

On my machine, searchsorted actually beats the other two significantly:

# last run only
elapsed time: 0.06766474 seconds (0 bytes allocated)
elapsed time: 0.032284962 seconds (0 bytes allocated)
elapsed time: 0.070093474 seconds (0 bytes allocated)

I actually find that somewhat surprising as well. I'm also not well versed at code_llvm, although I try to read it from time to time.

kmsquire · 2014-06-09T14:05:17Z

@tanmaykm good detective work but I'm still puzzled why searchsortefirst should be slower.

One possible reason has nothing to do with the search function: the timing does include an extra 10^6 array lookups that aren't in the searchsorted call.

mauro3 · 2014-06-09T15:41:09Z

More detective work:
https://gist.github.com/mauro3/2da5d92f1da8a56d40f0

This shows that for searchsortedfirst is faster than searchsorted when looping over a disordered array of "needles". In particular for smaller haystacks, which is probably what is encountered more often for sparse arrays. For instance with N length of haystack, NN number of needles to loop over:

N=1000, NN=100000
randperm(NN) input:
bin      elapsed time: 0.002756478 seconds (0 bytes allocated)
sort     elapsed time: 0.005649829 seconds (0 bytes allocated)
sortlast elapsed time: 0.003916286 seconds (0 bytes allocated)
[1:NN] input:
bin      elapsed time: 0.002726484 seconds (0 bytes allocated)
sort     elapsed time: 0.001809206 seconds (0 bytes allocated)
sortlast elapsed time: 0.00395091 seconds (0 bytes allocated)

This suggests that the difference is caused by caching(?). Trying slightly different tests, it didn't seen the extra array lookups make much of a difference.

also replaced binarysearch with methods from sort.jl ref JuliaLang#7131, JuliaLang#7047

tanmaykm · 2014-06-09T17:53:46Z

Caching does seem to be the likely explanation. Changed rangesearch to return 0 instead of -1 as suggested and rebased.

mauro3 · 2014-06-09T18:17:04Z

I think this is good to merge.

fix sparse getindex regression

tanmaykm · 2014-06-10T05:18:58Z

Thanks. Merged.

ViralBShah · 2014-06-10T06:30:32Z

I now see:

exception on 5: ERROR: test failed: S[end] == S[1] + S[2,2]
 in error at error.jl:21
 in default_handler at test.jl:19
 in do_test at test.jl:39
 in anonymous at no file:296
 in runtests at /Users/viral/julia/test/testdefs.jl:5
 in anonymous at multi.jl:847
 in run_work_thunk at multi.jl:613
 in anonymous at task.jl:847
while loading sparse.jl, in expression starting on line 292
ERROR: test failed: S[end] == S[1] + S[2,2]
 in anonymous at task.jl:1350
while loading sparse.jl, in expression starting on line 292
while loading /Users/viral/julia/test/runtests.jl, in expression starting on line 46

tanmaykm · 2014-06-10T06:36:35Z

Yes. Raised an issue #7197 for that.

mauro3 · 2014-06-10T12:24:58Z

Work-around for #7197 and a minor bug fix for this PR in #7201

ViralBShah added the sparse label Jun 7, 2014

ViralBShah added this to the 0.3 milestone Jun 7, 2014

mauro3 reviewed Jun 9, 2014
View reviewed changes

fix sparse getindex regression.

1254792

also replaced binarysearch with methods from sort.jl ref JuliaLang#7131, JuliaLang#7047

tanmaykm added a commit that referenced this pull request Jun 10, 2014

Merge pull request #7162 from tanmaykm/tanmaykm

664cab5

fix sparse getindex regression

tanmaykm merged commit 664cab5 into JuliaLang:master Jun 10, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix sparse getindex regression #7162

fix sparse getindex regression #7162

tanmaykm commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

tanmaykm commented Jun 7, 2014

mauro3 commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

mauro3 commented Jun 7, 2014

ViralBShah commented Jun 8, 2014

ViralBShah commented Jun 8, 2014

mauro3 commented Jun 8, 2014

tanmaykm commented Jun 9, 2014

mauro3 commented Jun 9, 2014

tanmaykm commented Jun 9, 2014

ViralBShah commented Jun 9, 2014

mauro3 Jun 9, 2014

tanmaykm Jun 9, 2014

mauro3 commented Jun 9, 2014

mauro3 commented Jun 9, 2014

kmsquire commented Jun 9, 2014

kmsquire commented Jun 9, 2014

mauro3 commented Jun 9, 2014

tanmaykm commented Jun 9, 2014

mauro3 commented Jun 9, 2014

tanmaykm commented Jun 10, 2014

ViralBShah commented Jun 10, 2014

tanmaykm commented Jun 10, 2014

mauro3 commented Jun 10, 2014

fix sparse getindex regression #7162

fix sparse getindex regression #7162

Conversation

tanmaykm commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

tanmaykm commented Jun 7, 2014

mauro3 commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

ViralBShah commented Jun 7, 2014

mauro3 commented Jun 7, 2014

ViralBShah commented Jun 8, 2014

ViralBShah commented Jun 8, 2014

mauro3 commented Jun 8, 2014

tanmaykm commented Jun 9, 2014

mauro3 commented Jun 9, 2014

tanmaykm commented Jun 9, 2014

ViralBShah commented Jun 9, 2014

mauro3 Jun 9, 2014

Choose a reason for hiding this comment

tanmaykm Jun 9, 2014

Choose a reason for hiding this comment

mauro3 commented Jun 9, 2014

mauro3 commented Jun 9, 2014

kmsquire commented Jun 9, 2014

kmsquire commented Jun 9, 2014

mauro3 commented Jun 9, 2014

tanmaykm commented Jun 9, 2014

mauro3 commented Jun 9, 2014

tanmaykm commented Jun 10, 2014

ViralBShah commented Jun 10, 2014

tanmaykm commented Jun 10, 2014

mauro3 commented Jun 10, 2014