Fix setindex! with SubDArray source #74

mbauman · 2016-07-07T19:06:44Z

This method is an optimization wherein we try to chunk accesses based upon the parent DArray's parts. The hard thing is then going backwards and trying to figure out which parts of the assignment indices need to be used in order to access those chunks. This is a four stage process that uses five different types of indices:

1. Find the indices of each portion of the DArray
2. Find the valid subset of indices of the SubArray that index into that portion
3. Find the portion of the indices for the assignment that need to be used for that subset of indices in step 2. This is the hard part. It requires creating another set of indices that represents the mask of valid indices from step 2. With those masks in hand, it's possible to reindex I to the indices we need. The trouble is that setindex! supports singleton dimensions in the source array in ways that getindex does not, so we need to selectively drop singleton dimensions as we restrict the indices. A final complication is that the last index can be a linear index over many indices in either the source or destination.
4. Finally, if the entire DArray chunk isn't getting used, we need to shift the indices from step 2 to refer to the local part of the DArray.
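
A rough one-dimensional sketch of the four stages, written for current Julia, may help make the index bookkeeping concrete. It is not the code in this PR: the names `chunk_ranges`, `restrict`, `source_positions`, and `to_local` are hypothetical, and the 1-D case sidesteps the hard part of step 3 (masks, singleton dimensions, trailing linear indices).

```julia
# Hypothetical 1-D illustration; these names are not from DistributedArrays.jl.

# Step 1: global index range owned by each chunk of a length-10 parent.
chunk_ranges = [1:4, 5:7, 8:10]

# The view covers global indices 3:9 of the parent.
view_range = 3:9

# Step 2: the part of the view that lands in one chunk.
restrict(view_range, chunk) = intersect(view_range, chunk)

# Step 3 (trivial in 1-D): which source elements go with that part of the view.
source_positions(view_range, hit) = hit .- (first(view_range) - 1)

# Step 4: shift the global indices so they address the chunk's local storage.
to_local(hit, chunk) = hit .- (first(chunk) - 1)

for chunk in chunk_ranges
    hit = restrict(view_range, chunk)
    isempty(hit) && continue
    println("chunk ", chunk, ": source ", source_positions(view_range, hit),
            " -> local ", to_local(hit, chunk))
end
```

In the real method each of these steps works on tuples of indices, one per dimension, which is where the extra index types come from.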

Tests pass locally with --depwarn=no. CC @andreasnoack.

coveralls · 2016-07-07T19:22:13Z

Coverage decreased (-19.9%) to 49.066% when pulling be6a31c on mbauman:mb/setindex into 1a31742 on JuliaParallel:master.

andreasnoack · 2016-07-08T02:44:17Z

src/DistributedArrays.jl

+    sz::NTuple{N,Int}
+end
+Base.size(M::MergedIndices) = M.sz
+Base.getindex{_,N}(M::MergedIndices{_,N}, I::Vararg{Int, N}) = CartesianIndex(map(getindex, M.indices, I))


The two parameter version of Vararg causes an error on 0.4. Is this fix strictly 0.5 material or is it possible to make it work on both versions?
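
For readers on current Julia, where the `f{T}(...)` method syntax from the diff no longer parses, the same idea could be sketched roughly as follows. This is an assumption-laden reconstruction: the type name `LazyMergedIndices` is hypothetical, and the struct layout is inferred only from the `indices` and `sz` fields visible in the snippet above.

```julia
# Hypothetical reconstruction in current syntax; not the code from this PR.
# A lazy N-d array whose element at (i, j, ...) is the CartesianIndex built
# from the i-th entry of the first index list, the j-th of the second, etc.
struct LazyMergedIndices{T<:Tuple,N} <: AbstractArray{CartesianIndex{N},N}
    indices::T
    sz::NTuple{N,Int}
end

# Convenience constructor: one dimension per index list.
LazyMergedIndices(indices::Tuple) = LazyMergedIndices(indices, map(length, indices))

Base.size(M::LazyMergedIndices) = M.sz

# Two-parameter Vararg: exactly N integer indices for an N-d array.
Base.getindex(M::LazyMergedIndices{T,N}, I::Vararg{Int,N}) where {T,N} =
    CartesianIndex(map(getindex, M.indices, I))

M = LazyMergedIndices((2:3, [1, 4]))
M[2, 1]  # pairs the 2nd entry of 2:3 with the 1st entry of [1, 4]
```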

mbauman · 2016-07-08T15:04:17Z

No, I don't think this fix will be easy to backport. Both this and the previous implementation lean heavily on the assumption that length(S.indexes) == ndims(S.parent) for SubArrays. But that only became true on 0.5. So this doesn't really fix the bugs on 0.4; they're a result of that assumption being false.

In general, setindex! does support a crazy amount of variation in the shapes of arrays it accepts:

A = rand(2,2)
A[:] = [1 2; 3 4] # Linear indexing into the destination
A[:,:] = 1:4 # Linear indexing into the source
A[:] = [1 2 3 4] # Skipping singleton dimensions with linear indexing into both
A[:, :] = [1 2 3 4] # Skipping singleton dimensions with linear indexing into the source

And it even feels like it should be more accepting than it currently is… but that'd just make this method even harder. See this comment and the following few ones: JuliaLang/julia#15431 (comment). On the other hand, the fact that it's hard to document its semantics is another red flag.

If we ever deprecate linear indexing, that would remove one of the possible sources of variation here. In fact, we could use the same trick I did in Base by reshaping the destination before doing the assignment — that'd probably be simpler. Nope, never mind. It's too early in the morning to be doing four levels of simultaneous indexing. This would only become simpler if we required the source array to exactly match the shape of the destination indices.
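
To illustrate what "exactly match the shape of the destination indices" could mean, here is a minimal sketch of such a strict rule on current Julia. The helper `strict_shape_ok` is hypothetical and exists in neither Base nor DistributedArrays.jl; it simply compares the source's size with the shape selected by the indices.

```julia
# Hypothetical helper, not part of Base or DistributedArrays.jl.
# Accept a source only when its size equals the shape selected by the indices.
strict_shape_ok(dest, source, I...) = size(source) == size(view(dest, I...))

A = zeros(2, 2)
strict_shape_ok(A, [1 2; 3 4], :, :)   # 2×2 source, 2×2 destination
strict_shape_ok(A, 1:4, :, :)          # vector source, 2×2 destination
strict_shape_ok(A, [1, 2, 3, 4], :)    # vector source, linear destination
strict_shape_ok(A, [1 2 3 4], :)       # 1×4 source, length-4 destination
```

Under a rule like this only the first and third calls would be accepted, so the singleton-dropping and source-side linear-indexing cases would disappear.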

codecov-io · 2016-07-08T15:31:02Z

Current coverage is 48.89% (diff: 42.50%)

Merging #74 into master will decrease coverage by 19.47%

@@             master        #74   diff @@
==========================================
  Files             1          1          
  Lines           702        589   -113   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits            480        288   -192   
- Misses          222        301    +79   
  Partials          0          0

Powered by Codecov. Last update 166642e...54cbc66

andreasnoack · 2016-07-08T20:04:24Z

Thanks a lot to our guest contributor. I think we'll have to declare defeat to old style indexing in Julia and remove support for 0.4 with this PR. People will still be able to use some of the DArray functionality on 0.4 but not the fixed version of this method. I'll push the necessary adjustments of REQUIRE and .travis.yml.

andreasnoack · 2016-07-09T15:55:34Z

@timholy JuliaLang/julia#17137 broke this PR. It would be great if you could explain how the bounds checking code should be adjusted to work with new Base. Thanks.

mbauman · 2016-07-09T17:35:50Z

I think I got it through some moderately informed monkey-see-monkey-do imitation of CartesianIndex. That said, I no longer understand the strategy for bounds checking or the difference between all these functions and methods.

coveralls · 2016-07-09T18:40:52Z

Coverage decreased (-20.1%) to 48.896% when pulling 54cbc66 on mbauman:mb/setindex into 1a31742 on JuliaParallel:master.

timholy · 2016-07-09T19:45:19Z

It seems a bit messed up now because of a bad interaction between JuliaLang/julia#17137 and JuliaLang/julia#17340, the latter of which got rid of any calls to _chkbounds, so that function isn't even used now.

I'll see what I can do.

Both these lazy arrays are effectively generalizations of Tim's MappedArrays.jl package. Doing this generally adds a bit more difficulty in terms of element types, but that is true of the MappedArray type, too. It might be worth breaking this out into a package at some point.

mbauman · 2016-08-03T22:02:01Z

Updated to work with master

andreasnoack · 2016-08-04T01:16:54Z

@mbauman Thanks a million. I've merged your commits in #76.

andreasnoack reviewed Jul 8, 2016

andreasnoack mentioned this pull request Jul 8, 2016

Fix setindex! with SubDArray source (+ remove Julia 0.4 support) #76

Merged

timholy mentioned this pull request Jul 9, 2016

Revise checkbounds again JuliaLang/julia#17355

Merged

mbauman added 8 commits July 23, 2016 17:04

Remove unnecessary failsafe

e382ca6

This is no longer needed -- the comment is from when I only had restrict_indices partially implemented

Also implement checkbounds_indices

7f0b418

Only enable this method on 0.5

36204e6

Fixup checkbounds_indices to the new APIs

1c93330

Also clarify the comment since I was confused upon coming back to this method a few weeks later

Propagate inbounds for the lazy array types

474b93c

As a further optimization, (at)inbounds could be added throughout the algorithm once it has received more widespread testing.

Update for SubArray change

4fb5404

cf. JuliaLang/julia#17228 (comment)

mbauman force-pushed the mb/setindex branch from 54cbc66 to 4fb5404 Compare August 3, 2016 22:01

andreasnoack closed this Aug 4, 2016
