
Fix ambiguity of broadcast!(identity, ::SparseVector, ::SparseVector) #19895

Merged: martinholters merged 1 commit into master from mh/fix_sparse_identity_broadcast on Jan 11, 2017

Conversation

@martinholters (Member)

Without this:

julia> a=spzeros(10); a.=a
ERROR: MethodError: broadcast!(::Base.#identity, ::SparseVector{Float64,Int64}, ::SparseVector{Float64,Int64}) is ambiguous. Candidates:
  broadcast!{Tf}(f::Tf, C::Union{SparseMatrixCSC,SparseVector}, A::Union{SparseMatrixCSC,SparseVector}) in Base.SparseArrays.HigherOrderFns at sparse/higherorderfns.jl:84
  broadcast!{Tf,N}(f::Tf, C::Union{SparseMatrixCSC,SparseVector}, A::Union{SparseMatrixCSC,SparseVector}, Bs::Vararg{Union{SparseMatrixCSC,SparseVector},N}) in Base.SparseArrays.HigherOrderFns at sparse/higherorderfns.jl:86
  broadcast!{T,S,N}(::Base.#identity, x::AbstractArray{T,N}, y::AbstractArray{S,N}) in Base.Broadcast at broadcast.jl:23

BTW, should test/ambiguous.jl have caught this?
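(For illustration, a minimal sketch of the ambiguity pattern with hypothetical stand-in names, not the actual Base methods: one method is more specific in the data argument, the other in the function argument, so neither wins, and a method more specific than both is needed.)

abstract type AbstractThing end
struct SparseThing <: AbstractThing end

g(f, x::SparseThing) = "specialized on the container"                # ~ the sparse broadcast! methods
g(::typeof(identity), x::AbstractThing) = "specialized on identity"  # ~ the AbstractArray specialization

# g(identity, SparseThing())   # MethodError: g(::typeof(identity), ::SparseThing) is ambiguous

g(::typeof(identity), x::SparseThing) = "disambiguating method"      # more specific than both
g(identity, SparseThing())                                           # now dispatches unambiguously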

@@ -148,7 +148,7 @@ ambiguityfunnel{Tf}(f::Tf, x, y) = _aresameshape(x, y) ? _noshapecheck_map(f, x,
 broadcast(::typeof(+), x::SparseVector, y::SparseVector) = ambiguityfunnel(+, x, y) # base/sparse/sparsevectors.jl:1266
 broadcast(::typeof(-), x::SparseVector, y::SparseVector) = ambiguityfunnel(-, x, y) # base/sparse/sparsevectors.jl:1266
 broadcast(::typeof(*), x::SparseVector, y::SparseVector) = ambiguityfunnel(*, x, y) # base/sparse/sparsevectors.jl:1266
-function broadcast!(::typeof(identity), C::SparseMatrixCSC, A::SparseMatrixCSC) # from #17623, loc?
+function broadcast!{T<:SparseVecOrMat}(::typeof(identity), C::T, A::T) # from #17623, base/broadcast.jl:26
@pabloferz (Contributor)

I think this won't handle the same cases as the previous one. Maybe

broadcast!{T<:SparseVecOrMat}(::typeof(identity), C::T, A::SparseVecOrMat)

or

broadcast!{T<:SparseVecOrMat,S<:SparseVecOrMat}(::typeof(identity), C::S, A::T)?

@martinholters (Member Author)

Good catch! Will update after the weekend.

@Sacha0 (Member)

The suggested forms might allow e.g. C <: SparseVector while A <: SparseMatrixCSC, no? Best!
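(A toy dispatch sketch of this concern, using hypothetical stand-in types rather than the actual sparse types: with independent type parameters, mixed container kinds are accepted.)

abstract type Sp end
struct SpVec <: Sp end   # stand-in for SparseVector
struct SpMat <: Sp end   # stand-in for SparseMatrixCSC

samekind(C::T, A::T) where {T<:Sp}       = "same kind only"    # like C::T, A::T
anykind(C::S, A::T) where {S<:Sp, T<:Sp} = "any combination"   # like C::S, A::T

anykind(SpVec(), SpMat())     # accepted: a vector destination with a matrix source
# samekind(SpVec(), SpMat())  # MethodError: no method matching samekind(::SpVec, ::SpMat)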

@kshyatt added the sparse (Sparse arrays) and broadcast (Applying a function over a collection) labels on Jan 6, 2017
@kshyatt requested a review from Sacha0 on January 6, 2017 17:30
@Sacha0 (Member) commented Jan 7, 2017

With something along the lines of @pabloferz's suggestion, these changes LGTM. I wonder whether we shouldn't simply remove the problematic AbstractArray specialization though; thoughts? Best!

@martinholters (Member Author)

> I wonder whether we shouldn't simply remove the problematic AbstractArray specialization though; thoughts?

As a .= b would be lowered to exactly this, I thought it might be worth keeping for maximum performance. But some quick benchmarking seems to indicate that broadcast!(x->x, a, b) is at least as fast as the (presently specialized) broadcast!(identity, a, b). You folks did a tremendous job squeezing performance out of broadcast! 👏
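(For reference, the lowering referred to here, as described in this thread for 0.6-era Julia; dense arrays are used purely for illustration.)

a = zeros(3); b = ones(3)

a .= b                       # lowered to broadcast!(identity, a, b)
a .= sin.(b)                 # lowered to broadcast!(sin, a, b), fusing the element-wise call
broadcast!(identity, a, b)   # the explicit form of a .= b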

Experimenting here, I noted something else:

julia> zeros(3,1) .= sin.(zeros(3))
3×1 Array{Float64,2}:
 0.0
 0.0
 0.0

julia> spzeros(3,1) .= sin.(spzeros(3))
ERROR: DimensionMismatch("argument shapes must match")
Stacktrace:
 [1] _checksameshape at ./sparse/higherorderfns.jl:124 [inlined]
 [2] map!(::Base.#sin, ::SparseMatrixCSC{Float64,Int64}, ::SparseVector{Float64,Int64}) at ./sparse/higherorderfns.jl:64
 [3] broadcast!(::Base.#sin, ::SparseMatrixCSC{Float64,Int64}, ::SparseVector{Float64,Int64}) at ./sparse/higherorderfns.jl:84

@Sacha0 Is that being fixed by something you're currently working on (I lost track, I admit), or should I open an issue so that we don't forget?

@martinholters force-pushed the mh/fix_sparse_identity_broadcast branch from 992de28 to b0ae001 on January 9, 2017 08:41
@martinholters (Member Author)

Updated the PR to just remove the specialization. I realized, however, that the specialized version for ::SparseMatrixCSC was faster than basic broadcast!. Should we keep it (but move it to a better-suited location in the file, as it has nothing to do with ambiguity anymore)?

@martinholters (Member Author)

@tkelman is that a 👍 to the updated PR or to keeping broadcast!(::typeof(identity), C::SparseMatrixCSC, A::SparseMatrixCSC) around?

@tkelman (Contributor) commented Jan 9, 2017

Both, mostly the latter for now, if it makes a noticeable difference.

@pabloferz (Contributor) commented Jan 9, 2017

I'd keep the AbstractArray specialized version and remove the sparse ones (both are basically doing the same thing), in case someone has a subtype of AbstractArray for which the specialization is also faster.

@Sacha0 (Member) commented Jan 9, 2017

@martinholters, re. #19895 (comment), you've found a bug --- thanks! :) The signature of a specialization in the sparse broadcast! code is too loose. Fix inbound. Best!

@Sacha0 (Member) commented Jan 9, 2017

> I'd keep the AbstractArray specialized version and remove the sparse ones (both are basically doing the same thing), in case someone has a subtype of AbstractArray for which the specialization is also faster.

IIRC the AbstractArray specialization induces ambiguities, necessitating the sparse specializations. So if I understand you correctly, retaining the AbstractArray specialization while removing the sparse specializations is not viable? Best!

@@ -148,10 +148,13 @@ ambiguityfunnel{Tf}(f::Tf, x, y) = _aresameshape(x, y) ? _noshapecheck_map(f, x,
 broadcast(::typeof(+), x::SparseVector, y::SparseVector) = ambiguityfunnel(+, x, y) # base/sparse/sparsevectors.jl:1266
 broadcast(::typeof(-), x::SparseVector, y::SparseVector) = ambiguityfunnel(-, x, y) # base/sparse/sparsevectors.jl:1266
 broadcast(::typeof(*), x::SparseVector, y::SparseVector) = ambiguityfunnel(*, x, y) # base/sparse/sparsevectors.jl:1266
-function broadcast!(::typeof(identity), C::SparseMatrixCSC, A::SparseMatrixCSC) # from #17623, loc?
+# specializations to improve performance for C .= A
+function broadcast!(::typeof(identity), C::SparseVector, A::SparseVector)
Member

Perhaps compact these?

broadcast!(::typeof(identity), C::SparseVector, A::SparseVector) = (_checksameshape(C,A); copy!(C, A))
broadcast!(::typeof(identity), C::SparseMatrixCSC, A::SparseMatrixCSC) = (_checksameshape(C,A); copy!(C, A))

@pabloferz (Contributor) Jan 9, 2017

Could you also add an @boundscheck so people can skip the shape check when they have already checked it beforehand or know in advance that the arrays are of the same size? E.g.

broadcast!(::typeof(identity), C::SparseVector, A::SparseVector) = (@boundscheck _checksameshape(C,A); copy!(C, A))
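(A minimal sketch of how that elision works, using a hypothetical helper name: the @boundscheck block is removed when the @inline'd helper is called from an @inbounds context.)

@inline function copy_bc!(C, A)
    @boundscheck size(C) == size(A) || throw(DimensionMismatch("argument shapes must match"))
    copyto!(C, A)   # copy! in the 0.6-era code discussed here
end

function caller_checked!(C, A)
    size(C) == size(A) || throw(DimensionMismatch("argument shapes must match"))
    @inbounds copy_bc!(C, A)   # shapes already verified, so the @boundscheck check can be elided
end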

@martinholters (Member Author)

Updated; also added @inline.

@martinholters (Member Author)

Right. The AbstractArray version of this specialization doesn't make much sense: for Arrays there is no advantage, and other broadcast implementations would be forced to provide their own specializations to avoid the ambiguity.

@pabloferz (Contributor)

Right, ambiguities. Carry on then.

@martinholters force-pushed the mh/fix_sparse_identity_broadcast branch from a5ae045 to 6f2f739 on January 10, 2017 08:10
@martinholters (Member Author)

Huh, the specializations for the sparse case actually don't quite match the unspecialized versions as they don't remove 0-entries:

julia> a=sprand(5, 1.0); a.nzval .= 0.0; a
Sparse vector of length 5 with 5 Float64 nonzero entries:
  [1]  =  0.0
  [2]  =  0.0
  [3]  =  0.0
  [4]  =  0.0
  [5]  =  0.0


julia> broadcast!(identity, copy(a), a)
Sparse vector of length 5 with 5 Float64 nonzero entries:
  [1]  =  0.0
  [2]  =  0.0
  [3]  =  0.0
  [4]  =  0.0
  [5]  =  0.0


julia> broadcast!(x->x, copy(a), a)
Sparse vector of length 5 with 0 Float64 nonzero entries:

Adding a dropzeros! kills any performance advantage of the specialized versions. Should we keep the specializations for performance's sake, at the cost of this inconsistency, or drop them?
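(For reference, a hedged sketch of what keeping the fast path while restoring consistency could look like, written as a standalone helper rather than the actual Base method; dropzeros! is the SparseArrays function mentioned above, while the helper name is made up.)

using SparseArrays

function identity_bcast!(C::SparseVector, A::SparseVector)
    size(C) == size(A) || throw(DimensionMismatch("argument shapes must match"))
    copyto!(C, A)   # the fast raw copy retains any stored zeros from A ...
    dropzeros!(C)   # ... so drop them, matching the result of broadcast!(x->x, C, A)
end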

@martinholters (Member Author)

My present tendency is to remove the specialized versions, i.e. go back to just b0ae001. If we want to address the performance of broadcast over possibly stored zeros, we had better be consistent and do that everywhere instead of just for identity. Opinions?

@Sacha0 (Member) commented Jan 10, 2017

> My present tendency is to remove the specialized versions

Favoring consistency, SGTM.

> If we want to address the performance of broadcast over possibly stored zeros, we had better be consistent and do that everywhere instead of just for identity.

If you are suggesting changing the zero-storing behavior of generic sparse map[!]/broadcast[!], please see discussion in #19239, #19371, #19372, and downstream PRs. Best!

@martinholters force-pushed the mh/fix_sparse_identity_broadcast branch from 6f2f739 to b0ae001 on January 11, 2017 06:57
@martinholters (Member Author)

OK, went back to just removing the specializations. Any objections?

@pabloferz (Contributor)

I'm fine with this. Better general performance seems like a more appropriate path to follow here.

@martinholters (Member Author)

> Better general performance seems like a more appropriate path to follow here.

I just don't think that, if we want both a .= b and a .= (x->x).(b) to be zero-expunging, they can ever be as fast as copy!(a, b).

@pabloferz (Contributor)

I also don't think they can be as fast as copy!(a, b). I meant to imply that it is better to improve performance (if possible) while preserving consistency (zero-expunging in this case).

@martinholters (Member Author)

Looks like we have reached agreement to just remove the specializations and CI is green. Merge?

@martinholters merged commit 93ecb41 into master on Jan 11, 2017
@martinholters deleted the mh/fix_sparse_identity_broadcast branch on January 11, 2017 13:35
@stevengj (Member)

How does this affect performance for x::Array .= y::Array? copy! in that case can call memcpy.

@Sacha0 (Member) commented Jan 12, 2017

On reflection, the former specializations were incorrect, failing to handle cases where the source and destination shapes differ. Best! (Edit: tests added for the sparse case in #19986.)

@stevengj (Member)

Might still be nice to do memcpy for Arrays

@Sacha0 (Member) commented Jan 13, 2017

> Might still be nice to do memcpy for Arrays

To clarify, the specialization that Arrays formerly hit (for AbstractArray) was likewise incorrect?

@martinholters (Member Author)

To substantiate the incorrectness:

# this is before this PR, still including the specialization now removed
julia> x=zeros(5,1); y=ones(5,5);

julia> y .= x # equivalent to y .= identity.(x), only does a partial copy!
5×5 Array{Float64,2}:
 0.0  1.0  1.0  1.0  1.0
 0.0  1.0  1.0  1.0  1.0
 0.0  1.0  1.0  1.0  1.0
 0.0  1.0  1.0  1.0  1.0
 0.0  1.0  1.0  1.0  1.0

julia> y .= (a->a).(x) # does correct broadcast with anonymous identity function
5×5 Array{Float64,2}:
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0

However, I have to withdraw my previous statement about the generic broadcast! not being slower. I'm not sure what I benchmarked there. I now measure a penalty of 10% to 50% for matrices with 64-bit element types, depending on matrix size, but much higher ones (more than 2000% in some cases!) for 8- and 16-bit element types. I guess this is where memcpy can make all the difference.

Should we add something like

function broadcast!{T,S,N}(::typeof(identity), x::Array{T,N}, y::Array{S,N})
    if size(x) == size(y)
        copy!(x, y)
    else
        broadcast_c!(identity, Array, x, y)
    end
end

?
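(For context, a rough sketch of the kind of measurement behind the percentages quoted above; BenchmarkTools is assumed here and is not part of this thread, and the exact figures will vary with machine, size, and element type.)

using BenchmarkTools

x = zeros(UInt8, 10_000); y = rand(UInt8, 10_000)

@btime broadcast!(identity, $x, $y)   # generic element-wise broadcast! loop
@btime copyto!($x, $y)                # memcpy-backed copy for same-size Arrays (copy! in 0.6)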

@Sacha0 (Member) commented Jan 13, 2017

> However, I have to withdraw my previous statement about the generic broadcast! not being slower. I'm not sure what I benchmarked there. I now measure a penalty of 10% to 50% for matrices with 64-bit element types, depending on matrix size, but much higher ones (more than 2000% in some cases!) for 8- and 16-bit element types.

Interesting! I wonder whether some measure of unrolling might be useful? Best!

@vchuravy (Member)

> However, I have to withdraw my previous statement about the generic broadcast! not being slower. I'm not sure what I benchmarked there. I now measure a penalty of 10% to 50% for matrices with 64-bit element types, depending on matrix size, but much higher ones (more than 2000% in some cases!) for 8- and 16-bit element types. I guess this is where memcpy can make all the difference.

It would be good to have these cases in BaseBenchmarks so that nanosoldier can check them.

@stevengj (Member)

Can someone open an issue to restore the use of memcpy for broadcast(::typeof(identity), ...), at least for Array, unless there is already a PR for this?

@Sacha0 (Member) commented Jan 25, 2017

> Can someone open an issue to restore the use of memcpy for broadcast(::typeof(identity), ...), at least for Array, unless there is already a PR for this?

#20227
