Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spones documentation is incorrect #22381

Closed
dgleich opened this issue Jun 15, 2017 · 3 comments · Fixed by #25037
Closed

spones documentation is incorrect #22381

dgleich opened this issue Jun 15, 2017 · 3 comments · Fixed by #25037
Labels
sparse Sparse arrays

Comments

@dgleich
Copy link

dgleich commented Jun 15, 2017

The documentation for the spones function describes the Matlab behavior for this function, which replaces each non-zero entry by a value of 1.0. However, the implementation appears to use julia's notion of "stored entries" instead.

julia> versioninfo()
Julia Version 0.6.0-rc3.0
Commit ad290e9 (2017-06-07 11:53 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

help?> spones
search: spones unsafe_pointer_to_objref

  spones(S)

  Create a sparse array with the same structure as that of S, but with every nonzero element having the value 1.0.

  julia> A = sparse([1,2,3,4],[2,4,3,1],[5.,4.,3.,2.])
  4×4 SparseMatrixCSC{Float64,Int64} with 4 stored entries:
    [4, 1]  =  2.0
    [1, 2]  =  5.0
    [3, 3]  =  3.0
    [2, 4]  =  4.0
  
  julia> spones(A)
  4×4 SparseMatrixCSC{Float64,Int64} with 4 stored entries:
    [4, 1]  =  1.0
    [1, 2]  =  1.0
    [3, 3]  =  1.0
    [2, 4]  =  1.0

  Note the difference from speye.

julia> A = sparse([1,2],[1,2],[true,false])
2×2 SparseMatrixCSC{Bool,Int64} with 2 stored entries:
  [1, 1]  =  true
  [2, 2]  =  false

julia> B = spones(A)
2×2 SparseMatrixCSC{Bool,Int64} with 2 stored entries:
  [1, 1]  =  true
  [2, 2]  =  true

julia> A = sparse([1,2],[1,2],[1.0,0.0])
2×2 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
  [1, 1]  =  1.0
  [2, 2]  =  0.0

julia> C = spones(A)
2×2 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
  [1, 1]  =  1.0
  [2, 2]  =  1.0

Expected behavior given the documentation

julia> B = spones(A)
2×2 SparseMatrixCSC{Float64,Int64} with 1 stored entries:
  [1, 1]  =  1.0

julia> C = spones(A)
2×2 SparseMatrixCSC{Float64,Int64} with 1 stored entries:
  [1, 1]  =  1.0

So note that there are two issues with the current documentation. First, the value is one(T) not 1.0. Second, the operation proceeds over the stored entries, not the non-zero stored entries.

Ideal handling.

  • Change the operation of spones to do the same thing as Matlab's function. In which case, this would operate according to the documentation. (That is, it would satisfy the relationship countnz(spones(A)) == countnz(A))
  • Introduce a new function for the current behavior. spsones ?

Minimal handling.

  • Adjust the documentation for the current function. Would suggest including an example describing the difference from Matlab's behavior (as above).

I don't like the minimal handling because the semantics of Matlab's spones and Julia's spones are radically different in unexpected and bug-inducing fashions because I have no idea which Julia operations introduce new structural non-zeros. (Let's just say I spent a while working with the wrong distribution of random matrices.)

@nalimilan nalimilan added the sparse Sparse arrays label Jun 15, 2017
@andreasnoack
Copy link
Member

andreasnoack commented Jun 15, 2017

I agree that the documentation is not correct but I think both behaviors could be defended. Notice that we satisfy nnz(spones(A)) == nnz(A). Can anybody think of a situation where the current behavior is useful?

I don't like the minimal handling because the semantics of Matlab's spones and Julia's spones are radically different in unexpected and bug-inducing fashions because I have no idea which Julia operations introduce new structural non-zeros.

Our sparse matrices are already different from Matlab's. In Julia, the user has to consider the possibility of stored zeros. However, the documentation should be precise about this.

@Sacha0
Copy link
Member

Sacha0 commented Jun 15, 2017

The existing behavior seems consistent with the long-term direction of evolution of sparse arrays' overall behavior (being concerned with stored entries rather than numerical zeros). Adjusting the documentation as you suggest strikes me as the best and simplest solution, perhaps augmented with a pointer to dropzeros[!] for those concerned with numerical zeros rather than stored entries. Thoughts? :) Best!

@dgleich
Copy link
Author

dgleich commented Jun 19, 2017

Here's my slightly nuanced take.

I appreciate where you are coming from with the idea of stored entries
and the names. I like the direction of stored entries and would advocate
for functions to change names to reflect that. (e.g. nnz -> nse).

That said, that's a bigger task. In terms of spones, I still think
the current function should be deprecated and highly discouraged given
the direction the sparse matrix code is taking with these stored
entries. This is for two reasons.

  1. It is inconsistently named with the rest of the sparse functions.

     eye -> speye
     zeros -> spzeros
     ones -> spones # this does something very different!
    

    In this case, spfill would be a better and more descriptive name.

  2. It's behavior is harmful and likely unpredictable. This is because
    there are zero guarantees on the stored entries of the matrix
    and how operations change them. So having a function that
    manipulates these is extremely hard for me to see used.

    Put another way, I can't think of a scenario when I would use the
    current function without first dropzeros.

    • If I'm getting input from a user, I can't trust that they haven't run
      operations to introduce stored entries elsewhere.

    • If I'm running my own code, I can't trust that Julia won't have changed
      in subsequent versions to preserve the stored entry structure I need.

    So I really can't think of a time when this would be safe to use.
    (Except in scenarios where I just generated the sparse matrix, but then
    I'd probably still be better off re-calling "sparse".)

So again, I think changing the name to spfill and adjusting
the documentation. In the documentation, I'd definitely mention that
spfill!(dropzeros(A)...) replaces all the non-zero entries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sparse Sparse arrays
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants