Commit

Add documentation for sparse ops (apache#148)
*  draft doc for sparse op

* add more stype doc for operators

* add doc for cast_storage

* see also cast_storage. remove base sparse ndarray. fix aux_types comment

* grammar / spelling fix
eric-haibin-lin authored Aug 11, 2017
1 parent a44afed commit ceca9b6
Showing 9 changed files with 73 additions and 15 deletions.
5 changes: 0 additions & 5 deletions docs/api/python/ndarray.md
@@ -547,11 +547,6 @@ The `contrib.ndarray` module contains many useful experimental APIs for new feat
:members:
:special-members:
.. autoclass:: mxnet.ndarray.BaseSparseNDArray
:members:
:special-members:
:exclude-members: __weakref__
.. autoclass:: mxnet.ndarray.CSRNDArray
:members:
:special-members:
4 changes: 4 additions & 0 deletions python/mxnet/ndarray/ndarray.py
@@ -1109,6 +1109,10 @@ def backward(self, out_grad=None, retain_graph=False, train_mode=True):
def tostype(self, stype):
"""Return a copy of the array with chosen storage type.
See Also
----------
:meth:`mxnet.ndarray.cast_storage`.
Returns
-------
NDArray, CSRNDArray or RowSparseNDArray
4 changes: 2 additions & 2 deletions python/mxnet/ndarray/sparse_ndarray.py
@@ -846,8 +846,8 @@ def _zeros_sparse_ndarray(stype, shape, ctx=None, dtype=None, aux_types=None, **
dtype : str or numpy.dtype, optional
An optional value type (default is `float32`)
aux_types: list of numpy.dtype, optional
An optional type for the aux data for BaseSparseNDArray (default values depends
on the storage type)
An optional list of types of the aux data for RowSparseNDArray or CSRNDArray
(default values depend on the storage type)
Returns
-------
14 changes: 7 additions & 7 deletions python/mxnet/ndarray/utils.py
@@ -37,10 +37,10 @@ def zeros(shape, ctx=None, dtype=None, stype=None, aux_types=None, **kwargs):
dtype : str or numpy.dtype, optional
An optional value type (default is `float32`)
stype: string, optional
The storage type of the empty array, such as 'row_sparse', 'csr', etc
The storage type of the empty array, such as 'row_sparse', 'csr', etc.
aux_types: list of numpy.dtype, optional
An optional type for the aux data for the BaseSparseNDArray (default values
depends on the storage type)
An optional list of types of the aux data for RowSparseNDArray or CSRNDArray
(default values depend on the storage type)
Returns
-------
Expand Down Expand Up @@ -73,8 +73,8 @@ def empty(shape, ctx=None, dtype=None, stype=None, aux_types=None):
stype : str, optional
An optional storage type (default is `default`).
aux_types: list of numpy.dtype, optional
An optional type for the aux data for the BaseSparseNDArray (default values depends
on the storage type)
An optional list of types of the aux data for RowSparseNDArray or CSRNDArray
(default values depend on the storage type)
Returns
-------
Expand Down Expand Up @@ -111,8 +111,8 @@ def array(source_array, ctx=None, dtype=None, aux_types=None):
The data type of the output array. The default dtype is ``source_array.dtype``
if `source_array` is an `NDArray`, `float32` otherwise.
aux_types: list of numpy.dtype, optional
An optional type for the aux data for the BaseSparseNDArray (default values
depends on the storage type)
An optional list of types of the aux data for RowSparseNDArray or CSRNDArray
(default values depend on the storage type)
Returns
-------
33 changes: 33 additions & 0 deletions src/operator/tensor/cast_storage.cc
@@ -32,6 +32,39 @@ namespace op {
DMLC_REGISTER_PARAMETER(CastStorageParam);
NNVM_REGISTER_OP(cast_storage)
.describe(R"code(Casts tensor storage type to the new type.
When an NDArray with default storage type is cast to csr or row_sparse storage,
the result is compact, which means:
- for csr, zero values will not be retained
- for row_sparse, row slices of all zeros will not be retained
The storage type of ``cast_storage`` output depends on stype parameter:
- cast_storage(csr, 'default') = default
- cast_storage(row_sparse, 'default') = default
- cast_storage(default, 'csr') = csr
- cast_storage(default, 'row_sparse') = row_sparse
Example::
dense = [[ 0., 1., 0.],
[ 2., 0., 3.],
[ 0., 0., 0.],
[ 0., 0., 0.]]
# cast to row_sparse storage type
rsp = cast_storage(dense, 'row_sparse')
rsp.indices = [0, 1]
rsp.values = [[ 0., 1., 0.],
[ 2., 0., 3.]]
# cast to csr storage type
csr = cast_storage(dense, 'csr')
csr.indices = [1, 0, 2]
csr.values = [ 1., 2., 3.]
csr.indptr = [0, 1, 3, 3, 3]
)code" ADD_FILELINE)
.set_num_inputs(1)
.set_num_outputs(1)
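Note on the docstring above: the compaction rule for csr and row_sparse can be checked by hand with a small pure-Python sketch. This is an illustration only and not MXNet's implementation; the helper names `dense_to_row_sparse` and `dense_to_csr` are hypothetical and do not exist in the library.

```python
# Illustrative sketch only: hypothetical helpers, not part of MXNet.

def dense_to_row_sparse(dense):
    """Keep only the rows that contain at least one non-zero value."""
    indices = [i for i, row in enumerate(dense) if any(v != 0 for v in row)]
    values = [list(dense[i]) for i in indices]
    return indices, values


def dense_to_csr(dense):
    """Keep only non-zero values, with their column indices and row pointers."""
    values, indices, indptr = [], [], [0]
    for row in dense:
        for col, v in enumerate(row):
            if v != 0:
                values.append(v)
                indices.append(col)
        # indptr[i + 1] marks where row i ends in `values`
        indptr.append(len(values))
    return values, indices, indptr


dense = [[0., 1., 0.],
         [2., 0., 3.],
         [0., 0., 0.],
         [0., 0., 0.]]

rsp_indices, rsp_values = dense_to_row_sparse(dense)
csr_values, csr_indices, csr_indptr = dense_to_csr(dense)
```

For the 4x3 array in the docstring example, the row_sparse form keeps rows 0 and 1 whole (including their zeros), while the csr form keeps only the three non-zero entries.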
8 changes: 8 additions & 0 deletions src/operator/tensor/dot.cc
@@ -49,6 +49,14 @@ NNVM_REGISTER_OP(dot)
y = reshape([7,6,5,4,3,2,1,0], shape=(2,2,2))
dot(x,y)[0,0,1,1] = 0
sum(x[0,0,:]*y[:,1,1]) = 0
The storage type of ``dot`` output depends on storage types of inputs and transpose options:
- dot(csr, default) = default
- dot(csr.T, default) = row_sparse
- dot(csr, row_sparse) = default
- otherwise, ``dot`` generates output with default storage
)doc" ADD_FILELINE)
.set_num_inputs(2)
.set_num_outputs(1)
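Note on the storage rules above: one way to see why `dot(csr.T, default)` can produce row_sparse output is that output row `c` is non-zero only if column `c` of the csr input holds a stored value. The pure-Python sketch below illustrates that intuition; it is an assumption-laden illustration, not MXNet's kernel, and `csr_transpose_dot_dense` is a hypothetical name.

```python
# Illustrative sketch only: a hypothetical helper, not part of MXNet.

def csr_transpose_dot_dense(values, indices, indptr, dense_rhs):
    """Compute csr.T . dense_rhs and return (row_indices, row_values)."""
    width = len(dense_rhs[0])
    rows = {}
    for r in range(len(indptr) - 1):
        for k in range(indptr[r], indptr[r + 1]):
            c = indices[k]  # column of the csr entry becomes the output row
            acc = rows.setdefault(c, [0.0] * width)
            for j in range(width):
                acc[j] += values[k] * dense_rhs[r][j]
    out_indices = sorted(rows)
    return out_indices, [rows[i] for i in out_indices]


# csr form of [[0, 1, 0, 0],
#              [2, 0, 0, 3]]  (column 2 is empty)
values = [1., 2., 3.]
indices = [1, 0, 3]
indptr = [0, 1, 3]
rhs = [[1., 2.],
       [3., 4.]]

out_indices, out_values = csr_transpose_dot_dense(values, indices, indptr, rhs)
```

Column 2 of the csr input has no stored entries, so row 2 never appears in the output, which is what makes a row_sparse result a natural fit.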
9 changes: 8 additions & 1 deletion src/operator/tensor/elemwise_binary_op_basic.cc
@@ -28,7 +28,14 @@ namespace mxnet {
namespace op {
MXNET_OPERATOR_REGISTER_BINARY(elemwise_add)
.add_alias("_add").add_alias("_plus").add_alias("_Plus")
.describe("Adds arguments element-wise.")
.describe(R"code(Adds arguments element-wise.
The storage type of ``elemwise_add`` output depends on storage types of inputs
- elemwise_add(row_sparse, row_sparse) = row_sparse
- otherwise, ``elemwise_add`` generates output with default storage
)code")
.set_attr<FCompute>("FCompute<cpu>", BinaryCompute<cpu, mshadow::op::plus>)
.set_attr<nnvm::FGradient>("FGradient", CloneGradient{"_backward_add"})
.set_attr<FComputeEx>("FComputeEx<cpu>", BinaryComputeEx<cpu, mshadow::op::plus>)
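Note on the `elemwise_add` rule above: adding two row_sparse arrays can stay row_sparse because the result only needs the union of the stored rows. A minimal pure-Python sketch of that idea follows; `add_row_sparse` is a hypothetical helper used for illustration and is not how MXNet implements the operator.

```python
# Illustrative sketch only: a hypothetical helper, not part of MXNet.

def add_row_sparse(a_indices, a_values, b_indices, b_values):
    """Add two row_sparse arrays given as (indices, values) pairs."""
    rows = {}
    pairs = list(zip(a_indices, a_values)) + list(zip(b_indices, b_values))
    for idx, row in pairs:
        if idx in rows:
            # Row stored in both inputs: add element-wise
            rows[idx] = [x + y for x, y in zip(rows[idx], row)]
        else:
            rows[idx] = list(row)
    out_indices = sorted(rows)
    return out_indices, [rows[i] for i in out_indices]


sum_indices, sum_values = add_row_sparse(
    [0, 2], [[1., 2.], [3., 4.]],
    [2, 3], [[10., 20.], [5., 6.]],
)
```

Rows that neither input stores never appear in the result, so no dense intermediate is needed.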
6 changes: 6 additions & 0 deletions src/operator/tensor/elemwise_sum.cc
@@ -110,6 +110,12 @@ NNVM_REGISTER_OP(add_n)
add\_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n
``add_n`` is potentially more efficient than calling ``add`` `n` times.
The storage type of ``add_n`` output depends on storage types of inputs
- add_n(row_sparse, row_sparse, ..) = row_sparse
- otherwise, ``add_n`` generates output with default storage
)doc" ADD_FILELINE)
.set_attr_parser(ParamParser<ElementWiseSumParam>)
.set_num_inputs([](const nnvm::NodeAttrs& attrs) {
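Note on the `add_n` rule above: read literally, the output is row_sparse only when every input is row_sparse, and default otherwise. The sketch below encodes that reading as a small lookup; `add_n_output_stype` is a hypothetical function for illustration, and the real storage-type inference in MXNet may handle more cases.

```python
# Illustrative sketch only: a hypothetical helper, not part of MXNet.

def add_n_output_stype(*input_stypes):
    """Return the output storage type implied by the docstring's two rules."""
    if input_stypes and all(s == 'row_sparse' for s in input_stypes):
        return 'row_sparse'
    return 'default'


all_sparse = add_n_output_stype('row_sparse', 'row_sparse', 'row_sparse')
mixed = add_n_output_stype('row_sparse', 'default')
```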
5 changes: 5 additions & 0 deletions src/operator/tensor/sparse_retain.cc
@@ -41,6 +41,11 @@ Example::
rsp_out.values = [[1, 2], [5, 6]]
rsp_out.indices = [0, 3]
The storage type of ``sparse_retain`` output depends on storage types of inputs
- sparse_retain(row_sparse, default) = row_sparse
- otherwise, ``sparse_retain`` is not supported
)code" ADD_FILELINE)
.set_num_inputs(2)
.set_num_outputs(1)
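Note on `sparse_retain`: the operator keeps only the stored rows whose indices are requested, so the output stays row_sparse. The pure-Python sketch below shows that behavior; `retain_rows` is a hypothetical helper, and the input rows are an assumption chosen so that the result matches the `rsp_out` values visible in the hunk.

```python
# Illustrative sketch only: a hypothetical helper, not part of MXNet.

def retain_rows(indices, values, to_retain):
    """Keep only the stored rows whose index appears in `to_retain`."""
    keep = set(to_retain)
    kept = [(i, row) for i, row in zip(indices, values) if i in keep]
    return [i for i, _ in kept], [list(row) for _, row in kept]


# Assumed input: a row_sparse array storing rows 0, 1 and 3
in_indices = [0, 1, 3]
in_values = [[1, 2], [3, 4], [5, 6]]

out_indices, out_values = retain_rows(in_indices, in_values, [0, 3])
```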