
Distributed Compressed Sparse Column Matrix #1377

Merged: 26 commits merged from sparse_csc into main on Jun 7, 2024

Conversation

@Mystic-Slice (Collaborator) commented Feb 20, 2024

Due Diligence

  • General:
  • Implementation:
    • unit tests: all split configurations tested
    • unit tests: multiple dtypes tested
    • documentation updated where needed

Description

A new sparse format, Compressed Sparse Column (CSC), similar to the DCSR format implemented in #1028.

Most of the code from the previous implementation is shared between the two formats.

Steps:

  • Implement a common base class
  • Factory methods
  • Sparse and Dense formats interconversion
  • Arithmetic Operations
  • Tests

NOTE: In the future, when arithmetic operations are supported by torch for sparse_csc, they can be enabled for DCSC_matrix by restoring the code removed in commit 6d727af. The binary operator has already been modified to be generic for both DCSR_matrix and DCSC_matrix.
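For orientation, a minimal usage sketch assembled from the factory call and attributes that appear in the tests quoted later in this thread; the torch input values are illustrative:

```python
import torch
import heat as ht

# [[1, 0, 2], [0, 3, 0]] in torch's CSC layout
t = torch.sparse_csc_tensor(
    torch.tensor([0, 1, 2, 3]),     # ccol_indices (column pointers)
    torch.tensor([0, 1, 0]),        # row_indices
    torch.tensor([1.0, 3.0, 2.0]),  # values, read column by column
    size=(2, 3),
)

dcsc = ht.sparse.sparse_csc_matrix(t)  # factory method added in this PR
print(dcsc.dtype)                      # ht.float32
as_uint8 = dcsc.astype(ht.uint8)       # dtype cast, as exercised in test_astype
```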

Type of change

New feature


@Mystic-Slice (Collaborator, Author)

I just discovered something that I should have seen a long time ago.

torch only officially supports the matmul operation for its CSR and CSC formats (https://pytorch.org/docs/stable/sparse.html#csr-tensor-operations). The elementwise operations (add and mul) are implemented for CSR (although not mentioned in the docs) but not for CSC.

It is still possible for us to implement matmul for the ht.sparse module. I just wanted to know how we should proceed.
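A small torch-only probe of that support matrix (illustrative; the exact behavior varies with the torch version):

```python
import torch

# The same 2x3 matrix [[1, 0, 2], [0, 3, 0]] in both compressed layouts
csr = torch.sparse_csr_tensor(
    torch.tensor([0, 2, 3]),        # crow_indices
    torch.tensor([0, 2, 1]),        # col_indices
    torch.tensor([1.0, 2.0, 3.0]),  # values
    size=(2, 3),
)
csc = torch.sparse_csc_tensor(
    torch.tensor([0, 1, 2, 3]),     # ccol_indices
    torch.tensor([0, 1, 0]),        # row_indices
    torch.tensor([1.0, 3.0, 2.0]),  # values
    size=(2, 3),
)

dense = torch.ones(3, 2)
print(csr @ dense)   # matmul: the officially supported operation
print(csc @ dense)

print(csr + csr)     # elementwise add: implemented for CSR in practice
try:
    csc + csc        # but not for CSC at the time of this PR
except RuntimeError as err:
    print("CSC add unsupported:", err)
```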

@mrfh92 marked this pull request as ready for review on Mar 7, 2024, 09:23

codecov bot commented Mar 7, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 9 lines in your changes missing coverage. Please review.

Project coverage is 91.74%. Comparing base (ee0d72a) to head (90dc1d6).
Report is 325 commits behind head on main.

Files                         Patch %   Lines
heat/sparse/_operations.py    78.94%    4 Missing ⚠️
heat/sparse/dcsx_matrix.py    93.02%    3 Missing ⚠️
heat/sparse/factories.py      96.96%    1 Missing ⚠️
heat/sparse/manipulations.py  95.00%    1 Missing ⚠️
@@            Coverage Diff             @@
##             main    #1377      +/-   ##
==========================================
- Coverage   91.76%   91.74%   -0.03%     
==========================================
  Files          80       80              
  Lines       11640    11683      +43     
==========================================
+ Hits        10682    10719      +37     
- Misses        958      964       +6     
Flag Coverage Δ
unit 91.74% <92.85%> (-0.03%) ⬇️


@ClaudiaComito (Contributor)

> I just discovered something that I should have seen a long time ago.
>
> torch only officially supports the matmul operation for its CSR and CSC formats (https://pytorch.org/docs/stable/sparse.html#csr-tensor-operations). The elementwise operations (add and mul) are implemented for CSR (although not mentioned in the docs) but not for CSC.
>
> It is still possible for us to implement matmul for the ht.sparse module. I just wanted to know how we should proceed.

@Mystic-Slice thank you so much for this, and apologies for the delay.

It's OK not to provide the element-wise operations for CSC if torch doesn't implement them yet.

The basic operation we want to provide with this class is matrix multiplication CSR @ CSC, where the CSR operand is distributed (split=0) and the CSC operand is not distributed. This operation requires no MPI communication and returns a distributed CSR. It would be a great starting point, and it doesn't need to be part of this PR.
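A minimal single-process sketch of why no communication is needed, with two row slices standing in for two ranks (plain torch; all names illustrative):

```python
import torch

A = torch.randn(6, 4).relu().to_sparse_csr()  # stands in for the CSR operand (split=0)
B = torch.randn(4, 5).relu().to_sparse_csc()  # stands in for the replicated CSC operand

blocks = []
for lo, hi in [(0, 3), (3, 6)]:               # each slice mimics one rank's row slab
    local = A.to_dense()[lo:hi].to_sparse_csr()
    blocks.append(local @ B.to_dense())       # purely local product, no MPI traffic

# Concatenating the per-rank row blocks reproduces the full product,
# so each rank already holds its slab of the distributed CSR result.
assert torch.allclose(torch.cat(blocks), A.to_dense() @ B.to_dense())
```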

Is this PR, the CSC class, ready for review, @Mystic-Slice?

Thanks again, we highly appreciate your contribution!


@Mystic-Slice (Collaborator, Author)

Hi @ClaudiaComito,
Sorry for the delay from my side too.
The implementation of DCSC_matrix is complete; I just need to write a few tests.
I will complete them ASAP.

If tests aren't a blocker for review, we can proceed with that.


@ClaudiaComito (Contributor)

> Hi @ClaudiaComito, sorry for the delay from my side too. The implementation of DCSC_matrix is complete; I just need to write a few tests. I will complete them ASAP.
>
> If tests aren't a blocker for review, we can proceed with that.

Hi @Mystic-Slice, thanks. The tests make the review much easier, so please go ahead and sketch them first. Thanks a lot!


@ClaudiaComito (Contributor) commented May 27, 2024

@Mystic-Slice on the CUDA runner, with PyTorch 2.2, we get the following error:

___________ ERROR collecting heat/sparse/tests/test_manipulations.py ___________
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/runner.py:341: in from_call
    result: Optional[TResult] = func()
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/runner.py:372: in <lambda>
    call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/python.py:531: in collect
    self._inject_setup_module_fixture()
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/python.py:545: in _inject_setup_module_fixture
    self.obj, ("setUpModule", "setup_module")
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/python.py:310: in obj
    self._obj = obj = self._getobj()
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/python.py:528: in _getobj
    return self._importtestmodule()
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/python.py:617: in _importtestmodule
    mod = import_path(self.path, mode=importmode, root=self.config.rootpath)
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/pathlib.py:565: in import_path
    importlib.import_module(module_name)
/opt/conda/envs/heat_dev/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:992: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:883: in exec_module
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
heat/sparse/tests/__init__.py:1: in <module>
    from .test_arithmetics_csr import *
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/_pytest/assertion/rewrite.py:178: in exec_module
    exec(co, module.__dict__)
heat/sparse/tests/test_arithmetics_csr.py:4: in <module>
    import heat as ht
heat/__init__.py:20: in <module>
    from . import utils
heat/utils/__init__.py:5: in <module>
    from . import data
heat/utils/data/__init__.py:7: in <module>
    from . import mnist
heat/utils/data/mnist.py:7: in <module>
    from torchvision import datasets
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/torchvision/__init__.py:6: in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/torchvision/_meta_registrations.py:26: in <module>
    def meta_roi_align(input, rois, spatial_scale, pooled_height, pooled_width, sampling_ratio, aligned):
/opt/conda/envs/heat_dev/lib/python3.10/site-packages/torchvision/_meta_registrations.py:18: in wrapper
    if torchvision.extension._has_ops():
E   AttributeError: partially initialized module 'torchvision' has no attribute 'extension' (most likely due to a circular import)

Many test files fail with the same error, but we can't reproduce it in other PRs. Can you check it out?
The ROCm runner runs fine, by the way.

@Mystic-Slice (Collaborator, Author)

I was able to reproduce this error on my local machine.
It occurs when the torch and torchvision versions do not match.
In my case specifically, torch==2.2 with torchvision==0.18.0 (which corresponds to torch==2.3).
When I made the versions match, everything worked fine.

I believe the torch version on the CUDA runner is 2.2 and not 2.3, but I am not sure.

Is there a way to check that?
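One quick way to confirm the pairing on the runner (standard version attributes, nothing runner-specific):

```python
import torch
import torchvision

# torchvision 0.18.x is built against torch 2.3.x; any other pairing can
# trigger the circular-import error shown above.
print(torch.__version__, torchvision.__version__)
```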

@ClaudiaComito (Contributor) commented Jun 3, 2024

> I was able to reproduce this error on my local machine. It occurs when the torch and torchvision versions do not match. In my case specifically, torch==2.2 with torchvision==0.18.0 (which corresponds to torch==2.3). When I made the versions match, everything worked fine.
>
> I believe the torch version on the CUDA runner is 2.2 and not 2.3, but I am not sure.
>
> Is there a way to check that?

~~@mtar can you look into this? thanks~~ never mind


@ClaudiaComito (Contributor)

@Mystic-Slice this looks amazing. Will you add your name to the CITATION.cff file, under the # release contributors - add as needed header? Thanks!

@Mystic-Slice (Collaborator, Author)

Done!


@ClaudiaComito previously approved these changes on Jun 6, 2024 and left a comment:

Amazing work @Mystic-Slice ! 👏🏼

@ClaudiaComito (Contributor)

@Mystic-Slice a single test fails with torch<2, see below. From my side it's absolutely fine to skip the test when torch<2:

_________________________ TestDCSC_matrix.test_astype __________________________

self = <heat.sparse.tests.test_dcscmatrix.TestDCSC_matrix testMethod=test_astype>

    def test_astype(self):
        heat_sparse_csc = ht.sparse.sparse_csc_matrix(self.ref_torch_sparse_csc)
    
        # check starting invariant
        self.assertEqual(heat_sparse_csc.dtype, ht.float32)
    
        # check the copy case for uint8
>       as_uint8 = heat_sparse_csc.astype(ht.uint8)

heat/sparse/tests/test_dcscmatrix.py:273: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = , dtype = <class 'heat.core.types.uint8'>, copy = True

    def astype(self, dtype, copy=True) -> __DCSX_matrix:
        """
        Returns a casted version of this matrix.
        Casted matrix is a new matrix of the same shape but with given type of this matrix. If copy is ``True``, the
        same matrix is returned instead.
    
        Parameters
        ----------
        dtype : datatype
            HeAT type to which the matrix is cast
        copy : bool, optional
            By default the operation returns a copy of this matrix. If copy is set to ``False`` the cast is performed
            in-place and this matrix is returned
        """
        dtype = canonical_heat_type(dtype)
>       casted_matrix = self._array.type(dtype.torch_type())
E       RuntimeError: torch.copy_: only sparse compressed tensors with the same number of specified elements are supported.

heat/sparse/dcsx_matrix.py:309: RuntimeError

@Mystic-Slice (Collaborator, Author) commented Jun 7, 2024

@ClaudiaComito
I referred to https://discuss.pytorch.org/t/why-does-pytorch-needs-the-three-functions-to-type-and-type-as/20164.
It seems that .type() doesn't support conversion for many tensor types; I should have used the .to() method instead.
Made the change.
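Concretely, the change swaps the casting call shown in the traceback above (a sketch of the diff, not the full method):

```python
# before: fails for sparse compressed tensors
casted_matrix = self._array.type(dtype.torch_type())

# after: cast via Tensor.to(), which handles more tensor types
casted_matrix = self._array.to(dtype.torch_type())
```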


@ClaudiaComito previously approved these changes on Jun 7, 2024 and left a comment:

Thanks @Mystic-Slice ! (take 2)

@Mystic-Slice (Collaborator, Author)

Seems like it didn't work.
I will just skip the test for torch < 2.0.
Sorry for the delay; I thought the previous fix would work.
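One standard way to express that guard (a sketch; the decorator actually used in the PR isn't shown in this thread):

```python
import unittest
import torch

TORCH_MAJOR = int(torch.__version__.split(".")[0])

class TestDCSC_matrix(unittest.TestCase):
    @unittest.skipIf(
        TORCH_MAJOR < 2,
        "casting sparse compressed tensors is unreliable on torch < 2.0",
    )
    def test_astype(self):
        ...
```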


@ClaudiaComito (Contributor) left a comment:

Take 3!

@ClaudiaComito merged commit 0e40d14 into main on Jun 7, 2024 (53 checks passed)
@ClaudiaComito deleted the sparse_csc branch on Jun 7, 2024, 12:40
@ClaudiaComito added the enhancement (New feature or request) label on Aug 22, 2024