Running demo code results in "LinAlgError: SVD did not converge" or "ValueError: array must not contain infs or NaNs" #14

styler00dollar · 2021-04-01T21:12:10Z

As I already mentioned in Issue 13, the demo code crashes with an error.

from torchvision.models import resnet50
from flopco import FlopCo
from musco.pytorch import CompressorVBMF, CompressorPR, CompressorManual

model = resnet50(pretrained = True)
model.cuda()
model_stats = FlopCo(model, device = 'cuda')

compressor = CompressorVBMF(model,
                            model_stats,
                            ft_every=5, 
                            nglobal_compress_iters=2)
while not compressor.done:
    compressor.compression_step()
compressed_model = compressor.compressed_model

~/anaconda3/lib/python3.8/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_svd_nonconvergence(err, flag)
    104 
    105 def _raise_linalgerror_svd_nonconvergence(err, flag):
--> 106     raise LinAlgError("SVD did not converge")
    107 
    108 def _raise_linalgerror_lstsq(err, flag):

LinAlgError: SVD did not converge

or

~/anaconda3/lib/python3.8/site-packages/numpy/lib/function_base.py in asarray_chkfinite(a, dtype, order)
    495     a = asarray(a, dtype=dtype, order=order)
    496     if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
--> 497         raise ValueError(
    498             "array must not contain infs or NaNs")
    499     return a

ValueError: array must not contain infs or NaNs

Which of the two errors appears seems to be random when the code is run multiple times.


engharat · 2022-06-07T15:05:39Z

I managed to fix it by replacing the scikit-tensor-py3 calls with tensorly calls. The example works fine now, and I also avoided an ugly numpy and scipy downgrade, which scikit-tensor-py3 required.
For anyone interested, here is what I did:
In musco/pytorch/compressor/decompositions/tucker2.py, remove every import of scikit-tensor-py3 functions.
Add:
import tensorly
tensorly.set_backend("pytorch")
In get_tucker_factors, the weight line becomes:
weights = tensorly.tensor(self.weight.cpu())
The tucker call changes so that it uses tensorly.decomposition.tucker:
core, (U_cout, U_cin, U_dd) = tensorly.decomposition.tucker(weights, [self.ranks[0], self.ranks[1], weights.shape[-1]], init='nvecs')
Finally, a few lines down in the same function, change core = core.dot(U_dd.T) into core = core.matmul(U_dd.T) to use PyTorch matrix multiplication (in PyTorch, .dot works only for 1D tensors).
