Error with sourmash plot on huge matrix #469

Closed
domenico-simone opened this issue Apr 27, 2018 · 1 comment

Comments

@domenico-simone

Hello,

When I try to run sourmash plot like this

sourmash plot --pdf --labels compare_viral_sigs

I get this error:

...got 4051 x 4051 matrix.
loading labels from compare_viral_sigs.labels.txt
saving histogram of matrix values => compare_viral_sigs.hist.pdf
Traceback (most recent call last):
  File "/home/dosiaa/miniconda2/envs/sourmash_env/bin/sourmash", line 6, in <module>
    sys.exit(sourmash_lib.__main__.main())
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/sourmash_lib/__main__.py", line 76, in main
    cmd(sys.argv[2:])
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/sourmash_lib/commands.py", line 509, in plot
    Z1 = sch.dendrogram(Y, orientation='right', labels=labeltext)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2496, in dendrogram
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)
  File "/home/dosiaa/miniconda2/envs/sourmash_env/lib/python2.7/site-packages/scipy/cluster/hierarchy.py", line 2782, in _dendrogram_calculate_info
    above_threshold_color=above_threshold_color)

# and over and over with the same message

As you can see, it's a pretty huge matrix. I tried a 195x195 subset of the comparisons and sourmash plot succeeded, so I guess the issue is related to the matrix size :) Any idea how to overcome this problem? In case it's useful, here's an archive with both the matrix and the labels.

I am using sourmash from conda with Python 2; after installing it, I updated sourmash itself to version 2 through pip. Here's the output of conda list within the environment. FYI, I tried with both scipy 1.0.0 and 0.19.1, which is why you see the downgraded version below.

# packages in environment at /home/domeni/miniconda2/envs/sourmash_env:
#
# Name                    Version                   Build  Channel
backports                 1.0              py27h63c9359_1  
backports.functools_lru_cache 1.5                      py27_1  
backports.shutil_get_terminal_size 1.0.0            py27h5bc021e_2  
backports_abc             0.5              py27h7b3c97b_0  
bleach                    1.4.2                    py27_0    bioconda
bz2file                   0.98                     py27_0  
ca-certificates           2018.03.07                    0  
certifi                   2018.4.16                py27_0  
configparser              3.5.0            py27h5117587_0  
cycler                    0.10.0           py27hc7354d3_0  
dbus                      1.12.2               hc3f9b76_1  
decorator                 4.2.1                    py27_0  
entrypoints               0.2.3            py27h502b47d_2  
enum34                    1.1.6            py27h99a27e9_1  
expat                     2.2.5                he0dffb1_0  
fontconfig                2.12.6               h49f89f6_0  
freetype                  2.8                  hab7d2ae_1  
functools32               3.2.3.2          py27h4ead58f_1  
futures                   3.2.0            py27h7b459c0_0  
glib                      2.53.6               h5d9569c_2  
gmp                       6.1.2                h6c8ec71_1  
gst-plugins-base          1.12.4               h33fb286_0  
gstreamer                 1.12.4               hb53b477_0  
html5lib                  1.0.1            py27h5233db4_0  
icu                       58.2                 h9c2bf20_1  
ijson                     2.3                       <pip>
intel-openmp              2018.0.0             hc7b2577_8  
ipykernel                 4.8.2                    py27_0  
ipython                   5.4.1                    py27_2  
ipython_genutils          0.2.0            py27h89fb69b_0  
ipywidgets                7.1.2                    py27_0  
jinja2                    2.10             py27h4114e70_0  
jpeg                      9b                   h024ee3a_2  
jsonschema                2.6.0            py27h7ed5aa4_0  
jupyter                   1.0.0                    py27_4  
jupyter_client            5.2.3                    py27_0  
jupyter_console           5.2.0            py27hc6bee7e_1  
jupyter_core              4.4.0            py27h345911c_0  
khmer                     2.1.1                     <pip>
kiwisolver                1.0.1            py27hc15e7b5_0  
libedit                   3.1                  heed3624_0  
libffi                    3.2.1                hd88cf55_4  
libgcc                    7.2.0                h69d50b8_2  
libgcc-ng                 7.2.0                hdf63c60_3  
libgfortran-ng            7.2.0                hdf63c60_3  
libpng                    1.6.34               hb9fc6fc_0  
libsodium                 1.0.15               hf101ebd_0  
libstdcxx-ng              7.2.0                hdf63c60_3  
libxcb                    1.12                 hcd93eb1_4  
libxml2                   2.9.7                h26e45fe_0  
markupsafe                1.0              py27h97b2822_1  
matplotlib                2.2.2            py27h0e671d2_1  
mistune                   0.8.3                    py27_0  
mkl                       2018.0.2                      1  
nbconvert                 5.3.1            py27he041f76_0  
nbformat                  4.4.0            py27hed7f2b2_0  
ncurses                   6.0                  h9df7e31_2  
notebook                  5.4.1                    py27_0  
numpy                     1.14.2           py27hdbf6ddf_0  
openssl                   1.0.2o               h20670df_0  
pandoc                    1.19.2.1             hea2e7c5_1  
pandocfilters             1.4.2            py27h428e1e5_1  
pathlib2                  2.3.0            py27h6e9d198_0  
pcre                      8.41                 hc27e229_1  
pexpect                   4.4.0                    py27_0  
pickleshare               0.7.4            py27h09770e1_0  
pip                       9.0.1                    py27_5  
prompt_toolkit            1.0.15           py27h1b593e1_0  
ptyprocess                0.5.2            py27h4ccb14c_0  
pygments                  2.2.0            py27h4a8b6f5_0  
pyparsing                 2.2.0            py27hf1513f8_1  
pyqt                      5.9.2            py27h751905a_0  
python                    2.7.14              h1571d57_30  
python-dateutil           2.3                      py27_0    bioconda
pytz                      2018.4                   py27_0  
pyyaml                    3.12             py27h2d70dd7_1  
pyzmq                     17.0.0           py27h14c3975_0  
qt                        5.9.4                h4e5bff0_0  
qtconsole                 4.3.1            py27hc444b0d_0  
readline                  7.0                  ha6073c6_4  
scandir                   1.7              py27h14c3975_0  
scipy                     0.19.1           py27h1edc525_3  
screed                    1.0                      py27_0    bioconda
send2trash                1.5.0                    py27_0  
setuptools                38.5.1                   py27_0  
simplegeneric             0.8.1                    py27_2  
singledispatch            3.4.0.3          py27h9bcb476_0  
sip                       4.19.8           py27hf484d3e_0  
six                       1.11.0           py27h5f960f1_1  
sourmash                  1.0                      py27_0    bioconda
sourmash                  2.0.0a4                   <pip>
sqlite                    3.22.0               h1bed415_0  
subprocess32              3.2.7            py27h373dbce_0  
terminado                 0.8.1                    py27_1  
testpath                  0.3.1            py27hc38d2c4_0  
tk                        8.6.7                hc745277_3  
tornado                   5.0                      py27_0  
traitlets                 4.3.2            py27hd6ce930_0  
wcwidth                   0.1.7            py27h9e3e1ab_0  
webencodings              0.5.1            py27hff10b21_1  
wheel                     0.30.0           py27h2bc6bb2_1  
widgetsnbextension        3.1.4                    py27_0  
xz                        5.2.3                h55aa19d_2  
yaml                      0.1.7                had09818_2  
zeromq                    4.2.3                h439df22_3  
zlib                      1.2.11               ha838bed_2 

Thanks,

Domenico

@ctb
Contributor

ctb commented Apr 4, 2020

We addressed some of this behavior in #343: you can now get a plot of a random subsample, which should reflect the overall structure.
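
To give a rough idea of what random subsampling of a comparison matrix involves, here is a minimal sketch. It is not the code from #343: `subsample_matrix` is a hypothetical helper, and the 10 x 10 matrix and its labels are made-up stand-ins for real `sourmash compare` output.

```python
import numpy as np


def subsample_matrix(D, labels, n, seed=None):
    """Return a random n x n sub-matrix of D together with the matching labels.

    Hypothetical helper for illustration; not part of sourmash.
    """
    rng = np.random.default_rng(seed)
    k = min(n, len(labels))
    # Pick k distinct row/column indices, sorted to preserve the original order.
    idx = np.sort(rng.choice(len(labels), size=k, replace=False))
    # np.ix_ selects the same indices on both axes, so the result stays square.
    return D[np.ix_(idx, idx)], [labels[i] for i in idx]


# Toy symmetric similarity matrix with ones on the diagonal (made-up data).
rng = np.random.default_rng(0)
A = rng.random((10, 10))
D = (A + A.T) / 2
np.fill_diagonal(D, 1.0)
labels = ["sig%d" % i for i in range(10)]

sub, sub_labels = subsample_matrix(D, labels, 4, seed=1)
print(sub.shape, sub_labels)
```

Because the same indices are used for rows and columns, the sub-matrix is still symmetric and can be clustered and plotted like the full one.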

#217 allows you to export a CSV format so that you can use other tools than sourmash plot to plot these matrices.
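
For anyone who wants to take the CSV route, a minimal sketch of reading such a file back in with the standard library follows. The layout is an assumption (a header row of labels followed by one row of values per signature), so check it against your actual file; `load_similarity_csv` and the three-signature example are hypothetical.

```python
import csv
import io


def load_similarity_csv(fp):
    """Parse a square similarity matrix from CSV.

    Assumed layout: first row holds the labels, each later row holds floats.
    """
    reader = csv.reader(fp)
    labels = next(reader)
    # Skip blank lines and convert every remaining cell to float.
    matrix = [[float(x) for x in row] for row in reader if row]
    if len(matrix) != len(labels) or any(len(r) != len(labels) for r in matrix):
        raise ValueError("expected a square matrix matching the header row")
    return labels, matrix


# Made-up three-signature example standing in for a real exported file.
csv_text = "a,b,c\n1.0,0.2,0.1\n0.2,1.0,0.5\n0.1,0.5,1.0\n"
labels, matrix = load_similarity_csv(io.StringIO(csv_text))
print(labels, matrix[1][2])
```

From there the labels and values can be handed to whatever plotting or clustering tool scales better for your data.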

I don't know how to support plotting huge matrices in sourmash by default tho :). Guidance welcome as to better/more scalable packages!

The only better solution I can think of might be to support truncating the dendrogram to some number of clusters, e.g. '--truncate 200' would do the clustering and then truncate the dendrogram at 200 clusters. See truncate_mode in the scipy.cluster.hierarchy.dendrogram documentation.
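
A minimal sketch of what that truncation looks like when calling scipy directly. The data here are random points standing in for real signatures, and '--truncate' remains only a proposal; nothing below is wired into sourmash.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

# Random stand-in data: 400 points, not real sourmash signatures.
rng = np.random.default_rng(0)
points = rng.random((400, 8))
Y = sch.linkage(points, method="average")

# Full dendrogram layout; no_plot=True computes it without drawing anything.
full = sch.dendrogram(Y, no_plot=True)

# truncate_mode='lastp' keeps only the last p merges and contracts
# everything below them into single leaves.
truncated = sch.dendrogram(Y, truncate_mode="lastp", p=200, no_plot=True)

print(len(full["leaves"]), len(truncated["leaves"]))
```

The truncated layout has far fewer leaves to draw, and the contracted subtrees are not descended into, which also keeps the recursion shallower on large inputs.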
