Census docker container crashes and uses wrong version of scikit-learn #2610

Closed
gshimansky opened this issue Jan 14, 2021 · 1 comment

Comments

@gshimansky
Collaborator

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):

Ubuntu 20.04

  • Modin version (modin.__version__):

0.8.3

  • Python version:

3.7.9

  • Code we can use to reproduce:

Describe the problem

Building a container from the Dockerfile at https://github.com/modin-project/modin/blob/master/examples/docker/census-on-omnisci/census-omnisci.dockerfile produces an image in which Python crashes like this:

UserWarning: Modin Ray engine was started with 696 GB free space avaliable, if it is not enough for your application, please set environment variable MODIN_ON_RAY_PLASMA_DIR=/directory/without/space/limiting
double free or corruption (top)
*** Aborted at 1610651443 (unix time) try "date -d @1610651443" if you are using GNU date ***
PC: @                0x0 (unknown)

The problem appears to be the version of the libc6 package used in this image. After running apt-get update; apt-get upgrade, which updates libc6, the crash is no longer reproducible. Most likely the libc6 in this image is too old for the Modin and omnisci binaries built in the conda channel.
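To compare the C library before and after the upgrade, something like the following could be run inside the container. This is only a diagnostic sketch: `describe_libc` is a hypothetical helper name, and it relies on the standard-library `platform.libc_ver()`, which may return empty strings when the C library cannot be identified.

```python
import platform


def describe_libc():
    """Return a short description of the C library the interpreter is linked against."""
    lib, version = platform.libc_ver()
    if not lib:
        # libc_ver() returns empty strings when it cannot identify the library.
        return "libc could not be identified"
    return f"{lib} {version}"


# Run once in the freshly built image and once after apt-get upgrade,
# then compare the two reported versions.
print(describe_libc())
```

If the reported version changes after the upgrade and the crash disappears, that supports the suspicion that the original libc6 is too old.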

After the crash is fixed, the script still fails:

UserWarning: Modin Ray engine was started with 676 GB free space avaliable, if it is not enough for your application, please set environment variable MODIN_ON_RAY_PLASMA_DIR=/directory/without/space/limiting
Traceback (most recent call last):
  File "census-omnisci.py", line 20, in <module>
    import daal4py.sklearn as sklearn
  File "/home/modin/miniconda/envs/modin/lib/python3.7/site-packages/daal4py/sklearn/__init__.py", line 18, in <module>
    from .monkeypatch.dispatcher import enable as patch_sklearn
  File "/home/modin/miniconda/envs/modin/lib/python3.7/site-packages/daal4py/sklearn/monkeypatch/dispatcher.py", line 28, in <module>
    from ..cluster.k_means import KMeans as KMeans_daal4py
  File "/home/modin/miniconda/envs/modin/lib/python3.7/site-packages/daal4py/sklearn/cluster/__init__.py", line 1, in <module>
    from .k_means import KMeans
  File "/home/modin/miniconda/envs/modin/lib/python3.7/site-packages/daal4py/sklearn/cluster/k_means.py", line 22, in <module>
    from ._k_means_0_23 import *
  File "/home/modin/miniconda/envs/modin/lib/python3.7/site-packages/daal4py/sklearn/cluster/_k_means_0_23.py", line 25, in <module>
    from sklearn.cluster._kmeans import (k_means, _labels_inertia, _k_init)
ImportError: cannot import name '_k_init' from 'sklearn.cluster._kmeans' (/home/modin/miniconda/envs/modin/lib/python3.7/site-packages/sklearn/cluster/_kmeans.py)

The problem happens because the current version of the sklearn package is 0.24.0, while daal4py works only with earlier versions, e.g. 0.23.2. In version 0.24.0, sklearn.cluster._kmeans no longer provides _k_init. The easiest fix is to pin the sklearn version until daal4py is updated.
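Until the pin is in place, the script could check the installed sklearn version before importing daal4py, so that it fails with a readable message instead of the ImportError above. The sketch below is an illustration, not part of census-omnisci.py: the helper names and the `FIRST_INCOMPATIBLE` constant are hypothetical, and the 0.24 cutoff is an assumption taken from this report that applies only to the daal4py build in this image.

```python
import re

# Assumption from this report: sklearn 0.23.2 works with the installed
# daal4py, while 0.24.0 does not.
FIRST_INCOMPATIBLE = (0, 24)


def major_minor(version):
    """Extract (major, minor) from a version string such as '0.23.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def works_with_daal4py(sklearn_version):
    """Return True if the version is below the assumed first incompatible release."""
    return major_minor(sklearn_version) < FIRST_INCOMPATIBLE


try:
    import sklearn
except ImportError:
    print("scikit-learn is not installed")
else:
    if works_with_daal4py(sklearn.__version__):
        print(f"scikit-learn {sklearn.__version__} should work with this daal4py")
    else:
        print(f"scikit-learn {sklearn.__version__} is too new; pin an earlier release such as 0.23.2")
```

Comparing only the major and minor components keeps the check independent of patch releases and of suffixes such as `.dev0`.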

@gshimansky gshimansky added the bug 🦗 Something isn't working label Jan 14, 2021
@gshimansky gshimansky self-assigned this Jan 14, 2021
@anmyachev
Collaborator

Duplicate of #2611.
