Run on google colab #183

Open
otakamrlw opened this issue Mar 30, 2020 · 7 comments

Comments

@otakamrlw

Has anyone successfully run pointnet2 on Google Colab? I don't have a GPU, so I am trying to run it on Google Colab, but I get this error:
tensorflow.python.framework.errors_impl.NotFoundError: /content/pointnet2/tf_ops/sampling/tf_sampling_so.so: cannot open shared object file: No such file or directory

It would be great if someone could share a Colab notebook.
Thank you.

@mactavish10

Navigate to the tf_ops directory, where you'll see three folders: sampling, 3d_interpolation, and grouping. Each folder contains a bash file. Open each one, modify the CUDA and Python paths to match your setup, and run it.
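
As a rough sketch of that step in a Colab cell, the following Python helper looks for a compile script in each op folder and runs it with bash from inside that folder. The `*compile*.sh` file-name pattern and the helper names are assumptions for illustration, not something the repository documents; edit the CUDA and Python paths inside the scripts before running them.

```python
import subprocess
from pathlib import Path


def find_compile_scripts(tf_ops_dir):
    """Return the compile scripts found one level below tf_ops_dir, sorted by path."""
    root = Path(tf_ops_dir)
    # Assumption: each op folder holds one script whose name contains "compile".
    return sorted(root.glob("*/*compile*.sh"))


def run_compile_scripts(tf_ops_dir):
    """Run every compile script from inside its own folder; stop on the first failure."""
    for script in find_compile_scripts(tf_ops_dir):
        print(f"Running {script}")
        # cwd matters: the scripts refer to their source files by relative path.
        subprocess.run(["bash", script.name], cwd=script.parent, check=True)


# Example call (path taken from the error message in this thread):
# run_compile_scripts("/content/pointnet2/tf_ops")
```

Because `check=True` raises on a non-zero exit status, a failed compile surfaces immediately instead of showing up later as a missing `.so` file.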

@manishmaruthi

manishmaruthi commented Apr 20, 2021

Hi, has anybody been able to run PointNet++ in Google Colab recently?

Google Colab currently provides TensorFlow 1.15.2 as its 1.x version.
This breaks compilation with tf_sampling_compile.sh because libtensorflow_framework.so is not available,
so I created a symbolic link named libtensorflow_framework.so that points to libtensorflow_framework.so.1:

TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
ln -s $TF_LIB/libtensorflow_framework.so.1 $TF_LIB/libtensorflow_framework.so
This helps in compilation.
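
The same symlink step can be sketched in Python so that it is safe to rerun in a notebook (a second `ln -s` fails because the link already exists). The function name and its defaults are hypothetical; only `tf.sysconfig.get_lib()` comes from the commands above.

```python
import os


def ensure_unversioned_link(lib_dir,
                            versioned="libtensorflow_framework.so.1",
                            unversioned="libtensorflow_framework.so"):
    """Create lib_dir/unversioned -> versioned if it does not exist yet.

    Returns True when a new link was created, False when nothing was done.
    """
    target = os.path.join(lib_dir, versioned)
    link = os.path.join(lib_dir, unversioned)
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    if os.path.lexists(link):
        # Already present (file or link); leave it alone so reruns are harmless.
        return False
    # A relative link keeps working if the whole directory is moved.
    os.symlink(versioned, link)
    return True


# Example call, assuming TensorFlow is installed:
# import tensorflow as tf
# ensure_unversioned_link(tf.sysconfig.get_lib())
```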

However, ldd tf_sampling_so.so gives:

linux-vdso.so.1 (0x00007ffde3dd6000)
/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 (0x00007f55a271d000)
libcudart.so.11.0 => /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart.so.11.0 (0x00007f55a249f000)
libtensorflow_framework.so.1 => not found
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f55a2116000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f55a1efe000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f55a1b0d000)
libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f55a18f2000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f55a16d3000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f55a1335000)
/lib64/ld-linux-x86-64.so.2 (0x00007f55a2ba0000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f55a1131000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f55a0f29000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f55a0d03000)

I then added the directory containing libtensorflow_framework.so.1 to LD_LIBRARY_PATH using:
import os
os.environ['LD_LIBRARY_PATH']="/usr/lib64-nvidia:/tensorflow-1.15.2/python3.7/tensorflow_core"

!ldd tf_sampling_so.so then gives me this:
linux-vdso.so.1 (0x00007ffe603b7000)
/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 (0x00007fcf94025000)
libcudart.so.11.0 => /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudart.so.11.0 (0x00007fcf93da7000)
libtensorflow_framework.so.1 => /tensorflow-1.15.2/python3.7/tensorflow_core/libtensorflow_framework.so.1 (0x00007fcf92087000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fcf91cfe000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fcf91ae6000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fcf916f5000)
libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007fcf914da000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fcf912bb000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fcf90f1d000)
/lib64/ld-linux-x86-64.so.2 (0x00007fcf944a7000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fcf90d19000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fcf90b11000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fcf908eb000)

libtensorflow_framework.so.1 now resolves, but libtensorflow_framework.so does not appear in the list. Is that fine?
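
The two ldd listings above can also be checked programmatically. This is a minimal sketch with hypothetical helper names: one function runs `ldd` with extra library directories, the other extracts the dependencies reported as "not found". Note that assigning `os.environ['LD_LIBRARY_PATH']` inside a running notebook changes the environment of child processes such as `!ldd`, while the dynamic loader of the already-running Python process read that variable at startup; that may be one reason `ldd` and `train.py` disagree, though nothing in this thread confirms it.

```python
import os
import subprocess


def run_ldd(so_path, extra_dirs=()):
    """Run ldd on so_path with extra_dirs prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ)
    dirs = list(extra_dirs)
    if env.get("LD_LIBRARY_PATH"):
        dirs.append(env["LD_LIBRARY_PATH"])
    if dirs:
        env["LD_LIBRARY_PATH"] = ":".join(dirs)
    result = subprocess.run(["ldd", so_path], env=env,
                            capture_output=True, text=True)
    return result.stdout


def missing_libraries(ldd_output):
    """Return the names of dependencies that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, arrow, resolved = line.strip().partition(" => ")
        if arrow and resolved.strip() == "not found":
            missing.append(name)
    return missing


# Example call (directory taken from the LD_LIBRARY_PATH value above):
# out = run_ldd("tf_sampling_so.so",
#               ["/tensorflow-1.15.2/python3.7/tensorflow_core"])
# print(missing_libraries(out))
```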

If I run train.py, I still get this error:

/content/gdrive/My Drive/pointnet2-master/train.py in <module>()
50 DECAY_RATE = FLAGS.decay_rate
51
---> 52 MODEL = importlib.import_module(FLAGS.model) # import network module
53 MODEL_FILE = os.path.join(ROOT_DIR, 'models', FLAGS.model+'.py')
54 LOG_DIR = FLAGS.log_dir

10 frames

/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/load_library.py in load_op_library(library_filename)
59 RuntimeError: when unable to load the library or get the python wrappers.
60 """
---> 61 lib_handle = py_tf.TF_LoadLibrary(library_filename)
62
63 op_list_str = py_tf.TF_GetOpList(lib_handle)

NotFoundError: libtensorflow_framework.so: cannot open shared object file: No such file or directory

Can anybody please help me with this?

@manishmaruthi

manishmaruthi commented Apr 28, 2021

Hello, I am finally able to train PointNet++ successfully in Google Colab.

I followed the commands shared by sheshap in #142.

Here is my tf_sampling_compile.sh content:
/usr/local/cuda-11.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

CUDA_INC='/usr/local/cuda-11.0/include'
CUDA_LIB='/usr/local/cuda-11.0/lib64'

TF_CFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')
TF_LFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')

g++ -std=c++11 -shared tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -I$CUDA_INC -lcudart -L$CUDA_LIB -O2 -D_GLIBCXX_USE_CXX11_ABI=0
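
For anyone who prefers to drive the link step from a notebook cell, here is a sketch that assembles the same g++ command as an argument list, which avoids shell word-splitting of the flag strings. `build_link_command` is a hypothetical helper; the flags mirror the script above. On the TensorFlow 1.15 builds I am aware of, `tf.sysconfig.get_link_flags()` names the versioned library directly, which would explain why this script needs no symlink, but treat that as an assumption and print the flags to check.

```python
def build_link_command(cflags, lflags, cuda_inc, cuda_lib,
                       sources=("tf_sampling.cpp", "tf_sampling_g.cu.o"),
                       output="tf_sampling_so.so"):
    """Return the g++ invocation from the script above as an argument list."""
    return [
        "g++", "-std=c++11", "-shared", *sources, "-o", output, "-fPIC",
        *cflags, *lflags,
        f"-I{cuda_inc}", "-lcudart", f"-L{cuda_lib}",
        "-O2", "-D_GLIBCXX_USE_CXX11_ABI=0",
    ]


# Example call, assuming TensorFlow and CUDA 11.0 are installed as in this thread:
# import subprocess
# import tensorflow as tf
# cmd = build_link_command(tf.sysconfig.get_compile_flags(),
#                          tf.sysconfig.get_link_flags(),
#                          "/usr/local/cuda-11.0/include",
#                          "/usr/local/cuda-11.0/lib64")
# print(" ".join(cmd))
# subprocess.run(cmd, check=True)
```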

@EXJUSTICE

EXJUSTICE commented Feb 19, 2022

@manishmaruthi wrote:

Hello, I am finally able to train PointNet++ successfully in Google Colab.

I followed the commands shared by sheshap in #142.

Here is my tf_sampling_compile.sh content:
/usr/local/cuda-11.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

CUDA_INC='/usr/local/cuda-11.0/include'
CUDA_LIB='/usr/local/cuda-11.0/lib64'

TF_CFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')
TF_LFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')

g++ -std=c++11 -shared tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -I$CUDA_INC -lcudart -L$CUDA_LIB -O2 -D_GLIBCXX_USE_CXX11_ABI=0

@manishmaruthi Your sampling script really helped; I was trying to do the same thing and hit the same libtensorflow_framework error. Did you encounter the error "Cannot feed value of shape (16, 1024, 6) for Tensor 'Placeholder:0', which has shape '(16, 1024, 3)'" after running train.py?

@EXJUSTICE

Problem solved; it was due to the data type.

@VickkyMama

@manishmaruthi wrote:

Hello, I am finally able to train PointNet++ successfully in Google Colab.

I followed the commands shared by sheshap in #142.

Here is my tf_sampling_compile.sh content:
/usr/local/cuda-11.0/bin/nvcc tf_sampling_g.cu -o tf_sampling_g.cu.o -c -O2 -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

CUDA_INC='/usr/local/cuda-11.0/include'
CUDA_LIB='/usr/local/cuda-11.0/lib64'

TF_CFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')
TF_LFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')

g++ -std=c++11 -shared tf_sampling.cpp tf_sampling_g.cu.o -o tf_sampling_so.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -I$CUDA_INC -lcudart -L$CUDA_LIB -O2 -D_GLIBCXX_USE_CXX11_ABI=0

I have tried this, but it still can't find tf_sampling_so.so.
I'm also using Google Colab, with CUDA 11, TensorFlow 2.8, and Python 3.7.
I tried downgrading TensorFlow to 1.15, but I still get the same error:

tensorflow.python.framework.errors_impl.NotFoundError: /content/user/pointnet2/tf_ops/sampling/tf_sampling_so.so: cannot open shared object file: No such file or directory

Please help!
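
A "No such file or directory" error for the `.so` itself usually means the file is not at the path being loaded, so a first check is whether the compile step actually produced a library in each op folder. This is a small sketch with a hypothetical helper name; it only lists files and makes no assumption about what the libraries are called.

```python
from pathlib import Path


def report_built_ops(tf_ops_dir):
    """Map each op folder name to the sorted list of .so files found in it."""
    root = Path(tf_ops_dir)
    report = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        report[folder.name] = sorted(f.name for f in folder.glob("*.so"))
    return report


# Example call (path taken from the error message above):
# for name, libs in report_built_ops("/content/user/pointnet2/tf_ops").items():
#     print(name, libs or "no .so built")
```

An empty list for a folder suggests that its compile script failed or was run from a different directory, so the output of that script is the next thing to inspect.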

@arrfonseca

Hello. Can you share your Colab ipynb?
