Official PyTorch code for "Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation", ECCV 2024
Please note that the code has not been fully organized yet. The current code cannot be executed directly. I will remove this note once I have finished organizing it.
Our code is based on OpenScene, and you can install the environment according to OpenScene or the following commands.
Start by cloning the repo:
git clone https://github.com/Wang-pengfei/GGSD.git
cd GGSD
First, make sure that all dependencies are in place. The simplest way to do so is to use anaconda.
You can create an anaconda environment called GGSD as below. On Linux, you need to install libopenexr-dev before creating the environment.
sudo apt-get install libopenexr-dev # for linux
conda create -n GGSD python=3.8
conda activate GGSD
Step 1: install PyTorch (we tested on 1.7.1, but the following versions should also work):
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
Step 2: install MinkowskiNet:
sudo apt install build-essential python3-dev libopenblas-dev
If you do not have sudo rights, try the following:
conda install openblas-devel -c anaconda
And now install MinkowskiNet:
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps \
--install-option="--force_cuda" \
--install-option="--blas=openblas"
If it still gives you an error, please refer to their official installation page.
Step 3: install all the remaining dependencies:
pip install -r requirements.txt
Step 4 (optional): if you need to run multi-view feature fusion with OpenSeg (especially for your own dataset), remember to install:
pip install tensorflow
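After the installation steps, it can help to confirm that the key packages are importable before moving on. The following is a minimal sketch, not part of this repository: `check_modules` and the `PYTHON_BIN` override are hypothetical names introduced here, and the module list assumes the standard import names `torch`, `torchvision`, and `MinkowskiEngine` for the packages installed in Steps 1 and 2.

```shell
# Report whether each Python module can be imported by the active interpreter.
# PYTHON_BIN is a hypothetical override (not used by this repo); default: python.
check_modules() {
    local py="${PYTHON_BIN:-python}"
    local rc=0
    local mod
    for mod in "$@"; do
        if "$py" -c "import ${mod}" >/dev/null 2>&1; then
            echo "ok: ${mod}"
        else
            echo "MISSING: ${mod}"
            rc=1
        fi
    done
    return "$rc"
}

# Import names assumed for the packages installed in Steps 1 and 2.
check_modules torch torchvision MinkowskiEngine \
    || echo "Some modules are missing; re-run the installation steps above."
```

Run it inside the activated GGSD environment so that `python` points at the conda interpreter.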
We provide the pre-processed 3D&2D data and multi-view fused features for the following datasets:
- ScanNet
- Matterport3D
- nuScenes
- Replica
You can download the pre-processed datasets by running the script below and following the command-line instructions to select the corresponding datasets:
bash scripts/download_dataset.sh
The script will download and unpack data into the folder data/. You can also download the dataset somewhere else and link it to the corresponding folder with a symbolic link:
ln -s /PATH/TO/DOWNLOADED/FOLDER data
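A broken or mistyped symbolic link is easy to miss until a script fails to find its input. As a small sketch (the helper name `check_data_dir` is hypothetical and not provided by this repository), the following prints the physical location that `data` resolves to, or reports a problem if the link is dangling:

```shell
# Print where a data folder (or symlink) resolves to; fail if it is absent.
check_data_dir() {
    local dir="${1:-data}"
    # -d follows symlinks, so a dangling link fails this test.
    if [ ! -d "$dir" ]; then
        echo "not a directory or broken link: $dir" >&2
        return 1
    fi
    # cd followed by pwd -P resolves symlinks portably.
    (cd "$dir" && pwd -P)
}

check_data_dir data \
    || echo "data/ is missing; download the datasets or create the symlink first."
```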
Note: 2D processed datasets (e.g. scannet_2d) are only needed if you want to do multi-view feature fusion on your own. If so, please follow the instructions for multi-view fusion.
You can run the following to directly download provided fused features:
bash scripts/download_fused_features.sh
Will be released.
When you have installed the environment and obtained the processed 3D data and multi-view fused features, you are ready to run our OpenScene distilled/ensemble model for 3D semantic segmentation, or distill your own model from scratch.
- Start distilling:
sh run/distill.sh EXP_NAME CONFIG.yaml
# Run 3D distilled model
sh run/eval.sh out/replica_openseg config/replica/ours_openseg_pretrained.yaml distill
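Since the distillation command above takes an experiment name and a config path, a thin wrapper can catch a missing or mistyped config before a long job starts. This is only a sketch: `run_distill` is a hypothetical helper, and it assumes nothing about `run/distill.sh` beyond the two positional arguments shown above.

```shell
# Launch a distillation run only if both arguments are given and the
# config file exists. run/distill.sh and its two arguments come from the
# command above; the wrapper itself is illustrative.
run_distill() {
    local exp_name="${1:-}"
    local config="${2:-}"
    if [ -z "$exp_name" ] || [ -z "$config" ]; then
        echo "usage: run_distill EXP_NAME CONFIG.yaml" >&2
        return 2
    fi
    if [ ! -f "$config" ]; then
        echo "config not found: $config" >&2
        return 1
    fi
    sh run/distill.sh "$exp_name" "$config"
}
```

Call it from the repository root, for example `run_distill EXP_NAME CONFIG.yaml`, so that the relative path `run/distill.sh` resolves.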
We build our code on top of the OpenScene repository.