Implementation of DISORF, a framework designed for online 3D reconstruction and visualization of scenes captured by resource-constrained mobile robots and edge devices. Our framework, built upon nerfstudio, efficiently distributes computation between edge devices and remote servers, leveraging on-device SLAM for pose estimation and performing high-quality 3D reconstruction using neural methods like NeRF and 3D Gaussian Splatting (3DGS) on powerful servers.
DISORF requires ROS1 and CUDA. This project has been tested on Ubuntu 20.04 with Python 3.10 and CUDA 12.1.
Note: ROS1 is supported up to Ubuntu 20.04 only; higher versions require building ROS1 from source.
Clone DISORF and submodules:
git clone https://github.com/Xiaorui-Huang/DISORF.git --recursive
# if you have already cloned DISORF
# git submodule update --init --recursive
conda create --name disorf python=3.10 cmake=3.14.0
conda activate disorf
pip install --upgrade pip
pip install rospkg # ROS1
pip install -e .
conda install -c "nvidia/label/cuda-12.1.1" cuda-toolkit # CUDA & torch
conda install pytorch==2.2.2 torchvision==0.17.2 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
cd libs/nerfstudio
pip install -e .
cd -
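To verify the environment before continuing (a minimal sanity check, assuming the steps above completed without errors):
# torch should report CUDA available; tiny-cuda-nn should import cleanly
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import tinycudann; print('tiny-cuda-nn OK')"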
Installing the DUSt3R dependency is optional and not required to use DISORF; it enables Multi-View Stereo (MVS) initialization of scenes.
# First time using MVS will be slow as it downloads checkpoints
cd libs/dust3r
# Create a minimal install config so dust3r can be installed as an editable local package for easy importing
cat <<EOF > pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"
[project]
name = "dust3r"
version = "0.1.0"
[tool.setuptools.packages.find]
include = ["dust3r*"]
EOF
pip install -e .
pip install -r requirements.txt
cd -
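# Compile DUSt3R's optional CUDA kernels for RoPE positional embeddings (faster runtime)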
cd libs/dust3r/croco/models/curope/
python setup.py build_ext --inplace
cd -
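A quick smoke test for the editable install (the import path follows the upstream DUSt3R package layout):
python -c "from dust3r.model import AsymmetricCroCo3DStereo; print('dust3r OK')"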
Refer to DISORF local-end for installation on the edge machine. The local-end repository is cloned as a submodule for reference, but it is not required for the server end to function.
We provide two modes:
- Streaming mode: real-time 3D reconstruction for downstream applications
- Replay mode: evaluating and experimenting with our framework
Start the real-time streaming mode on the remote server with the following command. It waits for keyframes from the local-end SLAM module and begins training once transmission starts.
On the remote server:
python server_end/ros_train.py ros_gaussian_splatting `# use ros_nerfacto for NeRF` \
--experiment-name streaming_quickstart \
--data data/streaming_quickstart `# data output path for streaming keyframes, input path for replay mode` \
--pipeline.datamanager.dataparser.ros-data config/camera/REPLICA_config.json \
--pipeline.datamanager.training_sampler inverse_delta `# Our sampling method` \
--pipeline.model.camera-optimizer.mode SO3xR3 \
--pipeline.model.rasterize-mode antialiased `# remove for NeRF` \
--dump_received_images True \
--draw-training-images True \
--vis viewer+tensorboard \
--steps_per_eval_all_images 7000 \
--steps_per_save 7001 \
--max_num_iterations 7001 \
--relocate_cameras True \
--logging.local-writer.enable False `# disable verbose logging`
In a separate terminal, execute the following to listen for local-end keyframes:
# start roscore in the background if not already running
roscore &
# listens on localhost:5001 by default, can supply different port as argument
python server_end/packet_to_ros.py
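If your local end streams to a non-default port, pass it as an argument (a sketch assuming the port is positional, per the comment above):
python server_end/packet_to_ros.py 5002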
On the local end follow the instructions at DISORF local-end to start the SLAM module and begin streaming keyframes to the remote server.
Replay mode requires keyframe data from a previous streaming session.
REPLAY_LOG_PATH=$(find . -path "*/outputs/streaming_quickstart/*/avail_frames_log.txt" -type f | head -n 1)
python server_end/ros_train.py ros_gaussian_splatting \
--experiment-name replay_quickstart \
--data data/streaming_quickstart `# data output path for streaming keyframes, input path for replay mode` \
--pipeline.datamanager.training_sampler inverse_delta \
--pipeline.model.camera-optimizer.mode SO3xR3 \
--pipeline.datamanager.dataparser.ros-data config/camera/REPLICA_config.json \
--pipeline.datamanager.replay_transmission_log $REPLAY_LOG_PATH `# remove to evaluate offline training` \
--pipeline.datamanager.dataparser.use_cached_training_data True \
--disable_streaming True \
--vis viewer+tensorboard \
--draw-training-images True \
--steps_per_eval_all_images 7000 \
--steps_per_save 7001 \
--max_num_iterations 7001 \
--logging.local-writer.enable False `# disable verbose logging`
We also provide small sample datasets from the Replica and Tanks and Temples datasets here.
--pipeline.datamanager.dataparser.ros-data config/camera/REPLICA_config.json # Path to JSON file for camera intrinsics, calibration, and ROS topic names
--dump_received_images True # Save SLAM keyframes to disk for Replays
--draw-training-images # Draw the training images in the viewer
--vis viewer+tensorboard # Enable the viewer (default localhost:7007) and TensorBoard for visualization
--pipeline.datamanager.training_sampler [sampler] # Select sampling strategy for data selection during training
Of particular note is the `replay_transmission_log` parameter:
- The log records how many frames were transmitted and when, enabling consistent and repeatable training results while simulating online training.
- Without a replay log, training is considered offline and the model has access to all frames from the start.
--pipeline.datamanager.replay_transmission_log [REPLAY_PATH]
--pipeline.datamanager.dataparser.use_cached_training_data True # Use cached training data for replay
--data [DATA_PATH] # Path to the replay data directory
--disable_streaming True # Do not listen for live keyframes (replay/offline mode)
Multi-View Stereo (MVS) with DUSt3R options
--pipeline.model.mvs_init True # Initialize positions from MVS point cloud
--pipeline.model.mvs_num_random 100 # Number of Gaussians to initialize with MVS
--pipeline.model.mvs_add_every 300 # Add MVS points every specified number of steps
--pipeline.model.mvs_add_n_views 4 # Number of views to add MVS points from
--pipeline.model.mvs_add_stop_at 1000 # Stop adding MVS points at this step
--pipeline.model.mvs_sample_step 4 # Step size to sample the MVS point cloud
--pipeline.model.mvs_add_max_pts 3000 # Maximum number of MVS points to add
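Putting it together, a streaming run with MVS initialization might look like the following (illustrative only; the MVS values mirror the defaults listed above, and the remaining quickstart flags still apply):
python server_end/ros_train.py ros_gaussian_splatting \
    --experiment-name streaming_mvs \
    --data data/streaming_mvs \
    --pipeline.datamanager.dataparser.ros-data config/camera/REPLICA_config.json \
    --pipeline.model.mvs_init True \
    --pipeline.model.mvs_add_every 300 \
    --pipeline.model.mvs_add_stop_at 1000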
# By default, all frames are evaluated.
--pipeline.eval_idx_list [EVAL_PATH] # OPTIONAL - Path to a .txt file of frame indices, fixing the subset of frames used for evaluation
--steps_per_save 1000 # Save the model every 1000 steps
--max_num_iterations 5001 # Train for a maximum of 5001 steps
--pipeline.model.camera-optimizer.mode [mode] # Different camera optimization techniques
--num_msgs_to_start 3 # Number of images received to start training
--pipeline.model.rasterize-mode antialiased # Rasterization mode for 3DGS (see NeRFStudio help for more)
--pipeline.model.num-random 5000 # 3DGS, Number of random point cloud at init
--pipeline.model.random-scale 3 # 3DGS, Random scale for random point cloud bounding box
--output-dir [OUTPUT_PATH] # Output directory for training/replay logs and models, default: "outputs"
# Refer to the help menu for additional options
- Default: Uniformly randomly samples pixels/frames.
- Inverse Delta: Prioritizes frames that haven’t been used recently, our core sampling strategy.
- Loss Weighted Frame: Focuses on frames where the model performs poorly.
- Loss Weighted Patch (NeRF only): Targets specific patches within frames that cause high errors.
- IMAP: Balances between focusing on recent and older frames as proposed by the IMAP method.
- Hierarchical Loss: Combines loss-weighted strategies with a hierarchical approach, focusing on both frame-level and patch-level errors.
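To switch strategies, pass the sampler name to the flag documented above. `inverse_delta` is the value used in the quickstart commands; the identifiers for the other strategies are assumptions based on the names above, so check `--help` for the exact values:
--pipeline.datamanager.training_sampler inverse_delta        # our method
--pipeline.datamanager.training_sampler loss_weighted_frame  # assumed identifier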
Before starting a streaming session, prepare your data and camera configurations:
- Data Preparation:
  - Ensure your dataset includes an `images` folder and a `transforms.json` file adjacent to it. The `transforms.json` file should contain the image poses (a minimal sketch is shown after the camera configs below).
- Camera Configuration:
  - Calibrate your camera before running the SLAM module using one of the methods below:
    - Refer to DISORF local-end
    - Example camera configurations:
{
"fx": 1975.47873,
"fy": 2053.41424,
"cx": 1060.77279,
"cy": 513.500761,
"k1": 0.37367123,
"k2": -1.29505304,
"k3": 0.92461805,
"p1": -0.01482113,
"p2": 0.01368264,
"H": 1080,
"W": 1920,
"image_topic": "/image_raw",
"pose_topic": "/pose",
"pcd_topic": "/pcd"
}
Camera config for the provided Replica sample dataset:
{
"fx": 600.0,
"fy": 600.0,
"cx": 599.5,
"cy": 339.5,
"k1": 0.00023840576144970707,
"k2": -0.00031479749960004814,
"k3": 0,
"p1": -7.392305313239943e-05,
"p2": -2.771600122416197e-05,
"H": 680,
"W": 1200,
"image_topic": "/image_raw",
"pose_topic": "/pose",
"pcd_topic": "/pcd"
}
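For Data Preparation, a minimal `transforms.json` sketch in the standard nerfstudio convention (the field names assume the nerfstudio data format; your SLAM output may carry additional fields):
{
  "fl_x": 600.0,
  "fl_y": 600.0,
  "cx": 599.5,
  "cy": 339.5,
  "w": 1200,
  "h": 680,
  "frames": [
    {
      "file_path": "images/frame_00001.jpg",
      "transform_matrix": [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]
      ]
    }
  ]
}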
To replay a session, gather the following:
- Replay Log:
  - Obtain the `avail_frames_log.txt` from the previous session's output directory.
- Keyframes:
  - Ensure you have the `images` folder and the `transforms.json` file from the original logging session.
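To gather both replay inputs from the quickstart layout (paths assume the `--data` and `--experiment-name` values used in the commands above):
# replay log from the previous session's output directory
find outputs/streaming_quickstart -name avail_frames_log.txt
# keyframes and poses dumped during streaming
ls data/streaming_quickstart/images data/streaming_quickstart/transforms.json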
If you use DISORF in your research, please cite our paper:
@article{li2024disorf,
title = {{DISORF}: A Distributed Online 3D Reconstruction Framework for Mobile Robots},
author = {Li, Chunlin and Fan, Hanrui and Huang, Xiaorui and Liang, Ruofan and Durvasula, Sankeerth and Vijaykumar, Nandita},
year = {2024},
note = {Under review for IEEE Robotics and Automation Letters (RA-L)},
eprint = {2403.00228},
archivePrefix = {arXiv},
primaryClass = {cs.RO},
url = {https://arxiv.org/abs/2403.00228}
}
This paper has been submitted to IEEE Robotics and Automation Letters (RA-L) for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.