Grasp Pose Detection (GPD) is a package to detect 6-DOF grasp poses (3-DOF position and 3-DOF orientation) for a 2-finger robot hand (e.g., a parallel jaw gripper) in 3D point clouds.
GPD consists of two main steps: sampling a large number of grasp candidates, and classifying these candidates as viable grasps or not.
The reference for this package is: Grasp Pose Detection in Point Clouds.
- Requirements
- Installation
- Generate Grasps for a Point Cloud File
- Parameters
- Views
- Input Channels for Neural Network
- CNN Frameworks
- GPU Support With PCL
- Network Training
- Grasp Image
- References
The following instructions have been tested on Ubuntu 16.04. Similar instructions should work for other Linux distributions.
- Install PCL and Eigen. If you have ROS Indigo or Kinetic installed, you should be good to go.
- Install OpenCV 3.4 (tutorial).
- Clone the repository into some folder:
git clone https://github.com/atenpas/gpd2
- Build the package:
cd gpd2
mkdir build && cd build
cmake ..
make -j
You can optionally install GPD with sudo make install so that it can be used by other projects as a shared library.
Run GPD on a point cloud file (PCD or PLY):
./detect_grasps ../cfg/eigen_params.cfg ../tutorials/krylon.pcd
The output should look similar to the screenshot shown below. The window is the PCL viewer. You can press [q] to close the window and [h] to see a list of other commands.
Below is a visualization of the convention that GPD uses for the grasp pose (position and orientation) of a grasp. The grasp position is indicated by the orange cross and the orientation by the colored arrows.
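As a rough illustration of what a 6-DOF grasp pose contains, the sketch below packs a position and three orthonormal axes into a 4x4 homogeneous transform. The function name `pose_to_matrix` and the order of the axis columns are assumptions made for this example; they are not taken from GPD's code and do not define which arrow in the visualization corresponds to which axis.

```python
def pose_to_matrix(position, x_axis, y_axis, z_axis):
    """Build a 4x4 homogeneous transform from a position and three unit axes.

    The axes become the columns of the rotation part. The column order is an
    assumption for this sketch, not GPD's documented convention.
    """
    return [
        [x_axis[0], y_axis[0], z_axis[0], position[0]],
        [x_axis[1], y_axis[1], z_axis[1], position[1]],
        [x_axis[2], y_axis[2], z_axis[2], position[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Hypothetical grasp: 0.5 m in front of the sensor, axes aligned with the frame.
grasp_position = (0.1, 0.0, 0.5)
pose = pose_to_matrix(
    grasp_position,
    x_axis=(1.0, 0.0, 0.0),
    y_axis=(0.0, 1.0, 0.0),
    z_axis=(0.0, 0.0, 1.0),
)
for row in pose:
    print(row)
```

The top-left 3x3 block carries the 3-DOF orientation and the last column carries the 3-DOF position.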
Brief explanations of parameters are given in cfg/params.cfg.
The two parameters that you typically want to tune to increase the number of grasps found are workspace and num_samples. The first defines the volume of space in which to search for grasps as a cuboid of dimensions [minX, maxX, minY, maxY, minZ, maxZ], centered at the origin of the point cloud frame. The second is the number of samples that are drawn from the point cloud to detect grasps. You should set the workspace as small as possible and the number of samples as large as possible.
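To make the roles of workspace and num_samples concrete, here is a minimal sketch in plain Python. It only illustrates the idea of cropping a cloud to a cuboid and then drawing samples from what is left; the helper names `crop_to_workspace` and `draw_samples` are hypothetical, and GPD's own sampling strategy may differ.

```python
import random


def crop_to_workspace(points, workspace):
    """Keep only points inside the cuboid [minX, maxX, minY, maxY, minZ, maxZ]."""
    min_x, max_x, min_y, max_y, min_z, max_z = workspace
    return [
        (x, y, z)
        for (x, y, z) in points
        if min_x <= x <= max_x and min_y <= y <= max_y and min_z <= z <= max_z
    ]


def draw_samples(points, num_samples, seed=0):
    """Draw up to num_samples points uniformly at random, without replacement.

    Sampling without replacement is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return rng.sample(points, min(num_samples, len(points)))


# Toy cloud: two points inside the workspace, two outside.
cloud = [(0.1, 0.0, 0.5), (2.0, 0.0, 0.5), (-0.3, 0.2, 0.9), (0.0, 0.0, 3.0)]
workspace = [-1.0, 1.0, -1.0, 1.0, 0.0, 1.0]

inside = crop_to_workspace(cloud, workspace)
samples = draw_samples(inside, num_samples=10)
print(len(inside), len(samples))
```

A tight workspace means that the available samples are spent on the region of interest rather than on background such as the table or walls.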
You can use this package with a single depth sensor or with two depth sensors. The package comes with Caffe model files for both options. You can find these files in gpd/caffe/15channels. For a single sensor, use single_view_15_channels.caffemodel and for two depth sensors, use two_views_15_channels_[angle]. The [angle] is the angle between the two sensor views, as illustrated in the picture below. In the two-views setting, register the two point clouds into a common frame before sending them to GPD.
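For the two-views setting, the sketch below shows one simple way to combine two clouds when the rigid transform between the sensors is already known (for example, from calibration). It is a plain-Python illustration under that assumption; the function names and the example extrinsics are hypothetical, and in practice you would likely use PCL or a similar library, possibly with a refinement step such as ICP.

```python
def transform_point(rotation, translation, point):
    """Apply p' = R p + t with a 3x3 rotation (list of rows) and a 3-vector t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def merge_views(cloud_a, cloud_b, rotation_b_to_a, translation_b_to_a):
    """Express cloud_b in cloud_a's frame and concatenate the two clouds."""
    moved_b = [
        transform_point(rotation_b_to_a, translation_b_to_a, p) for p in cloud_b
    ]
    return list(cloud_a) + moved_b


# Hypothetical extrinsics: the transform from sensor B's frame to sensor A's
# frame is a 90 degree rotation about z plus a 1 m shift along x.
rotation = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
translation = [1.0, 0.0, 0.0]

cloud_a = [(0.0, 0.0, 0.5)]
cloud_b = [(1.0, 0.0, 0.5)]
merged = merge_views(cloud_a, cloud_b, rotation, translation)
print(merged)
```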
To switch between one and two sensor views, change the parameter weight_file in your config file.
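If you switch weight files often, a small script can make the change for you. The sketch below assumes that the config file uses one `key = value` pair per line with `#` for comments; that format, the helper name `set_cfg_value`, and the sample values are assumptions for illustration, so check them against your own CFG file first.

```python
def set_cfg_value(cfg_text, key, value):
    """Return cfg_text with `key = value` set.

    Replaces the first non-comment line that defines `key`, or appends a new
    line if the key is not present. Assumes a `key = value` line format.
    """
    lines = cfg_text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines that are not assignments
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key} = {value}"
            return "\n".join(lines) + "\n"
    lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"


# Hypothetical config contents; the values are placeholders.
example = (
    "# hypothetical config\n"
    "num_samples = 30\n"
    "weight_file = old_weights.caffemodel\n"
)
updated = set_cfg_value(
    example, "weight_file", "single_view_15_channels.caffemodel"
)
print(updated)
```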
The package comes with weight files for two different input representations for the neural network that is used to decide if a grasp is viable or not: 3 or 15 channels. The default is 15 channels. However, you can use 3 channels to get a faster runtime at the cost of some grasp quality. For more details, please see the references below.
GPD comes with a number of different classifier frameworks that exploit different hardware and have different dependencies. Switching between the frameworks requires running CMake with additional arguments. For example, to use the OpenVino framework:
cmake .. -DUSE_OPENVINO=ON
You can use ccmake to check out all possible CMake options.
GPD supports the following three frameworks:
- OpenVino (CPUs, GPUs, FPGAs from Intel)
- Caffe (GPUs from Nvidia or CPUs)
- Custom LeNet implementation using the Eigen library
Additional classifiers can be added by sub-classing the classifier interface.
To use OpenVino, you need to run the following command before compiling GPD.
source /opt/intel/computer_vision_sdk/bin/setupvars.sh
GPD can use GPU methods provided within PCL to speed up point cloud processing.
- PCL GPU Install
- Build GPD with the USE_PCL_GPU cmake flag:
cd gpd
mkdir build && cd build
cmake .. -DUSE_PCL_GPU=ON
make -j
To create training data with the C++ code, you need to install OpenCV 3.4 Contribs.
Next, you need to compile GPD with the flag BUILD_DATA_GENERATION enabled, like this:
```
cd gpd
mkdir build && cd build
cmake .. -DBUILD_DATA_GENERATION=ON
make -j
```
There are four steps to train a network to predict grasp poses. First, we need to create grasp images:
./gpd_generate_training_data.py ../cfg/generate_data.cfg
You should modify generate_data.cfg according to your needs.
The second step is to train a neural network. The easiest way to train the network is with the existing code. This requires the PyTorch framework. To train a network, use train_net3.py:
cd pytorch
python train_net3.py pathToTrainingSet.h5 pathToTestSet.h5 num_channels
The third step is to convert the model to the ONNX format.
python torch_to_onxx.py pathToPytorchModel.pwf pathToONNXModel.onnx num_channels
The last step is to convert the ONNX file to an OpenVINO compatible format (tutorial). This gives two files that can be loaded with GPD by modifying the weight_file and model_file parameters in a CFG file.
Generate some grasp poses and their corresponding images/descriptors:
./test_grasp_image ../tutorials/krylon.pcd 3456 1 ../models/lenet/15channels/params/
For details on how the grasp image is created, check out our journal paper.
If you like this package and use it in your own work, please cite our journal paper [1]. If you're interested in the (shorter) conference version, check out [2].
[1] Andreas ten Pas, Marcus Gualtieri, Kate Saenko, and Robert Platt. Grasp Pose Detection in Point Clouds. The International Journal of Robotics Research, Vol 36, Issue 13-14, pp. 1455-1473. October 2017.
[2] Marcus Gualtieri, Andreas ten Pas, Kate Saenko, and Robert Platt. High precision grasp pose detection in dense clutter. IROS 2016, pp. 598-605.
- Remove the cmake cache: CMakeCache.txt
- Run make clean
- Remove the build folder and rebuild.
- Update gcc and g++ to a version > 5.