Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations (ECCV2024)
Zipeng Wang,
Yunfan Lu,
Addison Lin Wang
The Hong Kong University of Science and Technology (Guangzhou).
This is the official implementation of the paper "Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations".
Reconstructing intensity frames from event data while maintaining high temporal resolution and high dynamic range is crucial for bridging the gap between event-based and frame-based computer vision. Previous approaches have depended on supervised learning on synthetic data, which lacks interpretability and risks over-fitting to the settings of a particular event simulator. Recently, self-supervised learning (SSL) based methods, which primarily utilize per-frame optical flow to estimate intensity via photometric constancy, have been actively investigated. However, they are vulnerable to errors when the estimated optical flow is inaccurate. This paper proposes a novel SSL event-to-video reconstruction approach, dubbed EvINR, which eliminates the need for labeled data or optical flow estimation. Our core idea is to reconstruct intensity frames by directly addressing the event generation model, essentially a partial differential equation (PDE) that describes how events are generated from time-varying brightness signals. Specifically, we utilize an implicit neural representation (INR), which takes a spatiotemporal coordinate (x, y, t) as input and predicts the intensity value, to represent the solution of the event generation equation. The INR, parameterized as a fully-connected multi-layer perceptron (MLP), is optimized by supervising its temporal derivatives with events. To make EvINR practical for online use, we propose several acceleration techniques that substantially expedite training. Comprehensive experiments demonstrate that EvINR surpasses previous SSL methods by 38% in terms of Mean Squared Error (MSE) and is comparable or superior to SoTA supervised methods.
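For intuition, the event generation model states that a pixel (x, y) fires an event of polarity p at time t once its log-brightness change since the last event there reaches the contrast threshold C, i.e. L(x, y, t) - L(x, y, t_prev) = p * C; events accumulated over a short interval therefore approximate the temporal derivative of L. Below is a minimal PyTorch sketch of this supervision. It is a toy stand-in rather than the code in model.py: the actual model follows SIREN, and the ReLU MLP, finite-difference loss, and all names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToyEvINR(nn.Module):
    """Toy INR: maps a normalized (x, y, t) coordinate to log-intensity."""
    def __init__(self, hidden=256, depth=4):
        super().__init__()
        layers, dim = [], 3
        for _ in range(depth):
            layers += [nn.Linear(dim, hidden), nn.ReLU()]  # the real model uses sine activations (SIREN)
            dim = hidden
        layers.append(nn.Linear(dim, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, coords):          # coords: (N, 3) with columns (x, y, t)
        return self.net(coords)

def event_loss(model, coords, event_frame, dt, thresh=0.25):
    """Match the INR's brightness change over dt to the accumulated events.

    event_frame holds the polarity sum at each queried pixel over [t, t + dt];
    by the event generation model this approximates (L(t + dt) - L(t)) / thresh.
    """
    later = coords.clone()
    later[:, 2] += dt                    # shift only the time coordinate
    dL = model(later) - model(coords)    # finite-difference temporal derivative
    return ((dL - thresh * event_frame) ** 2).mean()

# One illustrative optimization step on random data.
model = ToyEvINR()
coords = torch.rand(1024, 3)             # (x, y, t) normalized to [0, 1]
events = torch.randint(-2, 3, (1024, 1)).float()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss = event_loss(model, coords, events, dt=0.01)
loss.backward()
opt.step()
```

Because the supervision only constrains brightness changes, no ground-truth frames or optical flow ever enter the loss.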
If you're interested in trying EvINR but want to avoid the hassle of installation, we've got you covered. Check out our Colab notebook for a straightforward and quick test run. It's all ready for you to use online!
This repository is organized as follows:
- event_data.py: Loads event data and stacks events into event frames (see the sketch after this list).
- model.py: Contains our neural network solver for event-based video reconstruction.
- utils.py: Contains utility functions for event data manipulation and visualization.
- train.py: Contains the training routine.
- scripts/: Converts common event datasets into formats used in our work.
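As a rough illustration of the event stacking performed by the loader, the sketch below bins a (t, x, y, p) event array into fixed-duration, polarity-signed event frames. The function name and signature are hypothetical, not the exact API of event_data.py.

```python
import numpy as np

def stack_events(events, H, W, n_frames):
    """Accumulate polarity-signed events into n_frames frames of shape (H, W).

    events: array of shape (N, 4), rows (t, x, y, p) with p in {-1, +1}.
    """
    t = events[:, 0]
    # Assign each event to one of n_frames equal-duration temporal bins.
    rel = (t - t.min()) / (t.max() - t.min() + 1e-9)
    bins = np.clip((rel * n_frames).astype(int), 0, n_frames - 1)
    frames = np.zeros((n_frames, H, W), dtype=np.float32)
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    # np.add.at accumulates correctly even when (bin, y, x) indices repeat.
    np.add.at(frames, (bins, y, x), events[:, 3].astype(np.float32))
    return frames
```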
If you are interested in our ALPIX Event Dataset (AED), please contact [email protected].
We currently provide conversion scripts for the following datasets: IJRR, HQF, and CED. Our AED dataset does not require further conversion.
To process your own dataset, please convert the event data into a numpy array of shape (N, 4), with one event per row as (t, x, y, p).
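For instance, an IJRR-style text file with one "t x y p" event per line could be converted as follows; the file names and the polarity remapping are placeholders, and ready-made converters for the supported datasets live in scripts/:

```python
import numpy as np

# Hypothetical example: text file with one "t x y p" event per line.
events = np.loadtxt('events.txt')        # -> shape (N, 4)
events[:, 3] = events[:, 3] * 2 - 1      # map polarity {0, 1} to {-1, +1} if needed
np.save('events.npy', events.astype(np.float32))
```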
We provide example commands for training EvINR on different datasets.

For the IJRR and HQF datasets:

python train.py -n EXP_NAME -d DATA_PATH --H 180 --W 240

For the CED dataset (--color_event enables color reconstruction):

python train.py -n EXP_NAME -d DATA_PATH --H 260 --W 346 --color_event

For our AED dataset (--event_thresh sets the assumed contrast threshold):

python train.py -n EXP_NAME -d DATA_PATH --H 408 --W 306 --event_thresh 0.25
If you find our work useful in your research, please cite:
@inproceedings{wang2024EvINR,
  title={Revisit Event Generation Model: Self-Supervised Learning of Event-to-Video Reconstruction with Implicit Neural Representations},
  author={Wang, Zipeng and Lu, Yunfan and Wang, Lin},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2024}
}
If you have any questions, please feel free to email the authors or raise an issue.
Our code builds upon the awesome Siren repository. We thank its authors for their inspiring work.