SRT: Stochastic Resonance Transformers

ICML 2024

Visualization Demo

A visualization demo is provided in visualization.ipynb. Change the pretrain=mae setting to get visualizations for a different model.

Evaluation: NYU-V2 depth

Evaluating NYU-V2 depth requires the NYU-V2 test set.

Step 1: Install dinov2 env

Follow the DINOv2 repo to install the dinov2-extra environment.

Step 2: Prepare NYU-V2 test set

cd data
ln -s /<path>/<to>/nyu_v2 .

Directory structure:

nyu_v2/
  testing/
    images/
      1.png
      2.png
      ...
    depths/
      1.png
      2.png
      ...
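
Before running the evaluation, it can help to confirm that the layout above is in place. The following is a minimal sketch and not part of the repo: the helper name find_unpaired and the default root data/nyu_v2 are illustrative assumptions, and it only checks that each PNG in testing/images has a same-named file in testing/depths.

```python
from pathlib import Path


def find_unpaired(root):
    """Return sorted image file names under testing/images with no same-named depth map."""
    testing = Path(root) / "testing"
    images = {p.name for p in (testing / "images").glob("*.png")}
    depths = {p.name for p in (testing / "depths").glob("*.png")}
    return sorted(images - depths)


if __name__ == "__main__":
    root = "data/nyu_v2"  # assumed location, matching the symlink created above
    if not (Path(root) / "testing").is_dir():
        print(f"{root}/testing not found; check the symlink")
    else:
        missing = find_unpaired(root)
        print("all images have depth maps" if not missing else f"missing depth maps: {missing}")
```

Run it from the repo root so that the relative path data/nyu_v2 resolves through the symlink.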

Step 3: Run

For DINOv2 small with a linear head and feature-space ensembling:

python depth_dinov2_main.py \
    --feature_extractor ensemble_dinov2_depth_feats \
    --do_eval \
    --data_path data/nyu_v2 \
    --dx 3 \
    --dy 3 \
    --head_type linear \
    --arch small

Set --head_type to linear or dpt to use the linear head or the DPT head from DINOv2. Set --arch to small, base, large, or giant to use a ViT backbone of a different size.
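
To evaluate several head and backbone combinations in one go, the command above can be generated programmatically. This is a hedged sketch rather than a script shipped with the repo: build_depth_command is a hypothetical helper, and the flag values are copied from the example command and the options listed above. It prints the commands instead of running them.

```python
import itertools
import shlex

HEAD_TYPES = ["linear", "dpt"]
ARCHS = ["small", "base", "large", "giant"]


def build_depth_command(head_type, arch, dx=3, dy=3):
    """Build the depth evaluation command shown above for one configuration."""
    return [
        "python", "depth_dinov2_main.py",
        "--feature_extractor", "ensemble_dinov2_depth_feats",
        "--do_eval",
        "--data_path", "data/nyu_v2",
        "--dx", str(dx),
        "--dy", str(dy),
        "--head_type", head_type,
        "--arch", arch,
    ]


if __name__ == "__main__":
    for head_type, arch in itertools.product(HEAD_TYPES, ARCHS):
        # Print rather than execute; pass the list to subprocess.run to launch it.
        print(shlex.join(build_depth_command(head_type, arch)))
```

Keeping the command as a list of arguments avoids shell quoting issues if it is later handed to subprocess.run.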

Evaluation: DAVIS 2017 Video object segmentation

Please verify that you are using PyTorch 1.7.1; we are currently unable to reproduce the results with the more recent PyTorch 1.8.1.
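
A quick way to check the installed version before running anything is sketched below. The helper matches_version is an illustrative assumption, not part of the repo; it compares only the release number and ignores local build suffixes such as +cu110.

```python
def matches_version(installed, required="1.7.1"):
    """Compare release numbers, ignoring local build suffixes such as '+cu110'."""
    return installed.split("+")[0] == required


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "OK" if matches_version(torch.__version__) else "differs from 1.7.1"
        print(f"PyTorch {torch.__version__}: {status}")
```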

Step 1: Prepare DAVIS 2017 data

cd ..
git clone https://github.com/davisvideochallenge/davis-2017 && cd davis-2017
./data/get_davis.sh
cd ../SRT_icml/data
ln -s ../../davis-2017 .
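
Once the download and symlink are done, a short check can confirm that the data landed where expected. This is a sketch under assumptions: the entries in EXPECTED reflect the usual layout of the DAVIS 2017 480p download (a DAVIS folder containing JPEGImages, Annotations, and ImageSets) and should be adjusted if your copy differs; missing_entries is a hypothetical helper, not part of the repo.

```python
from pathlib import Path

# Assumed layout of the DAVIS 2017 480p download; adjust if your copy differs.
EXPECTED = ["JPEGImages/480p", "Annotations/480p", "ImageSets/2017/val.txt"]


def missing_entries(davis_root):
    """Return the expected DAVIS entries that do not exist under davis_root."""
    root = Path(davis_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repo root; data/davis-2017 is the symlink created above.
    missing = missing_entries("data/davis-2017/DAVIS")
    print("DAVIS layout looks complete" if not missing else f"missing: {missing}")
```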

Step 2: Run Davis 2017 evaluation

python main.py \
    --data_path data/davis-2017-alt \
    --output_dir [dir] \
    --patch_size 16 \
    --arch vit_small \
    --feature_extractor ensemble_fast \
    --do_eval \
    --bs 1 \
    --dx 1 \
    --dy 1

Step 3: Compute metrics

git clone https://github.com/davisvideochallenge/davis2017-evaluation $HOME/davis2017-evaluation
python $HOME/davis2017-evaluation/evaluation_method.py --task semi-supervised --results_path /path/to/saving_dir --davis_path $HOME/davis-2017/DAVIS/

Evaluation: ADE20K semantic segmentation

Step 1: Install dinov2 env

Follow the DINOv2 repo to install the dinov2-extra environment.

Step 2: Prepare ADE20K data

Download ADEChallengeData2016 from this link

cd data
ln -s /<path>/<to>/ADEChallenge2016 .

Step 3: Run

For DINOv2 small with a linear head and feature-space ensembling:

python segment_dinov2_main.py \
    --feature_extractor ensemble_dinov2_seg_feats \
    --arch small \
    --head_type linear \
    --dx 3 \
    --dy 3 \
    --bs 36 \
    --do_eval \
    --metric_output dinov2-small-linear-feat-dxdy3.json

As with depth estimation, --arch can be changed to use a different ViT backbone.

Upcoming

A more comprehensive and better-organized repo covering other tasks will be released later.
