v0.2.1

ssundaram21 released this 15 Oct 03:16
7a3d8a4

We're releasing 4 new variants of DreamSim! These new checkpoints are:

  • DINOv2 B/14 and SynCLR B/16 as backbones
  • DINOv2 B/14 and DINO B/16 trained with the original contrastive loss on both CLS and dense features.

These models (and the originals) are further evaluated in our new NeurIPS 2024 paper, "When Does Perceptual Alignment Benefit Vision Representations?"

We find that our perceptually aligned representations outperform the baseline models on a variety of standard downstream computer vision tasks, including semantic segmentation, depth estimation, object counting, instance retrieval, and retrieval-augmented generation. These results point to perceptual alignment as a useful objective for learning general-purpose vision representations. See the paper and our blog post for more details.

Here's how they perform on NIGHTS:

Model                   NIGHTS - Val   NIGHTS - Test
ensemble                96.9%          96.2%
dino_vitb16             95.6%          94.8%
open_clip_vitb32        95.6%          95.3%
clip_vitb32             94.9%          93.6%
dinov2_vitb14           94.9%          95.0%
synclr_vitb16           96.0%          95.9%
dino_vitb16 (patch)     94.9%          94.8%
dinov2_vitb14 (patch)   95.5%          95.1%
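
For readers unfamiliar with how scores like these are computed, here is a minimal sketch. It assumes the percentages are two-alternative forced-choice (2AFC) agreement with human judgments: for each triplet (reference, image A, image B), the model "chooses" whichever image has the smaller distance to the reference, and the score is the fraction of triplets where that choice matches the human label. The function name, the input layout, and the example distances are hypothetical and are not taken from the DreamSim codebase or the NIGHTS data.

```python
def two_afc_accuracy(triplets):
    """Fraction of triplets where the model's choice matches the human choice.

    Each triplet is (dist_ref_a, dist_ref_b, human_choice), where human_choice
    is 0 if humans judged image A closer to the reference and 1 for image B.
    This layout is an assumption made for illustration.
    """
    correct = 0
    total = 0
    for dist_a, dist_b, human_choice in triplets:
        # The model picks the image with the smaller distance to the reference.
        model_choice = 0 if dist_a < dist_b else 1
        correct += int(model_choice == human_choice)
        total += 1
    return correct / total


# Made-up distances, for illustration only (not real NIGHTS data).
example_triplets = [
    (0.12, 0.30, 0),  # model picks A, humans picked A: agree
    (0.25, 0.10, 1),  # model picks B, humans picked B: agree
    (0.40, 0.35, 0),  # model picks B, humans picked A: disagree
    (0.05, 0.20, 0),  # model picks A, humans picked A: agree
]
accuracy = two_afc_accuracy(example_triplets)
print(f"2AFC accuracy: {accuracy:.1%}")
```

In practice the two distances per triplet would come from the perceptual metric being evaluated.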

Additionally, we fixed a bug in embedding normalization. This shouldn't significantly affect model performance, but it may explain very minor changes in the outputs of pipelines that use DreamSim with normalize_embeds=True.
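
If you cache embeddings across versions, a quick drift check can tell you whether they need to be recomputed after upgrading. The sketch below is a generic comparison and does not reproduce the bug or the fix. The helper name, the tolerance, and the example vectors are all hypothetical, and the embeddings are assumed to be available as plain lists of floats.

```python
def max_abs_drift(old_embeds, new_embeds):
    """Largest element-wise absolute difference between paired embeddings."""
    return max(
        abs(old_value - new_value)
        for old_vec, new_vec in zip(old_embeds, new_embeds)
        for old_value, new_value in zip(old_vec, new_vec)
    )


# Hypothetical embeddings: cached before upgrading vs. recomputed afterwards.
cached = [[0.6, 0.8, 0.0], [0.0, 1.0, 0.0]]
recomputed = [[0.6, 0.8, 0.001], [0.0, 1.0, 0.0]]

TOLERANCE = 1e-2  # hypothetical threshold; choose one that suits your pipeline
drift = max_abs_drift(cached, recomputed)
needs_recompute = drift > TOLERANCE
print(f"max drift = {drift:.4f}, recompute cached embeddings: {needs_recompute}")
```

An element-wise check is used here rather than cosine distance because cosine distance is invariant to rescaling and could hide a change in how embeddings are normalized.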