MIMIC-IT, Otter-Image/Video released

@Luodian released this 24 Jun 18:11
· 95 commits to main since this release
  • 🧨 Download MIMIC-IT Dataset. For more details on navigating the dataset, please refer to MIMIC-IT Dataset README.

  • 🏎️ Run Otter Locally. You can run our model locally with at least 16 GB of GPU memory for tasks such as image/video tagging, captioning, and identifying harmful content. We fixed a bug in video inference where frame tensors were mistakenly unsqueezed into an incorrectly shaped vision_x. If you ran into this issue, please try again with the updated version.

    To launch the model, make sure the path passed to sys.path.append("../..") points to the directory from which otter.modeling_otter can be imported, and adjust it if you run the script from a different location.
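
    The path adjustment above can be sketched as follows. This is a minimal example, not code from the repository: the helper name `add_repo_root` and its `levels_up` parameter are hypothetical, and it assumes the launch script sits two directories below the folder that contains the `otter` package (mirroring `"../.."`). Resolving the path from the script's own location, rather than from the current working directory, keeps the import working no matter where the script is started from.

```python
import sys
from pathlib import Path


def add_repo_root(script_path, levels_up=2):
    """Append the directory `levels_up` above the script's folder to sys.path.

    `add_repo_root` and `levels_up` are hypothetical names for illustration.
    levels_up=2 corresponds to sys.path.append("../..") resolved relative to
    the script's folder instead of the current working directory.
    """
    script_dir = Path(script_path).resolve().parent
    # parents[0] is "..", parents[1] is "../..", and so on.
    root = script_dir.parents[levels_up - 1] if levels_up > 0 else script_dir
    root_str = str(root)
    if root_str not in sys.path:  # avoid duplicate entries on repeated calls
        sys.path.append(root_str)
    return root_str


# Typical use at the top of a launch script (assumed layout, adjust as needed):
#   add_repo_root(__file__, levels_up=2)
#   import otter.modeling_otter
```

    If the import still fails, printing the returned path is a quick way to check whether it actually contains the `otter` directory; change `levels_up` to match where your script lives.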