Paper Revision{2023.acl-long.223}, closes #3895.

acl-org · Sep 18, 2024 · d9fc9ef
1 parent 4f52bdc
commit d9fc9ef
Showing 1 changed file with 3 additions and 1 deletion.
diff --git a/data/xml/2023.acl.xml b/data/xml/2023.acl.xml
@@ -3128,10 +3128,12 @@
       <author><first>Anette</first><last>Frank</last><affiliation>Heidelberg University</affiliation></author>
       <pages>4032-4059</pages>
       <abstract>Vision and language models (VL) are known to exploit unrobust indicators in individual modalities (e.g., introduced by distributional biases) instead of focusing on relevant information in each modality. That a unimodal model achieves similar accuracy on a VL task to a multimodal one, indicates that so-called unimodal collapse occurred. However, accuracy-based tests fail to detect e.g., when the model prediction is wrong, while the model used relevant information from a modality. Instead, we propose MM-SHAP, a performance-agnostic multimodality score based on Shapley values that reliably quantifies in which proportions a multimodal model uses individual modalities. We apply MM-SHAP in two ways: (1) to compare models for their average degree of multimodality, and (2) to measure for individual models the contribution of individual modalities for different tasks and datasets. Experiments with six VL models – LXMERT, CLIP and four ALBEF variants – on four VL tasks highlight that unimodal collapse can occur to different degrees and in different directions, contradicting the wide-spread assumption that unimodal collapse is one-sided. Based on our results, we recommend MM-SHAP for analysing multimodal tasks, to diagnose and guide progress towards multimodal integration. Code available at <url>https://github.com/Heidelberg-NLP/MM-SHAP</url>.</abstract>
-      <url hash="86b9cb56">2023.acl-long.223</url>
+      <url hash="b1be1a8c">2023.acl-long.223</url>
       <bibkey>parcalabescu-frank-2023-mm</bibkey>
       <doi>10.18653/v1/2023.acl-long.223</doi>
       <video href="2023.acl-long.223.mp4"/>
+      <revision id="1" href="2023.acl-long.223v1" hash="86b9cb56"/>
+      <revision id="2" href="2023.acl-long.223v2" hash="b1be1a8c" date="2024-09-18">This revision mentions a sponsor in the Acknowledgments section and rectifies the line below Eq. (1).</revision>
     </paper>
     <paper id="224">
       <title>Towards Boosting the Open-Domain Chatbot with Human Feedback</title>