Correction for Paper Metadata: {2024.iwslt-1.20} #3778 #3795

Merged (2 commits) on Aug 22, 2024
5 changes: 5 additions & 0 deletions data/xml/2024.iwslt.xml
@@ -306,7 +306,12 @@
 <title><fixed-case>CMU</fixed-case>’s <fixed-case>IWSLT</fixed-case> 2024 Simultaneous Speech Translation System</title>
 <author><first>Xi</first><last>Xu</last><affiliation>Carnegie Mellon University</affiliation></author>
 <author><first>Siqi</first><last>Ouyang</last><affiliation>Carnegie Mellon University</affiliation></author>
+<author><first>Brian</first><last>Yan</last><affiliation>Carnegie Mellon University</affiliation></author>
+<author><first>Patrick</first><last>Fernandes</last><affiliation>Carnegie Mellon University</affiliation></author>
+<author><first>William</first><last>Chen</last><affiliation>Carnegie Mellon University</affiliation></author>
+<author><first>Lei</first><last>Li</last><affiliation>Carnegie Mellon University</affiliation></author>
+<author><first>Graham</first><last>Neubig</last><affiliation>Carnegie Mellon University</affiliation></author>
 <author><first>Shinji</first><last>Watanabe</last><affiliation>Carnegie Mellon University</affiliation></author>
 <pages>154-159</pages>
 <abstract>This paper describes CMU’s submission to the IWSLT 2024 Simultaneous Speech Translation (SST) task for translating English speech to German text in a streaming manner. Our end-to-end speech-to-text (ST) system integrates the WavLM speech encoder, a modality adapter, and the Llama2-7B-Base model as the decoder. We employ a two-stage training approach: initially, we align the representations of speech and text, followed by full fine-tuning. Both stages are trained on MuST-c v2 data with cross-entropy loss. We adapt our offline ST model for SST using a simple fixed hold-n policy. Experiments show that our model obtains an offline BLEU score of 31.1 and a BLEU score of 29.5 under 2 seconds latency on the MuST-C-v2 tst-COMMON.</abstract>
 <url hash="fd2a633a">2024.iwslt-1.20</url>