Add ability to switch output languages for multilingual models #69
SamDewriter pushed 37 commits to SamDewriter/SimulEval that referenced this issue between Aug 18, 2023 and Sep 20, 2023
ibanesh pushed a commit that referenced this issue on Sep 21, 2023
* Testing Circleci on main
* Testing Circleci on main
* Testing Circleci on main
* Testing Circleci on main
* Testing Circleci on main
* correct Circle config
* correct Circle config
* correct Circle config
* correct Circle config
* Revert "[demo] s2t + s2s agent pipelines (#58)" This reverts commit 075c4d3.
* add target language
* add target language as a parameter
* Test dynamic language
* Switch language dynamically
* Add ability to switch output language (#69)
* Add tgt language argument
* Add Namespace to args argument (#69)
* Modify code to read target language from a file
* Add ability to switch input language
* Add a tgt-lang file to test
* Add tgt_lang to instance
* Add tgt_lang to AgentStates
* States
* Add tgt_lang from state to test (#69)
* Target language to test
* Format with Black (#69)
* Delete circleci (69)
* Remove unused tgt_lang (#69)
* Refactor tgt_lang
* Remove tgt_lang (#69)
* Remove tgt_lang from S2S and Y2T Dataloaders (#69)
* Remove tgt_lang from S2S (#69)
* Format with Black (#69)
* Add tgt_lang to S2S dataloader to pass test(#69)
* Add tgt_lang to S2S dataloader to pass test(#69)
* Add tgt_lang to S2S dataloader to pass test(#69)
* Change tgt_lang to es (#69)
* Add tgt_lang to test suites
* Add tgt_lang to test suites
* Fix tgt-lang issue (#69)
* format with black
* Add tgt-lang arg (#69)
* (#69)
* Add tgt-lang (#69)
* Change instance prediction (#69)
* Format (#69)
* Add tgt_lang argument
* Resolve tgt_lang (#69)
* Remove tgt-lang argument (#69)
* Remove tgt-lang argument (#69)
* Add tgt-lang arg to dataloader
* Preprocess tgt-lang (#69)
* Handle tgt-lang list (#69)
* Format with black (#69)
* Testing Circleci on main
* Testing Circleci on main
* Testing Circleci on main
* Testing Circleci on main
* Testing Circleci on main
* correct Circle config
* correct Circle config
* correct Circle config
* correct Circle config
* Move tgt-lang to DataLoader (#69)
* Rewrite tgt_lang (#69)
* Initialize tgt-lang (#69)
* Format with black
* Fix tgt-lang (#69)
* Correct tgt_lang logic
* Resolve merge conflict
* remove tgt_lang check to reduce redundancy (#69)
* Lint with black (#69)
* Import from typing (#69)
* Lint (#69)
* Handle when tgt_lang is not for s2s (#69)
* Remove comments (#69)
* Add check for tgt_lang s2t (#69)
* Format with black (#69)
---------
Co-authored-by: Mubaraq Sani <{ID}+{username}@users.noreply.github.com>
Context:
Currently, when loading pipelines for multilingual models, we use a runtime CLI option to set the target language for the translation output.
This is emulated in the dummy model that was recently added:
https://github.com/facebookresearch/SimulEval/blob/main/examples/speech_to_text/counter_in_tgt_lang_agent.py#L22-L24
When used in the demo in its current state, `tgt_lang` needs to be set in `vad_main.yaml` and loaded from there when building the model.
The issue with passing the target language this way is that we have to reload the pipeline/model every time the target language changes, which adds a lot of unnecessary overhead.
To Do:
To avoid the overhead with reloading and to effectively showcase the capabilities of multilingual models, we would like to be able to pass in the target language to the models dynamically.
Refactor the current target language passing mechanism to make it dynamic.
Hint: passing it through the input segment and then through the agent states could work.
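The hint above can be sketched roughly as follows. This is a minimal, self-contained illustration and not SimulEval's actual implementation: `SketchSegment`, `SketchStates`, and `SketchCounterAgent` are hypothetical stand-ins for the real segment, states, and agent classes, and the output format is made up for the example. The point is only that the language rides along with each input segment, so the agent is built once and never reloaded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SketchSegment:
    """Hypothetical input segment; not SimulEval's actual segment class."""
    content: str
    finished: bool = False
    tgt_lang: Optional[str] = None  # target language travels with the data


@dataclass
class SketchStates:
    """Hypothetical agent states that remember the latest target language."""
    source: List[str] = field(default_factory=list)
    source_finished: bool = False
    tgt_lang: Optional[str] = None

    def update_source(self, segment: SketchSegment) -> None:
        self.source.append(segment.content)
        self.source_finished = segment.finished
        # Only overwrite when the segment actually specifies a language.
        if segment.tgt_lang is not None:
            self.tgt_lang = segment.tgt_lang


class SketchCounterAgent:
    """Toy agent that is built once and reports '<count> in <lang>'."""

    def __init__(self, default_tgt_lang: str = "en") -> None:
        self.default_tgt_lang = default_tgt_lang
        self.states = SketchStates()

    def push(self, segment: SketchSegment) -> str:
        self.states.update_source(segment)
        lang = self.states.tgt_lang or self.default_tgt_lang
        return f"{len(self.states.source)} in {lang}"


agent = SketchCounterAgent()
outputs = [
    agent.push(SketchSegment("hello", tgt_lang="es")),
    agent.push(SketchSegment("world")),                 # keeps "es"
    agent.push(SketchSegment("again", tgt_lang="de")),  # switches, no reload
]
print(outputs)
```

Keeping the language in the states (rather than on the agent itself) means a segment without a language simply inherits the previous one, which is one possible design choice among several.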
Additionally, we would like to allow specifying multiple target languages (as opposed to the single one shown in the dummy model). This would eventually help us design pipelines for simultaneously translating to multiple languages and for getting ASR output.
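One way to picture the multiple-target-language case is a fan-out over a list of languages. The sketch below is an assumption-laden illustration: `fake_translate` and `translate_to_many` are hypothetical names, and the "model" only tags the text with the requested language instead of translating it.

```python
from typing import Callable, Dict, List


def fake_translate(text: str, tgt_lang: str) -> str:
    """Stand-in for a multilingual model call; it only tags the text."""
    return f"[{tgt_lang}] {text}"


def translate_to_many(
    text: str,
    tgt_langs: List[str],
    translate: Callable[[str, str], str] = fake_translate,
) -> Dict[str, str]:
    """Return one output per requested target language, keyed by language."""
    unique_langs = list(dict.fromkeys(tgt_langs))  # drop repeats, keep order
    return {lang: translate(text, lang) for lang in unique_langs}


results = translate_to_many("hello", ["es", "de", "es"])
print(results)
```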
There are 2 parts to this issue:
Hints/Pointers:
cd SimulEval/examples/speech_to_text
simuleval --agent counter_in_tgt_lang_agent.py --user-dir . --agent-class agents.CounterInTargetLanguageAgent --source-segment-size 1000 --source source.txt --target reference/en.txt --output <path to output folder> --tgt-lang es
Check the `instances.log` file under the output folder. You should be able to see the output of the pipeline there as "predictions". Try changing the `tgt-lang` value and observe how the output changes accordingly.
`tgt_lang` can be inferred from the source dataset and passed along via the input segment and states, ultimately reaching the agent that uses it.
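As a rough sketch of inferring the language from the dataset, the helper below pairs each source sample with a target language. Everything here is hypothetical: `attach_tgt_lang` is not a SimulEval function, and the convention that a single language applies to all samples while a longer list gives one language per sample is an assumption made for the example.

```python
from typing import List, Tuple


def attach_tgt_lang(sources: List[str], tgt_langs: List[str]) -> List[Tuple[str, str]]:
    """Pair every source sample with its target language.

    Assumption: one entry in tgt_langs applies to all samples; otherwise
    there must be exactly one language per sample.
    """
    if len(tgt_langs) == 1:
        tgt_langs = tgt_langs * len(sources)
    if len(tgt_langs) != len(sources):
        raise ValueError(
            f"expected 1 or {len(sources)} target languages, got {len(tgt_langs)}"
        )
    return list(zip(sources, tgt_langs))


sources = ["first sample", "second sample", "third sample"]
per_sample = attach_tgt_lang(sources, ["es", "de", "fr"])
broadcast = attach_tgt_lang(sources, ["es"])
print(per_sample)
print(broadcast)
```

Each `(source, tgt_lang)` pair could then be wrapped into an input segment so the language reaches the agent through its states.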