Sync/v4.12.x #255
Merged
* Keras callback to push to hub each epoch, or after N steps
* Reworked the callback to use Repository
* Use an Enum for save_strategy
* Correct type for tokenizer
* Apply review suggestions to src/transformers/keras_callbacks.py
* Add a message for the final upload
* Change how we wait for the last process to finish
* is_done is a property, not a method
* Docstrings and documentation
* Docstring reformat and rewrite
* Style passes
* Replace print with the internal logger

Co-authored-by: Sylvain Gugger <[email protected]>
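The commit series above describes a callback that pushes checkpoints either every epoch or every N steps, with the strategy held in an Enum. A minimal sketch of that control flow in plain Python (the class name, hook names, and the `push_fn` stand-in for the actual Repository upload are all hypothetical, not the transformers API):

```python
from enum import Enum


class SaveStrategy(Enum):
    """When to push a checkpoint: once per epoch, or every N steps."""
    EPOCH = "epoch"
    STEPS = "steps"


class PushToHubCallback:
    """Hypothetical sketch of an epoch/steps push callback."""

    def __init__(self, save_strategy=SaveStrategy.EPOCH, save_steps=None, push_fn=print):
        if isinstance(save_strategy, str):
            # Accept "epoch"/"steps" strings and normalize them to the Enum.
            save_strategy = SaveStrategy(save_strategy.lower())
        self.save_strategy = save_strategy
        self.save_steps = save_steps
        self.push_fn = push_fn  # stands in for the Repository upload
        self.global_step = 0

    def on_train_batch_end(self, batch):
        self.global_step += 1
        if self.save_strategy is SaveStrategy.STEPS and self.global_step % self.save_steps == 0:
            self.push_fn(f"step-{self.global_step}")

    def on_epoch_end(self, epoch):
        if self.save_strategy is SaveStrategy.EPOCH:
            self.push_fn(f"epoch-{epoch}")
```

Using an Enum here means an invalid strategy string fails loudly at construction time rather than silently never pushing.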
Fix LayoutLM ONNX test error
* Enable README link synchronization
* Style
* Reuse regex pattern
* Apply suggestions
* Update
* Fix length of IterableDatasetShard and add test
* Add comments
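The length fix above concerns how many examples a single shard of an iterable dataset reports. As a simplified model (a hypothetical helper; the real IterableDatasetShard also accounts for per-process batch sizes), the length of one shard under round-robin sharding can be computed as:

```python
def shard_length(dataset_len, num_shards, shard_index, drop_last=False):
    """Length of one shard when examples are dealt out round-robin.

    With drop_last=True every shard yields the same number of examples
    (the remainder is dropped); otherwise the first `dataset_len % num_shards`
    shards each receive one extra example.
    """
    if drop_last:
        return dataset_len // num_shards
    extra = 1 if shard_index < dataset_len % num_shards else 0
    return dataset_len // num_shards + extra
```

The key property the fix guards is that the per-shard lengths sum back to the full dataset length when nothing is dropped.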
* Add a note about the tokenizer
* Add tips for loading the model with less RAM
* Fix links
…768)
* missing requirement
* list both
* Use Repository for push_to_hub
* Update the README and the other Flax scripts
* Update the QA example
* Fix push_to_hub call
* Fix typos
* Use absolute path to get the repo name
* Fix GLUE script
* Init MultiBerts checkpoint conversion script
* Rename conversion script
* Fix MultiBerts conversion script
* Apply suggestions from code review

Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
* Update
* Add to docs and init
* make fix-copies
…g=True" (#13829)
* Removed wrong warning
* Raise a warning when `max_length` is given with wrong `truncation`
* Update the error message
* Update the warning message

Co-authored-by: Sylvain Gugger <[email protected]>
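The warning described above fires when `max_length` is supplied but truncation is disabled, since the length cap would otherwise be silently ignored. A rough sketch of that check (hypothetical function name, not the tokenizer's actual code path):

```python
import warnings


def check_truncation_args(max_length=None, truncation=False):
    """Warn when `max_length` is passed but truncation is disabled,
    because the length cap would then be silently ignored."""
    if max_length is not None and not truncation:
        warnings.warn(
            "`max_length` has no effect when truncation is not enabled; "
            "pass `truncation=True` to actually truncate to `max_length`.",
            UserWarning,
        )
```

Warning rather than raising keeps existing pipelines running while still surfacing the likely mistake.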
* Restore broken merge
* Additional args, DDP, remove CommonLanguage
* Update examples for V100, add training results
* Style
* Apply suggestions from code review
* Remove custom datasets for simplicity
* Add the attention_mask flag, reorganize README

Co-authored-by: Sylvain Gugger <[email protected]>
In BartForConditionalGeneration.forward, if labels are provided, decoder_input_ids are set to the labels shifted to the right. This is problematic: if decoder_inputs_embeds is also set, the call to self.model, which eventually reaches BartDecoder.forward, will raise an error. The fix is simple and mirrors what BartModel.forward already does: decoder_input_ids should not be computed from the labels when decoder_inputs_embeds is provided.

Co-authored-by: Silviu Vlad Oprea <[email protected]>
* Add layer-wise scaling
* Add reorder & upcasting argument
* Add OpenAI GPT-2 weight initialization scheme
* Start `layer_idx` count at zero for consistency
* Disentangle attn and the reordered/upcast attn function
* Rename `scale_attn_by_layer` to `scale_attn_by_layer_id`, then to `scale_attn_by_layer_idx`; add tip
* Make autocast from amp compatible with pytorch<1.6
* Docstring, whitespace, and style fixes
* Add fixes from PR feedback
* First pass at scale_attn_by_layer_idx and reorder_and_upcast_attn tests
* Add test for weight initialization; check weights are fp32
* Fix incorrect merge
* Fix shape mismatch in baddbmm
* Add generation test for Mistral flags

Co-authored-by: leandro <[email protected]>
Co-authored-by: Keshav Santhanam <[email protected]>
Co-authored-by: J38 <[email protected]>
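The layer-wise scaling in the first bullet divides attention logits by the 1-based layer index on top of the usual 1/sqrt(head_dim) factor (layer indices start at zero, per the consistency bullet). A scalar sketch (hypothetical function; the real implementation operates on tensors inside the attention module):

```python
import math


def scaled_attention_logits(q_dot_k, head_dim, layer_idx,
                            scale_attn_by_inverse_layer_idx=True):
    """Scale a raw q.k product by 1/sqrt(head_dim), and optionally by
    1/(layer_idx + 1), which keeps logits smaller in deeper layers and
    helps avoid fp16 overflow."""
    logits = q_dot_k / math.sqrt(head_dim)
    if scale_attn_by_inverse_layer_idx:
        logits = logits / float(layer_idx + 1)
    return logits
```

Because the extra factor grows with depth, deeper layers see proportionally damped logits before the softmax.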
* First draft
* Make tuple output more readable
* Replace assertions by value errors
* Make it possible to predict_with_generate for vision and speech models
* Adapt Seq2SeqTrainer to work with VisionEncoderDecoder/SpeechEncoderDecoder
* Add deprecation warning
* Add "copied from" statements to vision and speech encoder decoders
* Fix failing test
* Apply @patrickvonplaten's suggestion
* Use reshape instead of view for consistency
* Fix docs
* Apply suggestions from review and fix bug
* Torch 1.10
* torch scatter for 1.10
* Style
* Skip tests
* Fix image segmentation for inference mode
* Update src/transformers/pipelines/base.py

Co-authored-by: Patrick von Platen <[email protected]>
…241)
* Fix issue #13327: wrong weight initialization for TF T5 model
* Run black formatter
* Fix typo
* Remove my name tag from comments

Co-authored-by: Shirron <[email protected]>
…ds (#14361)
* Experiment with adding proper get_config() and from_config() methods
* Add a test for get/from config
* Fix test for get/from config
…d (#14407)
* [Wav2Vec2] Make sure that gradient checkpointing is only run if needed
* make fix-copies
* Fix gradient_checkpointing backward compatibility
* Remove needless line
* Make sure mask prob is big enough and length small enough
* Fix tests

Co-authored-by: patrickvonplaten <[email protected]>
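The backward-compatibility fix keeps older configs that set gradient_checkpointing working while steering users toward the newer method-based API. A generic sketch of that pattern (the class and dict-based config here are hypothetical; `gradient_checkpointing_enable` is the method name transformers settled on):

```python
import warnings


class TinyModel:
    """Hypothetical model honoring a deprecated config flag: the old
    `gradient_checkpointing` entry still enables the feature, but emits
    a deprecation warning and routes through the new method."""

    def __init__(self, config):
        self.gradient_checkpointing = False
        if config.get("gradient_checkpointing", False):
            warnings.warn(
                "Passing `gradient_checkpointing` via the config is deprecated; "
                "call `model.gradient_checkpointing_enable()` instead.",
                FutureWarning,
            )
            self.gradient_checkpointing_enable()

    def gradient_checkpointing_enable(self):
        self.gradient_checkpointing = True
```

Old configs keep working unchanged, while the warning nudges callers onto the supported path before the flag is removed.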
…() methods (#14361)" This reverts commit e99a231.