
Sync/v4.12.x #255

Merged
merged 219 commits into adapter-hub:master on Dec 8, 2021

Conversation

calpt (Member) commented on Dec 1, 2021

No description provided.

LysandreJik and others added 30 commits September 27, 2021 14:19
* Keras callback to push to hub each epoch, or after N steps

* Reworked the callback to use Repository

* Use an Enum for save_strategy

* Style pass

* Correct type for tokenizer

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Adding print message to the final upload

* Adding print message to the final upload

* Change how we wait for the last process to finish

* is_done is a property, not a method, derp

* Docstrings and documentation

* Style pass

* Style edit

* Docstring reformat

* Docstring rewrite

* Replacing print with internal logger

Co-authored-by: Sylvain Gugger <[email protected]>
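The commits above describe a Keras callback that pushes checkpoints to the hub either once per epoch or after every N steps, with the strategy held in an Enum. A minimal, framework-free sketch of that dispatch logic (the class and attribute names here are illustrative, not the actual `keras_callbacks.py` API):

```python
class PushToHubSketch:
    """Illustrative stand-in for the push-to-hub callback described above."""

    def __init__(self, save_strategy="epoch", save_steps=None):
        # save_strategy is one of "epoch" or "steps" (the commits mention an Enum)
        self.save_strategy = save_strategy
        self.save_steps = save_steps
        self.pushes = []  # record of (trigger, index) pairs instead of real uploads

    def on_train_batch_end(self, step):
        # under the "steps" strategy, push after every save_steps batches
        if self.save_strategy == "steps" and self.save_steps and (step + 1) % self.save_steps == 0:
            self.pushes.append(("step", step))

    def on_epoch_end(self, epoch):
        # under the default "epoch" strategy, push once at the end of each epoch
        if self.save_strategy == "epoch":
            self.pushes.append(("epoch", epoch))
```

The real callback additionally waits for the final upload to finish and reports it via the internal logger, as the later commits in this group note.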
Fix LayoutLM ONNX test error
* Enable readme link synchronization

* Style

* Reuse regex pattern

* Apply suggestions

* Update
* Fix length of IterableDatasetShard and add test

* Add comments
* add a note about tokenizer

* add tips to load a model in less RAM

* fix link

* fix more links
* use Repository for push_to_hub

* update readme

* update other flax scripts

* update readme

* update qa example

* fix push_to_hub call

* fix typo

* fix more typos

* update readme

* use absolute path to get repo name

* fix glue script
* Init multibert checkpoint conversion script

* Rename conversion script

* Fix MultiBerts Conversion Script

* Apply suggestions from code review

Co-authored-by: NielsRogge <[email protected]>

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
* update

* add to docs and init

* make fix-copies
…g=True" (#13829)

* Removed wrong warning

* Raise a warning when `max_length` is given with wrong `truncation`

* Update the error message

* Update the warning message

Co-authored-by: Sylvain Gugger <[email protected]>

Co-authored-by: Sylvain Gugger <[email protected]>
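The change above raises a warning when `max_length` is passed while truncation is disabled, since the argument would otherwise be silently ignored. A hedged, standalone sketch of that check (the real message and accepted strategy values live in the tokenizer code and may differ):

```python
import warnings


def check_truncation(max_length=None, truncation=False):
    # Warn when max_length is supplied but truncation is off: without a
    # truncation strategy, max_length has no effect on the encoded output.
    if max_length is not None and truncation in (False, "do_not_truncate"):
        warnings.warn(
            "`max_length` has no effect when truncation is disabled; "
            "pass an explicit truncation strategy to actually truncate."
        )
```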
* Restore broken merge

* Additional args, DDP, remove CommonLanguage

* Update examples for V100, add training results

* Style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* Remove custom datasets for simplicity, apply suggestions from code review

* Add the attention_mask flag, reorganize README

Co-authored-by: Sylvain Gugger <[email protected]>
In BartForConditionalGeneration.forward, if labels are provided,
   decoder_input_ids are set to the labels shifted to the right.
   This is problematic: if decoder_inputs_embeds is also set,
   the call to self.model, which eventually gets to BartDecoder.forward,
   will raise an error.
   The fix is quite simple, similar to what is there already in
   BartModel.forward. Mainly, we should not
   compute decoder_input_ids if decoder_inputs_embeds is provided.

Co-authored-by: Silviu Vlad Oprea <[email protected]>
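A minimal sketch of the guard described above, using plain Python lists in place of tensors (the real fix in `BartForConditionalGeneration.forward` calls `shift_tokens_right` on tensors; the helper below is illustrative):

```python
def prepare_decoder_inputs(labels, decoder_input_ids=None,
                           decoder_inputs_embeds=None, decoder_start_token_id=2):
    # Only derive decoder_input_ids from the labels when the caller supplied
    # neither decoder_input_ids nor decoder_inputs_embeds; otherwise the
    # downstream decoder would receive both and raise an error.
    if labels is not None and decoder_input_ids is None and decoder_inputs_embeds is None:
        # shift labels one position to the right, prepending the start token
        decoder_input_ids = [decoder_start_token_id] + list(labels[:-1])
    return decoder_input_ids, decoder_inputs_embeds
```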
* Add layer-wise scaling

* Add reorder & upcasting argument

* Add OpenAI GPT-2 weight initialization scheme

* start `layer_idx` count at zero for consistency

* disentangle attn and reordered and upscaled attn function

* rename `scale_attn_by_layer` to `scale_attn_by_layer_id`

* make autocast from amp compatible with pytorch<1.6

* fix docstring

* style fixes

* Add fixes from PR feedback, style tweaks

* Fix doc whitespace

* Reformat

* First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests

* Rename scale_attn_by_layer_idx, add tip

* Remove extra newline

* add test for weight initialization

* update code format

* add assert check weights are fp32

* remove assert

* Fix incorrect merge

* Fix shape mismatch in baddbmm

* Add generation test for Mistral flags

Co-authored-by: leandro <[email protected]>
Co-authored-by: Keshav Santhanam <[email protected]>
Co-authored-by: J38 <[email protected]>
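The layer-wise scaling these commits add divides each layer's attention logits by `layer_idx + 1` (counting from zero, per the commit above) on top of the usual `1/sqrt(head_dim)` factor, to keep attention magnitudes stable in deep networks. A hedged single-score sketch, assuming that semantics:

```python
import math


def attn_logit(q, k, layer_idx, scale_by_inverse_layer_idx=True):
    # standard scaled dot-product logit for one query/key pair
    logit = sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
    if scale_by_inverse_layer_idx:
        # extra per-layer damping: divide by (layer_idx + 1), layer_idx from zero
        logit /= float(layer_idx + 1)
    return logit
```

The reorder-and-upcast change mentioned alongside it is orthogonal: it computes the attention matmul in fp32 for numerical stability, which the sketch above does not model.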
NielsRogge and others added 24 commits October 28, 2021 15:29
* First draft

* Make tuple output more readable

* Replace assertions by value errors

* Make it possible to predict_with_generate for vision and speech models

* Adapt Seq2SeqTrainer to work with VisionEncoderDecoder/SpeechEncoderDecoder

* Add deprecation warning

* Add copied from statements to vision and speech encoder decoders

* Fix failing test

* Apply @patrickvonplaten's suggestion

* Use reshape instead of view for consistency
* Fix docs

* Apply suggestions from review + fix bug
* Torch 1.10

* torch scatter for 1.10

* style

* Skip tests
ok
* Fixing image segmentation for inference mode.

* Update src/transformers/pipelines/base.py

Co-authored-by: Patrick von Platen <[email protected]>

Co-authored-by: Patrick von Platen <[email protected]>
…241)

* Fix of issue #13327: Wrong weight initialization for TF t5 model

* run black formatter

* fix typo

* remove my name tag from comments

Co-authored-by: Shirron <[email protected]>
…ds (#14361)

* Experimenting with adding proper get_config() and from_config() methods

* Adding a test for get/from config

* Fix test for get/from config
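The `get_config()`/`from_config()` pair added here follows the standard Keras serialization contract: `get_config` returns the constructor arguments as a plain dict, and `from_config` rebuilds an equivalent object from them. A minimal framework-free sketch of that round trip (class and attribute names are illustrative):

```python
class ConfigurableLayer:
    def __init__(self, units=8, activation="relu"):
        self.units = units
        self.activation = activation

    def get_config(self):
        # serialize everything needed to reconstruct the object
        return {"units": self.units, "activation": self.activation}

    @classmethod
    def from_config(cls, config):
        # rebuild an equivalent instance from the serialized arguments
        return cls(**config)
```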
…d (#14407)

* [Wav2Vec2] Make sure that gradient checkpointing is only run if needed

* make fix-copies
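A hedged sketch of the guard this change describes: the gradient-checkpointing path should only be taken when it is enabled and the module is actually training; in eval mode the plain forward call is used (function names below are illustrative):

```python
def run_layer(layer_fn, hidden_states, gradient_checkpointing=False, training=False):
    if gradient_checkpointing and training:
        # in the real model this would call torch.utils.checkpoint.checkpoint(layer_fn, ...)
        return "checkpointed", layer_fn(hidden_states)
    # inference, or checkpointing disabled: plain forward call
    return "plain", layer_fn(hidden_states)
```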
* Fix gradient_checkpointing backward compatibility

* Remove needless line

* make sure mask prob is big enough and length small enough

* Fix tests

Co-authored-by: patrickvonplaten <[email protected]>
calpt added the sync label on Dec 1, 2021
calpt marked this pull request as ready for review on Dec 8, 2021, 13:45
calpt merged commit 2b7de67 into adapter-hub:master on Dec 8, 2021
calpt deleted the sync/v4.12.5 branch on Dec 8, 2021, 16:44