Sync/v4.12.x #255
Merged
* Keras callback to push to hub each epoch, or after N steps
* Reworked the callback to use Repository
* Use an Enum for save_strategy
* Correct type for tokenizer
* Apply review suggestions to src/transformers/keras_callbacks.py
* Add a message for the final upload
* Change how we wait for the last process to finish
* is_done is a property, not a method
* Docstrings and documentation
* Docstring reformat and rewrite
* Style passes
* Replace print with the internal logger

Co-authored-by: Sylvain Gugger <[email protected]>
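The commit series above describes a callback that pushes checkpoints either every epoch or every N steps, with the strategy held in an Enum. A minimal sketch of that control flow in plain Python (the class name, hook names, and the `push_fn` stand-in for the actual Repository upload are all hypothetical, not the transformers API):

```python
from enum import Enum


class SaveStrategy(Enum):
    """When to push a checkpoint: once per epoch, or every N steps."""
    EPOCH = "epoch"
    STEPS = "steps"


class PushToHubCallback:
    """Hypothetical sketch of an epoch/steps push callback."""

    def __init__(self, save_strategy=SaveStrategy.EPOCH, save_steps=None, push_fn=print):
        if isinstance(save_strategy, str):
            # Accept "epoch"/"steps" strings and normalize them to the Enum.
            save_strategy = SaveStrategy(save_strategy.lower())
        self.save_strategy = save_strategy
        self.save_steps = save_steps
        self.push_fn = push_fn  # stands in for the Repository upload
        self.global_step = 0

    def on_train_batch_end(self, batch):
        self.global_step += 1
        if self.save_strategy is SaveStrategy.STEPS and self.global_step % self.save_steps == 0:
            self.push_fn(f"step-{self.global_step}")

    def on_epoch_end(self, epoch):
        if self.save_strategy is SaveStrategy.EPOCH:
            self.push_fn(f"epoch-{epoch}")
```

Using an Enum here means an invalid strategy string fails loudly at construction time rather than silently never pushing.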
Fix LayoutLM ONNX test error
* Enable README link synchronization
* Style
* Reuse regex pattern
* Apply suggestions
* Update
* Fix length of IterableDatasetShard and add test
* Add comments
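The length fix above concerns how many examples a single shard of an iterable dataset reports. As a simplified model (a hypothetical helper; the real IterableDatasetShard also accounts for per-process batch sizes), the length of one shard under round-robin sharding can be computed as:

```python
def shard_length(dataset_len, num_shards, shard_index, drop_last=False):
    """Length of one shard when examples are dealt out round-robin.

    With drop_last=True every shard yields the same number of examples
    (the remainder is dropped); otherwise the first `dataset_len % num_shards`
    shards each receive one extra example.
    """
    if drop_last:
        return dataset_len // num_shards
    extra = 1 if shard_index < dataset_len % num_shards else 0
    return dataset_len // num_shards + extra
```

The key property the fix guards is that the per-shard lengths sum back to the full dataset length when nothing is dropped.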
* Add a note about the tokenizer
* Add tips for loading the model with less RAM
* Fix links
…768)
* missing requirement
* list both
* Use Repository for push_to_hub
* Update the README and the other Flax scripts
* Update the QA example
* Fix push_to_hub call
* Fix typos
* Use absolute path to get the repo name
* Fix GLUE script
* Init MultiBerts checkpoint conversion script
* Rename conversion script
* Fix MultiBerts conversion script
* Apply suggestions from code review

Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
* Update
* Add to docs and init
* make fix-copies
…g=True" (#13829)
* Removed wrong warning
* Raise a warning when `max_length` is given with wrong `truncation`
* Update the error message
* Update the warning message

Co-authored-by: Sylvain Gugger <[email protected]>
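The warning described above fires when `max_length` is supplied but truncation is disabled, since the length cap would otherwise be silently ignored. A rough sketch of that check (hypothetical function name, not the tokenizer's actual code path):

```python
import warnings


def check_truncation_args(max_length=None, truncation=False):
    """Warn when `max_length` is passed but truncation is disabled,
    because the length cap would then be silently ignored."""
    if max_length is not None and not truncation:
        warnings.warn(
            "`max_length` has no effect when truncation is not enabled; "
            "pass `truncation=True` to actually truncate to `max_length`.",
            UserWarning,
        )
```

Warning rather than raising keeps existing pipelines running while still surfacing the likely mistake.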
* Restore broken merge
* Additional args, DDP, remove CommonLanguage
* Update examples for V100, add training results
* Style
* Apply suggestions from code review
* Remove custom datasets for simplicity
* Add the attention_mask flag, reorganize README

Co-authored-by: Sylvain Gugger <[email protected]>
In BartForConditionalGeneration.forward, if labels are provided, decoder_input_ids are set to the labels shifted to the right. This is problematic: if decoder_inputs_embeds is also set, the call to self.model, which eventually reaches BartDecoder.forward, will raise an error. The fix is simple and mirrors what BartModel.forward already does: decoder_input_ids should not be computed from the labels when decoder_inputs_embeds is provided.

Co-authored-by: Silviu Vlad Oprea <[email protected]>
* Add layer-wise scaling
* Add reorder & upcasting argument
* Add OpenAI GPT-2 weight initialization scheme
* Start `layer_idx` count at zero for consistency
* Disentangle attn and the reordered/upcast attn function
* Rename `scale_attn_by_layer` to `scale_attn_by_layer_id`, then to `scale_attn_by_layer_idx`; add tip
* Make autocast from amp compatible with pytorch<1.6
* Docstring, whitespace, and style fixes
* Add fixes from PR feedback
* First pass at scale_attn_by_layer_idx and reorder_and_upcast_attn tests
* Add test for weight initialization; check weights are fp32
* Fix incorrect merge
* Fix shape mismatch in baddbmm
* Add generation test for Mistral flags

Co-authored-by: leandro <[email protected]>
Co-authored-by: Keshav Santhanam <[email protected]>
Co-authored-by: J38 <[email protected]>
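The layer-wise scaling in the first bullet divides attention logits by the 1-based layer index on top of the usual 1/sqrt(head_dim) factor (layer indices start at zero, per the consistency bullet). A scalar sketch (hypothetical function; the real implementation operates on tensors inside the attention module):

```python
import math


def scaled_attention_logits(q_dot_k, head_dim, layer_idx,
                            scale_attn_by_inverse_layer_idx=True):
    """Scale a raw q.k product by 1/sqrt(head_dim), and optionally by
    1/(layer_idx + 1), which keeps logits smaller in deeper layers and
    helps avoid fp16 overflow."""
    logits = q_dot_k / math.sqrt(head_dim)
    if scale_attn_by_inverse_layer_idx:
        logits = logits / float(layer_idx + 1)
    return logits
```

Because the extra factor grows with depth, deeper layers see proportionally damped logits before the softmax.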
* First draft
* Make tuple output more readable
* Replace assertions by value errors
* Make it possible to predict_with_generate for vision and speech models
* Adapt Seq2SeqTrainer to work with VisionEncoderDecoder/SpeechEncoderDecoder
* Add deprecation warning
* Add "copied from" statements to vision and speech encoder decoders
* Fix failing test
* Apply @patrickvonplaten's suggestion
* Use reshape instead of view for consistency
* Fix docs
* Apply suggestions from review and fix bug
* Torch 1.10
* torch scatter for 1.10
* Style
* Skip tests
* Fix image segmentation for inference mode
* Update src/transformers/pipelines/base.py

Co-authored-by: Patrick von Platen <[email protected]>
…241)
* Fix issue #13327: wrong weight initialization for TF T5 model
* Run black formatter
* Fix typo
* Remove my name tag from comments

Co-authored-by: Shirron <[email protected]>
…ds (#14361)
* Experiment with adding proper get_config() and from_config() methods
* Add a test for get/from config
* Fix test for get/from config
…d (#14407)
* [Wav2Vec2] Make sure that gradient checkpointing is only run if needed
* make fix-copies
* Fix gradient_checkpointing backward compatibility
* Remove needless line
* Make sure mask prob is big enough and length small enough
* Fix tests

Co-authored-by: patrickvonplaten <[email protected]>
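The backward-compatibility fix keeps older configs that set gradient_checkpointing working while steering users toward the newer method-based API. A generic sketch of that pattern (the class and dict-based config here are hypothetical; `gradient_checkpointing_enable` is the method name transformers settled on):

```python
import warnings


class TinyModel:
    """Hypothetical model honoring a deprecated config flag: the old
    `gradient_checkpointing` entry still enables the feature, but emits
    a deprecation warning and routes through the new method."""

    def __init__(self, config):
        self.gradient_checkpointing = False
        if config.get("gradient_checkpointing", False):
            warnings.warn(
                "Passing `gradient_checkpointing` via the config is deprecated; "
                "call `model.gradient_checkpointing_enable()` instead.",
                FutureWarning,
            )
            self.gradient_checkpointing_enable()

    def gradient_checkpointing_enable(self):
        self.gradient_checkpointing = True
```

Old configs keep working unchanged, while the warning nudges callers onto the supported path before the flag is removed.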
…() methods (#14361)" This reverts commit e99a231.