fixed typo #14

kornosk · 2018-11-12T01:18:24Z

When test with SQuAD

thomwolf · 2018-11-12T07:36:04Z

Hi,
Thanks for the PR, we don't want to add a shell script to the repo.
I will correct the typo,
Best,
Thom

* update kd qa in roberta modeling * fix issues for kd-quac runner

Remove model specific changes for BERT and DistilBERT

Pop

Summary: This pull request introduce a new way to do sharding which allow weights to be sharded in two dimensional mesh, i.e., (fsdp, tensor), and then the input to be sharded according to the fsdp dimension. To enable it, pass --spmd_tensor_sharding 2, 2 is the tensor dimension, the fsdp dimension will be auto calculated according to num_devices // 2. Test Plan: Test it on a V4-8 with 2B LLaMA.

Add CLIP model

Summary: This pull request introduce a new way to do sharding which allow weights to be sharded in two dimensional mesh, i.e., (fsdp, tensor), and then the input to be sharded according to the fsdp dimension. To enable it, pass --spmd_tensor_sharding 2, 2 is the tensor dimension, the fsdp dimension will be auto calculated according to num_devices // 2. Test Plan: Test it on a V4-8 with 2B LLaMA.

* Cohere Model Release (#1) Cohere Model Release * Remove unnecessary files and code (#2) Some cleanup * Delete cohere-model directory (#3) * Make Fix (#5) * Pr fixes (#6) * fixes for pr * pr fixes for the format * pr fixes for the format * src/transformers/models/auto/tokenization_auto.py * Tokenizer test (#8) * tokenizer test * format fix * Adding Docs and other minor changes (#7) * Add modeling tests (#9) * Smol Fix (#11) * tokenization tests are fixed * format fixes * fix pr doc tests * fix pr doc tests * fix pr doc tests * fix pr style check * small changes in cohere.md * FIX: Address final comments for transformers integration (#13) * fix modeling final nits and add proper test file * for now leave empty tests * add integration test * push new test * fix modeling cohere (#14) * Update chat templates to use the new API (#15) --------- Co-authored-by: ahmetustun <[email protected]> Co-authored-by: Younes Belkada <[email protected]> Co-authored-by: Matt <[email protected]>

…227-patch-1 Update 17_save_load.py

* Cohere Model Release (#1) Cohere Model Release * Remove unnecessary files and code (#2) Some cleanup * Delete cohere-model directory (#3) * Make Fix (#5) * Pr fixes (#6) * fixes for pr * pr fixes for the format * pr fixes for the format * src/transformers/models/auto/tokenization_auto.py * Tokenizer test (#8) * tokenizer test * format fix * Adding Docs and other minor changes (#7) * Add modeling tests (#9) * Smol Fix (#11) * tokenization tests are fixed * format fixes * fix pr doc tests * fix pr doc tests * fix pr doc tests * fix pr style check * small changes in cohere.md * FIX: Address final comments for transformers integration (#13) * fix modeling final nits and add proper test file * for now leave empty tests * add integration test * push new test * fix modeling cohere (#14) * Update chat templates to use the new API (#15) --------- Co-authored-by: ahmetustun <[email protected]> Co-authored-by: Younes Belkada <[email protected]> Co-authored-by: Matt <[email protected]>

adding deformable relative

Fixing gradient checkpointing

kornosk added 2 commits November 11, 2018 20:15

fixed typo

2fd6931

shell scrip for SQuAd test

d050e27

thomwolf closed this Nov 12, 2018

maeotaku mentioned this pull request May 23, 2019

bert->onnx ->caffe2 weird error #633

Closed

HongyanJiao mentioned this pull request Sep 19, 2019

traced_model #1291

Closed

devroy73 mentioned this pull request Nov 10, 2019

Multi GPU dataparallel crash #1779

Closed

4 tasks

stevezheng23 added a commit to stevezheng23/transformers that referenced this pull request Mar 24, 2020

Dev/zheng/quac (huggingface#14)

6cdbcd7

* update kd qa in roberta modeling * fix issues for kd-quac runner

manchandasahil mentioned this pull request Mar 22, 2021

Longformer training : CUDA error: device-side assert triggered #10852

Closed

2 tasks

amathews-amd referenced this pull request in ROCm/transformers Aug 6, 2021

Merge pull request #14 from microsoft/raviskolli/ort

efc9019

Remove model specific changes for BERT and DistilBERT

yananchen1989 mentioned this pull request Aug 20, 2021

Bugs when fine tuning the gpt2 #12965

Closed

jameshennessytempus pushed a commit to jameshennessytempus/transformers that referenced this pull request Jun 1, 2023

Merge pull request huggingface#14 from jamesthesnake/pop

21524d4

Pop

lwmlyy mentioned this pull request Aug 15, 2023

add util for ram efficient loading of model when using fsdp #25107

Merged

1 task

ocavue pushed a commit to ocavue/transformers that referenced this pull request Sep 13, 2023

Merge pull request huggingface#14 from xenova/clip

4fdbc27

Add CLIP model

lcong pushed a commit to lcong/transformers that referenced this pull request Apr 9, 2024

Merge pull request huggingface#14 from kevinlin19910227/kevinlin19910…

63585a0

…227-patch-1 Update 17_save_load.py

LysandreJik pushed a commit to LysandreJik/transformers that referenced this pull request Apr 10, 2024

fix modeling cohere (huggingface#14)

bb7f728

SangbumChoi added a commit to SangbumChoi/transformers that referenced this pull request Aug 22, 2024

Merge pull request huggingface#14 from Superb-AI-Suite/dev/feat

b015820

adding deformable relative

ArthurZucker pushed a commit that referenced this pull request Sep 25, 2024

Merge pull request #14 from huggingface/fix-gradient-checkpointing

781f0c1

Fixing gradient checkpointing

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fixed typo #14

fixed typo #14

kornosk commented Nov 12, 2018

thomwolf commented Nov 12, 2018

fixed typo #14

fixed typo #14

Conversation

kornosk commented Nov 12, 2018

thomwolf commented Nov 12, 2018