
Fix pr 32013 #2

Closed
wants to merge 350 commits into from

Conversation

YeLuoSuiYou

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

amyeroberts and others added 30 commits July 24, 2024 14:03
* relaxed rope check

* lets also accept rope_type=None, defaulting to the original implementation

* type and rope_type can coexist
* let's not warn when someone runs a forward pass without cache + self.training

* more models

* fixup
* Fix float8_e4m3fn in modeling_utils

* style

* fix

* comment
* support gguf fp16

* support gguf bf16 with pytorch

* add gguf f16 test

* remove bf16
* No more default chat templates

* Add the template to the GPT-SW3 tests since it's not available by default now

* Fix GPT2 test

* Fix Bloom test

* Fix Bloom test

* Remove default templates again
…ingface#32198)

Replaced deprecated unittest method with the correct one.
* [whisper] fix short-form output type

* add test

* make style

* update long-form tests

* fixes

* last fix

* finalise test
….7.0 (huggingface#32210)

remove unnecessary guard code related to PyTorch versions 1.4.2 ~ 1.7.0
* fix

* [test_all] trigger full CI

---------

Co-authored-by: ydshieh <[email protected]>
* translate philosophy.md to chinese

* add the missing link
…tility functions. Default to using the currently active microphone on Mac (huggingface#31846)

* use currently active microphone on mac for ffmpeg_microphone

* Allow ffmpeg_microphone device to be specified

Co-authored-by: amyeroberts <[email protected]>

---------

Co-authored-by: amyeroberts <[email protected]>
Fix code snippet for grounding-dino
* fix

* move changes to prompt lookup

* add test

* set eos in assistant model

* style

* fix flakiness

* changes for new `main`

* Update tests/generation/test_utils.py

Co-authored-by: amyeroberts <[email protected]>

* Update tests/generation/test_utils.py

Co-authored-by: amyeroberts <[email protected]>

* add comment to explain

---------

Co-authored-by: amyeroberts <[email protected]>
* fix resize when deepspeed

* deepspeed uses new embeds

* we needed this
…gingface#32143)

* don't log base model architecture in wandb if log_model is false

* Update src/transformers/integrations/integration_utils.py

Co-authored-by: amyeroberts <[email protected]>

* convert log model setting into an enum

* fix formatting

---------

Co-authored-by: amyeroberts <[email protected]>
* Refactored to remove unnecessary object base class.

* small fix.
* adds: extra_repr() to RMSNorm layers in multiple models

* adds: extra_repr for deprecated models as well

* formatting as per style guide
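The extra_repr() additions above make each RMSNorm layer print its configuration when the model is printed. A minimal sketch of the pattern follows; in transformers these are torch.nn.Module subclasses, so a plain class stands in here to keep the example dependency-free, and the attribute names (`hidden_size`, `variance_epsilon`) are assumptions for illustration:

```python
# Sketch of the extra_repr() pattern added to RMSNorm layers.
# A plain class stands in for torch.nn.Module so this runs without torch;
# attribute names are illustrative assumptions, not the exact model code.

class RMSNorm:
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        self.hidden_size = hidden_size
        self.variance_epsilon = eps

    def extra_repr(self) -> str:
        # nn.Module.__repr__ splices this string into "RMSNorm(...)",
        # so printing a model shows the layer's shape and epsilon.
        return f"{self.hidden_size}, eps={self.variance_epsilon}"

print(RMSNorm(4096).extra_repr())  # -> 4096, eps=1e-06
```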
…tection` for owlv2 (huggingface#31934)

* Add check for target_sizes is None in post_process_image_guided_detection

* Make sure Owlvit and Owlv2 in sync

* Fix incorrect indentation; add check for correct size of target_sizes
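The guard described above can be sketched as follows: post-processing only rescales boxes when `target_sizes` is given, and validates that its length matches the batch. The body is a simplified stand-in, not the real OWLv2 implementation:

```python
# Sketch of the target_sizes guard: accept None, and when target_sizes
# is provided, check it has one entry per image. Simplified stand-in
# for the real post_process_image_guided_detection.

def post_process_image_guided_detection(scores, target_sizes=None):
    if target_sizes is not None and len(target_sizes) != len(scores):
        raise ValueError(
            "Make sure that you pass in as many target sizes as images"
        )
    results = []
    for i, score in enumerate(scores):
        # Only rescale to the original image size if one was given.
        size = target_sizes[i] if target_sizes is not None else None
        results.append({"score": score, "target_size": size})
    return results
```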
…n_implementation==flash_attention_2` (huggingface#32039)

* add flash attention check

* fix

* fix
…ngface#32241)

* fix

* fix prev test (half of failures)

* [run-slow] llama, gemma2

* [run-slow] llama, gemma2
* bloom dynamic cache

* bloom follows standard cache format

* no skips for bloom anymore

* use cache position when possible

* clean up

* codestyle

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: amyeroberts <[email protected]>

* pr comments

* isinstance fix

* address comments

* make musicgen test happy

* [run-slow] bloom

---------

Co-authored-by: amyeroberts <[email protected]>
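Moving Bloom to the standard cache format means past key/value states are appended per layer through a shared cache object. A minimal sketch of that dynamic-cache update pattern, with Python lists standing in for torch tensors (the method names mirror transformers' DynamicCache, but the body is an illustrative assumption):

```python
# Minimal sketch of the dynamic KV-cache pattern Bloom now follows.
# Real code concatenates torch tensors per layer; lists stand in here
# so the example runs without torch.

class DynamicCache:
    def __init__(self):
        self.key_cache = {}    # layer_idx -> accumulated key states
        self.value_cache = {}  # layer_idx -> accumulated value states

    def update(self, key_states, value_states, layer_idx):
        # Append the new states and return the full sequence so far.
        self.key_cache.setdefault(layer_idx, []).extend(key_states)
        self.value_cache.setdefault(layer_idx, []).extend(value_states)
        return self.key_cache[layer_idx], self.value_cache[layer_idx]

    def get_seq_length(self, layer_idx=0):
        return len(self.key_cache.get(layer_idx, []))
```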
shubhamugare and others added 29 commits August 22, 2024 15:30
* fix

* >= 0.3.0

---------

Co-authored-by: ydshieh <[email protected]>
Do not call torch.repeat_interleave if expand_size is 1
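The change above skips the expansion entirely when it would be a no-op, avoiding an unnecessary tensor copy. The guard can be sketched in pure Python, with list repetition standing in for `torch.repeat_interleave` along dim 0:

```python
# Sketch of the guard described above: expanding inputs for multi-
# sequence generation is a no-op when expand_size == 1, so the copy
# is skipped entirely. Lists stand in for torch tensors.

def expand_inputs_for_generation(input_ids, expand_size=1):
    if expand_size == 1:
        # Return the input untouched instead of calling the (costly)
        # repeat, mirroring the PR's optimization.
        return input_ids
    # Repeat each row expand_size times, like torch.repeat_interleave.
    return [row for row in input_ids for _ in range(expand_size)]

batch = [[1, 2], [3, 4]]
assert expand_inputs_for_generation(batch, 1) is batch  # same object, no copy
```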
…e#32908)

* add chat_template to gguf tokenizer

* add template through tokenizer config
…ainer` with `eval_on_start=True` in Jupyter Notebook. (huggingface#32849)

fix: `AttributeError` raised when using `Trainer` with `eval_on_start=True` in Jupyter Notebook.
…on.md to Korean" (huggingface#32334)

* docs: ko: tasks/knowledge_distillation_for_image_classification.md

* feat: nmt draft

* fix: manual edits

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <[email protected]>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <[email protected]>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <[email protected]>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <[email protected]>

* Apply suggestions from code review

Co-authored-by: Ahnjj_DEV <[email protected]>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <[email protected]>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <[email protected]>

* Apply suggestions from code review

Co-authored-by: Chulhwa (Evan) Han <[email protected]>

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Chulhwa (Evan) Han <[email protected]>
Co-authored-by: Ahnjj_DEV <[email protected]>
…e() (huggingface#31292)

* Add .float() in all generation methods logit outputs

* Switch float-casting of logits to training only for main models

* Add `num_logits_to_keep` in Llama and add it by default in generate

* Apply style

* Add num_logits_to_keep as arg in prepare_input_for_generation

* Add support for Mistral

* Revert models except llama and mistral

* Fix default None value in _supports_num_logits_to_keep()

* Fix dimension of dummy input

* Add exception for prophetnet in _supports_num_logits_to_keep()

* Update _supports_num_logits_to_keep() to use inspect.signature()

* Add deprecation cycle + remove modification with pretraining_tp

* Apply style

* Add most used models

* Apply style

* Make `num_logits_to_keep` an int in all cases to remove if-else clause

* Add compile check for the warning

* Fix torch versions

* style

* Add gemma2

* Update warning version

* Add comment about .float operations in generation utils

* Add tests in GenerationTesterMixin and ModelTesterMixin

* Fix batch size for assisted decoding in tests

* fix small issues in test

* refactor test

* fix slicing removing dim issue

* Add nemotron support (should fix check-copy issue in CIs)

* Trigger new CIs

* Trigger new CIs

* Bump version

* Bump version in TODO

* Trigger CIs

* remove blank space

* Trigger CIs
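The `num_logits_to_keep` idea above can be sketched simply: during generation only the last positions' logits are needed, so the LM head is applied to a slice of the hidden states rather than the full sequence. Lists stand in for tensors and `lm_head` is a toy stand-in:

```python
# Sketch of num_logits_to_keep: apply the LM head only to the last
# positions during generation instead of the whole sequence.
# Lists stand in for hidden-state tensors.

def compute_logits(hidden_states, lm_head, num_logits_to_keep=0):
    # num_logits_to_keep == 0 means "keep everything" (training),
    # matching the int-only convention the PR settles on.
    if num_logits_to_keep:
        hidden_states = hidden_states[-num_logits_to_keep:]
    return [lm_head(h) for h in hidden_states]

double = lambda h: 2 * h
assert compute_logits([1, 2, 3], double) == [2, 4, 6]          # full sequence
assert compute_logits([1, 2, 3], double, num_logits_to_keep=1) == [6]  # last token only
```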
…eprecations in `generate`-related code 🧹 (huggingface#32659)

Co-authored-by: amyeroberts <[email protected]>
…uggingface#32860)

* add liger integration

* fix syntax

* fix import issue

* add trainer.md

* Use _apply_liger_kernel()

* Fixed log message

* Update docs/source/en/trainer.md

Co-authored-by: Marc Sun <[email protected]>

* Update docs/source/en/trainer.md

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/training_args.py

Co-authored-by: Byron Hsu <[email protected]>

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/training_args.py

Co-authored-by: Byron Hsu <[email protected]>

* Update docs/source/en/trainer.md

Co-authored-by: Byron Hsu <[email protected]>

* Fixed checkstyle and updated readme

* Added test

* Fixed checkstyle

* fix docstring

* rename use_liger to use_liger_kernel

* Trigger Build

* Added test

* add fix-copies

* Fixed copy inconsistencies

---------

Co-authored-by: shimizust <[email protected]>
Co-authored-by: Steven Shimizu <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Byron Hsu <[email protected]>
…ce#32684)

* Add new Jinja features:

- Do extension
- Break/continue in loops
- Call strftime to get current datetime in any format

* Add new Jinja features:

- Do extension
- Break/continue in loops
- Call strftime to get current datetime in any format

* Fix strftime template

* Add template strip() just to be safe

* Remove the do extension to make porting easier, and also because it's the least useful

* Rename test

* strftime -> strftime_now

* Split test

* Update test to use strftime_now

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils

* Refactor everything out into chat_template_utils
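The `strftime_now` helper named above lets a chat template render the current date, e.g. `{{ strftime_now("%d %b %Y") }}`. It can be sketched as a thin wrapper over `datetime.now().strftime`; the exact signature and the wiring into the Jinja environment are assumptions, only the helper itself is shown:

```python
# Sketch of the strftime_now callable exposed to Jinja chat templates.
# The wiring into the Jinja environment is omitted; the exact signature
# in transformers is an assumption for illustration.

from datetime import datetime

def strftime_now(fmt: str) -> str:
    # Format the current local time with a standard strftime string.
    return datetime.now().strftime(fmt)

year = strftime_now("%Y")
assert len(year) == 4 and year.isdigit()
```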
huggingface#32910)

* Update modeling_deformable_detr.py

* Update src/transformers/models/deformable_detr/modeling_deformable_detr.py

Co-authored-by: amyeroberts <[email protected]>

* Update ms_deform_attn_cuda.cu

* Update modeling_deformable_detr.py

* Update modeling_deformable_detr.py

* [empty] this is an empty commit

---------

Co-authored-by: amyeroberts <[email protected]>
* added docstring to SchedulerType class

* Remove trailing whitespace in src/transformers/trainer_utils.py

Co-authored-by: Steven Liu <[email protected]>

* fixup

---------

Co-authored-by: Steven Liu <[email protected]>
@YeLuoSuiYou closed this Sep 3, 2024