Fix CI with change of name of nlp #7054
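For context, the fix is essentially a rename of the `nlp` library to `datasets` wherever the examples and tests import it. A purely illustrative sketch of the kind of change involved (the dataset name below is an assumption, not taken from the actual diff):

```python
# Before the rename the code imported the library under its old name:
#   import nlp
#   wmt = nlp.load_dataset("wmt16", "ro-en", split="train[:10]")
# After the rename the same call goes through the `datasets` package.
import datasets

wmt = datasets.load_dataset("wmt16", "ro-en", split="train[:10]")
print(wmt[0])
```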
Conversation
Codecov Report
@@ Coverage Diff @@
## master #7054 +/- ##
==========================================
+ Coverage 78.74% 80.85% +2.11%
==========================================
Files 168 168
Lines 32172 32172
==========================================
+ Hits 25335 26014 +679
+ Misses 6837 6158 -679
Continue to review full report at Codecov.
Merging to make the CI green but happy to address any comment in a follow-up PR.
and after
I think you need an install from source.
Not sure what you mean? I did install from source:
in transformers
It's not in the dependencies of transformers and requires a separate install from source for now. It does work on the CI and on my machine:
I did what you suggested, same failures.
Are you sure you are in the same env? nlp was never in
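One quick way to confirm that the interpreter running the tests actually sees the right install (a hedged sketch, not a command from this thread):

```python
# Print which Python interpreter is active and where `datasets` is loaded from,
# to verify the test run and the manual install share the same environment.
import sys

import datasets

print("python interpreter:", sys.executable)
print("datasets version:", datasets.__version__)
print("datasets location:", datasets.__file__)
```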
I have 2 GPUs, you probably don't? Indeed, if I run:
it works.
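Presumably the run only passes when a single device is visible. As a hedged Python illustration (the actual test command is not shown in the thread), visibility can be restricted before CUDA is initialized:

```python
# Illustrative only: hide all but one GPU from the process. The environment
# variable must be set before torch initializes CUDA for it to take effect.
import os

os.environ.setdefault("CUDA_VISIBLE_DEVICES", "0")

import torch  # imported after the env var is set

print("visible CUDA devices:", torch.cuda.device_count())  # at most 1 here
```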
Yup, it's multi-GPU that is the problem. It works if I do
Mmmm, why would multi-GPU not see a new module? That's weird.
Oh, this is a different error, not a missing module. Looks like those tests need a decorator to run on the CPU only.
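A minimal sketch, assuming PyTorch and unittest-style tests, of how such a guard could look; this is not the decorator that was ultimately added to transformers:

```python
# Skip a test when more than one CUDA device is visible, so it effectively
# runs only on CPU or single-GPU machines. Hypothetical helper, for illustration.
import unittest

import torch


def require_cpu_or_single_gpu(test_case):
    """Decorator that skips `test_case` on multi-GPU machines."""
    if torch.cuda.device_count() > 1:
        return unittest.skip("test not supported on multi-GPU machines")(test_case)
    return test_case


class ExampleTests(unittest.TestCase):
    @require_cpu_or_single_gpu
    def test_something_cpu_only(self):
        self.assertEqual(1 + 1, 2)
```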
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
Yes, like I said, you need a separate source install of it. Installing from the dev extras won't give you a properly up-to-date source install, AFAIK. Documented this in #7058
* ready for PR * cleanup * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST * fix * perfectionism * revert change from another PR * odd, already committed this one * non-interactive upload workaround * backup the failed experiment * store langs in config * workaround for localizing model path * doc clean up as in #6956 * style * back out debug mode * document: run_eval.py --num_beams 10 * remove unneeded constant * typo * re-use bart's Attention * re-use EncoderLayer, DecoderLayer from bart * refactor * send to cuda and fp16 * cleanup * revert (moved to another PR) * better error message * document run_eval --num_beams * solve the problem of tokenizer finding the right files when model is local * polish, remove hardcoded config * add a note that the file is autogenerated to avoid losing changes * prep for org change, remove unneeded code * switch to model4.pt, update scores * s/python/bash/ * missing init (but doesn't impact the finetuned model) * cleanup * major refactor (reuse-bart) * new model, new expected weights * cleanup * cleanup * full link * fix model type * merge porting notes * style * cleanup * have to create a DecoderConfig object to handle vocab_size properly * doc fix * add note (not a public class) * parametrize * - add bleu scores integration tests * skip test if sacrebleu is not installed * cache heavy models/tokenizers * some tweaks * remove tokens that aren't used * more purging * simplify code * switch to using decoder_start_token_id * add doc * Revert "major refactor (reuse-bart)" This reverts commit 226dad1. * decouple from bart * remove unused code #1 * remove unused code #2 * remove unused code #3 * update instructions * clean up * move bleu eval to examples * check import only once * move data+gen script into files * reuse via import * take less space * add prepare_seq2seq_batch (auto-tested) * cleanup * recode test to use json instead of yaml * ignore keys not needed * use the new -y in transformers-cli upload -y * [xlm tok] config dict: fix str into int to match definition (#7034) * [s2s] --eval_max_generate_length (#7018) * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * extending to support allen_nlp wmt models - allow a specific checkpoint file to be passed - more arg settings - scripts for allen_nlp models * sync with changes * s/fsmt-wmt/wmt/ in model names * s/fsmt-wmt/wmt/ in model names (p2) * s/fsmt-wmt/wmt/ in model names (p3) * switch to a better checkpoint * typo * make non-optional args such - adjust tests where possible or skip when there is no other choice * consistency * style * adjust header * cards moved (model rename) * use best custom hparams * update info * remove old cards * cleanup * s/stas/facebook/ * update scores * s/allen_nlp/allenai/ * url maps aren't needed * typo * move all the doc / build /eval generators to their own scripts * cleanup * Apply suggestions from code review Co-authored-by: Lysandre Debut <[email protected]> * Apply suggestions from code review Co-authored-by: Lysandre Debut <[email protected]> * fix indent * duplicated line * style * use the correct add_start_docstrings * oops * resizing can't be done with the core approach, due to 2 dicts * check that the arg is a list * style * style Co-authored-by: Sam Shleifer <[email protected]> Co-authored-by: Sylvain Gugger <[email protected]> Co-authored-by: Lysandre Debut <[email protected]>
* WIP flax bert * Initial commit Bert Jax/Flax implementation. * Embeddings working and equivalent to PyTorch. * Move embeddings in its own module BertEmbeddings * Added jax.jit annotation on forward call * BertEncoder on par with PyTorch ! :D * Add BertPooler on par with PyTorch !! * Working Jax+Flax implementation of BertModel with < 1e-5 differences on the last layer. * Fix pooled output to take only the first token of the sequence. * Refactoring to use BertConfig from transformers. * Renamed FXBertModel to FlaxBertModel * Model is now initialized in FlaxBertModel constructor and reused. * WIP JaxPreTrainedModel * Cleaning up the code of FlaxBertModel * Added ability to load Flax model saved through save_pretrained() * Added ability to convert Pytorch Bert model to FlaxBert * FlaxBert can now load every Pytorch Bert model with on-the-fly conversion * Fix hardcoded shape values in conversion scripts. * Improve the way we handle LayerNorm conversion from PyTorch to Flax. * Added positional embeddings as parameter of BertModel with default to np.arange. * Let's roll FlaxRoberta ! * Fix missing position_ids parameters on predict for Bert * Flax backend now supports batched inputs Signed-off-by: Morgan Funtowicz <[email protected]> * Make it possible to load msgpacked model on convert from pytorch in last resort. Signed-off-by: Morgan Funtowicz <[email protected]> * Moved save_pretrained to Jax base class along with more constructor parameters. * Use specialized, model dependent conversion functio. * Expose `is_flax_available` in file_utils. * Added unittest for Flax models. * Added run_tests_flax to the CI. * Introduce FlaxAutoModel * Added more unittests * Flax model reference the _MODEL_ARCHIVE_MAP from PyTorch model. * Addressing review comments. * Expose seed in both Bert and Roberta * Fix typo suggested by @stefan-it Co-Authored-By: Stefan Schweter <[email protected]> * Attempt to make style * Attempt to make style in tests too * Added jax & jaxlib to the flax optional dependencies. * Attempt to fix flake8 warnings ... * Redo black again and again * When black and flake8 fight each other for a space ... 💥 💥 💥 * Try removing trailing comma to make both black and flake happy! * Fix invalid is_<framework>_available call, thanks @LysandreJik 🎉 * Fix another invalid import in flax_roberta test * Bump and pin flax release to 0.1.0. * Make flake8 happy, remove unused jax import * Change the type of the catch for msgpack. * Remove unused import. * Put seed as optional constructor parameter. * trigger ci again * Fix too much parameters in BertAttention. * Formatting. * Simplify Flax unittests to avoid machine crashes. * Fix invalid number of arguments when raising issue for an unknown model. * Address @bastings comment in PR, moving jax.jit decorated outside of __call__ * Fix incorrect path to require_flax/require_pytorch functions. Signed-off-by: Morgan Funtowicz <[email protected]> * Attempt to make style. Signed-off-by: Morgan Funtowicz <[email protected]> * Correct rebasing of circle-ci dependencies Signed-off-by: Morgan Funtowicz <[email protected]> * Fix import sorting. Signed-off-by: Morgan Funtowicz <[email protected]> * Fix unused imports. Signed-off-by: Morgan Funtowicz <[email protected]> * Again import sorting... Signed-off-by: Morgan Funtowicz <[email protected]> * Installing missing nlp dependency for flax unittests. Signed-off-by: Morgan Funtowicz <[email protected]> * Fix laoding of model for Flax implementations. 
Signed-off-by: Morgan Funtowicz <[email protected]> * jit the inner function call to make JAX-compatible Signed-off-by: Morgan Funtowicz <[email protected]> * Format ! Signed-off-by: Morgan Funtowicz <[email protected]> * Flake one more time 🎶 Signed-off-by: Morgan Funtowicz <[email protected]> * Rewrites BERT in Flax to the new Linen API (#7211) * Rewrite Flax HuggingFace PR to Linen * Some fixes * Fix tests * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * Expose `is_flax_available` in file_utils. * Added run_tests_flax to the CI. * Attempt to make style * trigger ci again * Fix import sorting. Signed-off-by: Morgan Funtowicz <[email protected]> * Revert "Rewrites BERT in Flax to the new Linen API (#7211)" This reverts commit 23703a5. * Remove jnp.lax references Signed-off-by: Morgan Funtowicz <[email protected]> * Make style. Signed-off-by: Morgan Funtowicz <[email protected]> * Reintroduce Linen changes ... Signed-off-by: Morgan Funtowicz <[email protected]> * Make style. Signed-off-by: Morgan Funtowicz <[email protected]> * Use jax native's gelu function. Signed-off-by: Morgan Funtowicz <[email protected]> * Renaming BertModel to BertModule to highlight the fact this is the Flax Module object. Signed-off-by: Morgan Funtowicz <[email protected]> * Rewrite FlaxAutoModel test to not rely on pretrained_model_archive_map Signed-off-by: Morgan Funtowicz <[email protected]> * Remove unused variable in BertModule. Signed-off-by: Morgan Funtowicz <[email protected]> * Remove unused variable in BertModule again Signed-off-by: Morgan Funtowicz <[email protected]> * Attempt to have is_flax_available working again. Signed-off-by: Morgan Funtowicz <[email protected]> * Introduce JAX TensorType Signed-off-by: Morgan Funtowicz <[email protected]> * Improve ImportError message when trying to convert to various TensorType format. Signed-off-by: Morgan Funtowicz <[email protected]> * Makes Flax model jittable. Signed-off-by: Morgan Funtowicz <[email protected]> * Ensure flax models are jittable in unittests. Signed-off-by: Morgan Funtowicz <[email protected]> * Remove unused imports. Signed-off-by: Morgan Funtowicz <[email protected]> * Ensure jax imports are guarded behind is_flax_available. Signed-off-by: Morgan Funtowicz <[email protected]> * Make style. Signed-off-by: Morgan Funtowicz <[email protected]> * Make style again Signed-off-by: Morgan Funtowicz <[email protected]> * Make style again again Signed-off-by: Morgan Funtowicz <[email protected]> * Make style again again again Signed-off-by: Morgan Funtowicz <[email protected]> * Update src/transformers/file_utils.py Co-authored-by: Marc van Zee <[email protected]> * Bump flax to it's latest version Co-authored-by: Marc van Zee <[email protected]> * Bump jax version to at least 0.2.0 Signed-off-by: Morgan Funtowicz <[email protected]> * Style. Signed-off-by: Morgan Funtowicz <[email protected]> * Update the unittest to use TensorType.JAX Signed-off-by: Morgan Funtowicz <[email protected]> * isort import in tests. Signed-off-by: Morgan Funtowicz <[email protected]> * Match new flax parameters name "params" Signed-off-by: Morgan Funtowicz <[email protected]> * Remove unused imports. Signed-off-by: Morgan Funtowicz <[email protected]> * Add flax models to transformers __init__ Signed-off-by: Morgan Funtowicz <[email protected]> * Attempt to address all CI related comments. Signed-off-by: Morgan Funtowicz <[email protected]> * Correct circle.yml indent. 
Signed-off-by: Morgan Funtowicz <[email protected]> * Correct circle.yml indent (2) Signed-off-by: Morgan Funtowicz <[email protected]> * Remove coverage from flax tests Signed-off-by: Morgan Funtowicz <[email protected]> * Addressing many naming suggestions from comments Signed-off-by: Morgan Funtowicz <[email protected]> * Simplify for loop logic to interate over layers in FlaxBertLayerCollection Signed-off-by: Morgan Funtowicz <[email protected]> * use f-string syntax for formatting logs. Signed-off-by: Morgan Funtowicz <[email protected]> * Use config property from FlaxPreTrainedModel. Signed-off-by: Morgan Funtowicz <[email protected]> * use "cls_token" instead of "first_token" variable name. Signed-off-by: Morgan Funtowicz <[email protected]> * use "hidden_state" instead of "h" variable name. Signed-off-by: Morgan Funtowicz <[email protected]> * Correct class reference in docstring to link to Flax related modules. Signed-off-by: Morgan Funtowicz <[email protected]> * Added HF + Google Flax team copyright. Signed-off-by: Morgan Funtowicz <[email protected]> * Make Roberta independent from Bert Signed-off-by: Morgan Funtowicz <[email protected]> * Move activation functions to flax_utils. Signed-off-by: Morgan Funtowicz <[email protected]> * Move activation functions to flax_utils for bert. Signed-off-by: Morgan Funtowicz <[email protected]> * Added docstring for BERT Signed-off-by: Morgan Funtowicz <[email protected]> * Update import for Bert and Roberta tokenizers Signed-off-by: Morgan Funtowicz <[email protected]> * Make style. Signed-off-by: Morgan Funtowicz <[email protected]> * fix-copies Signed-off-by: Morgan Funtowicz <[email protected]> * Correct FlaxRobertaLayer to match PyTorch. Signed-off-by: Morgan Funtowicz <[email protected]> * Use the same store_artifact for flax unittest Signed-off-by: Morgan Funtowicz <[email protected]> * Style. Signed-off-by: Morgan Funtowicz <[email protected]> * Make sure gradient are disabled only locally for flax unittest using torch equivalence. Signed-off-by: Morgan Funtowicz <[email protected]> * Use relative imports Signed-off-by: Morgan Funtowicz <[email protected]> Co-authored-by: Stefan Schweter <[email protected]> Co-authored-by: Marc van Zee <[email protected]> Co-authored-by: Sylvain Gugger <[email protected]>
This reverts commit 31ed545.
Fixes #7055 (because yes, I can see the future)