
Question on T5 NLG #187

Open
AtheerAlgherairy opened this issue Jan 22, 2024 · 4 comments

Comments

@AtheerAlgherairy
I have a question regarding the training loss for T5 NLG. If we do not set `metric_for_best_model` to `"bleu"`, as shown in the picture below, is it automatically set to `"loss"`? What is the best practice for training T5 NLG?

[Screenshot: training arguments]

@zqwerty
Member

zqwerty commented Jan 24, 2024

Yes, the default metric is the loss. According to some NLG studies and practices, continuing to train the model after it reaches the lowest validation loss can still improve metrics like BLEU.
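To make the default concrete, here is a minimal sketch (not the repo's actual code) of the checkpoint-selection logic a Hugging Face-style trainer applies: when `metric_for_best_model` is unset it falls back to the validation loss and minimizes it, whereas a metric like `"bleu"` is maximized.

```python
# Minimal sketch of best-checkpoint selection. Assumption: loss-like
# metrics are minimized, everything else is maximized, mirroring the
# greater_is_better convention in transformers' TrainingArguments.
def select_best_checkpoint(eval_history, metric_for_best_model=None,
                           greater_is_better=None):
    """eval_history: one dict per evaluation, e.g. {"loss": 1.7, "bleu": 24.5}.
    Returns the index of the best checkpoint."""
    metric = metric_for_best_model or "loss"
    if greater_is_better is None:
        greater_is_better = metric not in ("loss", "eval_loss")
    key = lambda i: eval_history[i][metric]
    indices = range(len(eval_history))
    return max(indices, key=key) if greater_is_better else min(indices, key=key)

history = [
    {"loss": 2.1, "bleu": 18.0},
    {"loss": 1.7, "bleu": 24.5},  # lowest validation loss
    {"loss": 1.8, "bleu": 26.2},  # best BLEU
]
print(select_best_checkpoint(history))          # 1 (defaults to loss)
print(select_best_checkpoint(history, "bleu"))  # 2
```

Note how the two criteria disagree here: the loss-best checkpoint (index 1) is not the BLEU-best one (index 2), which is exactly why training past the lowest validation loss can still improve BLEU.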

@AtheerAlgherairy
Author

Thanks!

@AtheerAlgherairy
Author

AtheerAlgherairy commented Feb 18, 2024

Hi. I used T5-base for NLG and got the following results:

[Screenshots: evaluation results]

However, I got `'err': 0.5966753105391215`. Any idea how to improve the "err" metric?

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
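As a sanity check on the numbers above, the reported total train batch size follows from the per-device batch size and gradient accumulation (assuming single-GPU training, since no device count is reported):

```python
# Re-deriving total_train_batch_size from the reported hyperparameters.
# Assumption: one GPU, so no multiplication by device count.
train_batch_size = 32
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 64, matching the reported value
```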

Framework versions

  • Transformers 4.24.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.7.1
  • Tokenizers 0.13.2

@zqwerty
Member

zqwerty commented Mar 11, 2024

Sorry for the late reply. Does "err" refer to the slot error rate? Maybe you could try some pre-training.
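For context, a common definition of the slot error rate in NLG evaluation counts slot values from the input dialogue acts that are missing from the generated utterance. The evaluation script here may define "err" differently; this is only an illustrative sketch under that assumption:

```python
# Hedged sketch of one common slot error rate (ERR) variant:
# the fraction of required slot values absent from the generated text.
# (Some definitions also penalize hallucinated values; the exact
# formula used by the evaluation script may differ.)
def slot_error_rate(target_values, generated_text):
    if not target_values:
        return 0.0
    missing = sum(1 for v in target_values if v not in generated_text)
    return missing / len(target_values)

print(slot_error_rate(["cheap", "north"], "a cheap hotel in the north"))  # 0.0
print(slot_error_rate(["cheap", "north"], "a hotel in the north"))        # 0.5
```

Under this reading, an err around 0.6 means most required slot values never surface in the output, so checking whether the model is dropping slots (vs. a tokenization/matching artifact in the metric) would be a reasonable first step.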
