This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

Add option to use scheduled sampling in CopyNet #309

Merged
11 commits merged into allenai:main on Dec 13, 2021

Conversation

@JohnGiorgi (Contributor) commented Nov 22, 2021

This PR adds the ability to use scheduled sampling in CopyNetSeq2Seq by supplying a scheduled_sampling_ratio argument greater than zero. The implementation is essentially copied from SimpleSeq2Seq.

This helps reduce the differences between the SimpleSeq2Seq and CopyNetSeq2Seq model arguments. It is also backwards compatible, with a default value of 0 (no scheduled sampling, i.e. teacher forcing).
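
For reference, a minimal sketch of the per-timestep decision scheduled sampling makes (a hypothetical standalone helper for illustration; the actual change lives in the model's decoder loop):

import torch

def choose_decoder_input(
    training: bool,
    scheduled_sampling_ratio: float,
    gold_tokens: torch.Tensor,       # shape: (batch_size,)
    last_predictions: torch.Tensor,  # shape: (batch_size,)
) -> torch.Tensor:
    # During training, feed the model its own previous predictions with
    # probability scheduled_sampling_ratio; otherwise use the gold
    # (teacher-forced) tokens. A ratio of 0 is plain teacher forcing.
    if training and torch.rand(1).item() < scheduled_sampling_ratio:
        return last_predictions
    return gold_tokens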

@epwalsh (Member) left a comment


Thanks @JohnGiorgi! I think this is a good addition. I just have a couple of suggestions:

  • I think we should have a test
  • I think the default for scheduled_sampling_ratio should be None. And then when it is None we shouldn't call torch.rand(). That way there is no performance penalty for this feature.

@JohnGiorgi (Contributor, Author) commented Dec 10, 2021

@epwalsh Awesome, thanks for the feedback.

  • Updated the if statement so that torch.rand is not called when scheduled_sampling_ratio is falsy. I could set the default to None, but that would make the default value for this parameter different from simple_seq2seq's.
  • Added a simple test that the model can train with 0 < scheduled_sampling_ratio < 1. I wasn't really sure how to write a more significant test due to the randomness. Let me know if you have a different test in mind!

I could also update simple_seq2seq so that torch.rand is not called when scheduled_sampling_ratio is falsy, to save a tiny bit of performance there! It currently calls torch.rand even for the default scheduled_sampling_ratio of 0.0.
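
As an illustration of why the falsy check is enough, Python's "and" short-circuits, so torch.rand is never reached when the ratio is 0.0 (illustrative variable names, not the actual model code):

import torch

scheduled_sampling_ratio = 0.0  # the default: plain teacher forcing

# "and" short-circuits: the right-hand operand is only evaluated when
# the ratio is non-zero, so the default configuration never calls
# torch.rand at all.
use_prediction = (
    scheduled_sampling_ratio > 0.0
    and torch.rand(1).item() < scheduled_sampling_ratio
)
assert use_prediction is False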

# Use gold tokens at test time and at a rate of 1 - _scheduled_sampling_ratio
# during training.
# shape: (batch_size,)
input_choices = last_predictions
@JohnGiorgi (Contributor, Author) commented Dec 10, 2021


Ah, I realized this implementation doesn't work, because last_predictions is never updated. I would have had to take the index of the token with the highest probability for this timestep under the model. Something like:

# torch.max over a dim returns (values, indices); we want the indices
_, last_predictions = torch.max(torch.cat((generation_scores, copy_scores), -1), dim=-1)

@epwalsh does this make sense?

@epwalsh (Member)


Hmm, yeup, good catch. To avoid duplicate computation you could use all_scores from the _get_ll_contrib() method. And note that you will need to take into account this mask. So I suggest returning all_scores and mask from _get_ll_contrib so you can use them here.

@JohnGiorgi (Contributor, Author)


Gotcha. Could I just return log_probs from _get_ll_contrib()? It's computed like: log_probs = util.masked_log_softmax(all_scores, mask).

@epwalsh (Member)


Yes, good point.

@JohnGiorgi (Contributor, Author)


Awesome, just pushed that change.
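
For reference, a rough sketch of the shape of that change, assuming _get_ll_contrib() now also returns log_probs = util.masked_log_softmax(all_scores, mask):

import torch

def update_last_predictions(log_probs: torch.Tensor) -> torch.Tensor:
    # log_probs is assumed to have shape
    # (batch_size, target_vocab_size + source_sequence_length), i.e. the
    # masked log-softmax over the concatenated generation and copy scores.
    # shape: (batch_size,) -- index of the most likely token per example.
    _, last_predictions = log_probs.max(dim=-1)
    return last_predictions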

allennlp_models/generation/models/copynet_seq2seq.py (outdated; resolved)
tests/generation/models/copynet_test.py (resolved)
Comment on lines +49 to +55
def test_model_can_train_with_scheduled_sampling_ratio(self):
    train_model_from_file(
        self.param_file,
        self.TEST_DIR,
        overrides="{'model.scheduled_sampling_ratio':0.5}",
    )

@JohnGiorgi (Contributor, Author) commented Dec 13, 2021


@epwalsh Added the same test for scheduled sampling to simple_seq2seq.

Comment on lines +370 to +374
if (
    self.training
    and self._scheduled_sampling_ratio > 0.0
    and torch.rand(1).item() < self._scheduled_sampling_ratio
):
@JohnGiorgi (Contributor, Author)


@epwalsh Added a similar condition to simple_seq2seq to avoid the call to torch.rand when _scheduled_sampling_ratio is 0.0.

@epwalsh (Member) left a comment


This LGTM! Can you just update the CHANGELOG? Then I think this is good to go.

@JohnGiorgi (Contributor, Author)

Cool! Changelog updated 👍

@epwalsh epwalsh enabled auto-merge (squash) December 13, 2021 17:57
@epwalsh epwalsh merged commit 4866862 into allenai:main Dec 13, 2021