
[FIX] SWA and SE with non cyclic schedulers #395

Merged · 4 commits · Mar 9, 2022
Conversation

ravinkohli (Contributor)

This PR fixes a bug in SWA and SE when they are used with non-cyclic schedulers. It also enables learned entity embeddings for reg_cocktails.

@ravinkohli ravinkohli changed the base branch from master to reg_cocktails February 28, 2022 14:32
@ravinkohli ravinkohli changed the title Fix swa se [FIX] SWA and SE with non cyclic schedulers Feb 28, 2022
codecov bot commented Feb 28, 2022

Codecov Report

Merging #395 (989f3ac) into reg_cocktails (484ead4) will increase coverage by 0.56%.
The diff coverage is 79.31%.

❗ Current head 989f3ac differs from pull request most recent head 0cc6b46. Consider uploading reports for the commit 0cc6b46 to get more accurate results


@@                Coverage Diff                @@
##           reg_cocktails     #395      +/-   ##
=================================================
+ Coverage          82.89%   83.46%   +0.56%     
=================================================
  Files                172      172              
  Lines              10336    10369      +33     
  Branches            1815     1827      +12     
=================================================
+ Hits                8568     8654      +86     
+ Misses              1244     1184      -60     
- Partials             524      531       +7     
Impacted Files Coverage Δ
...ine/components/setup/network_embedding/__init__.py 65.95% <63.63%> (-1.17%) ⬇️
autoPyTorch/pipeline/tabular_regression.py 82.56% <75.00%> (+11.28%) ⬆️
autoPyTorch/pipeline/tabular_classification.py 82.17% <87.50%> (+9.44%) ⬆️
...peline/components/training/trainer/base_trainer.py 95.58% <95.00%> (-0.07%) ⬇️
...eline/components/setup/optimizer/base_optimizer.py 80.95% <0.00%> (-19.05%) ⬇️
...abular_preprocessing/base_tabular_preprocessing.py 66.66% <0.00%> (-13.34%) ⬇️
...ne/components/setup/lr_scheduler/base_scheduler.py 93.10% <0.00%> (-6.90%) ⬇️
...up/network_initializer/base_network_initializer.py 93.54% <0.00%> (-6.46%) ⬇️
...nts/setup/early_preprocessor/EarlyPreprocessing.py 85.29% <0.00%> (-5.89%) ⬇️
.../pipeline/components/setup/network/base_network.py 89.87% <0.00%> (-5.07%) ⬇️
... and 14 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 484ead4...0cc6b46.

nabenabe0928 (Contributor) left a comment:
Hi, thanks for the PR.
I checked everything except base_trainer.

Comment on lines 298 to 299
exclude=exclude if bool(exclude) else None,
include=include if bool(include) else None)

Do we need to use if statements here?

ravinkohli (Author):

Yes, because our get_available_components logic checks whether include or exclude are None. That said, I don't think we need get_available_components here, as I only need the string names. I'll fix this.

Comment on lines 292 to 313
    # Disable CyclicLR until todo is completed.
    if 'lr_scheduler' in self.named_steps.keys() and 'trainer' in self.named_steps.keys():
        trainers = cs.get_hyperparameter('trainer:__choice__').choices
        for trainer in trainers:
            available_schedulers = self.named_steps['lr_scheduler'].get_available_components(
                dataset_properties=dataset_properties,
                exclude=exclude if bool(exclude) else None,
                include=include if bool(include) else None)
            # TODO: update cyclic lr to use n_restarts and adjust according to batch size
            cyclic_lr_name = 'CyclicLR'
            if cyclic_lr_name in available_schedulers:
                # disable snapshot ensembles and stochastic weight averaging
                cs.add_forbidden_clause(ForbiddenAndConjunction(
                    ForbiddenEqualsClause(cs.get_hyperparameter(
                        f'trainer:{trainer}:use_snapshot_ensemble'), True),
                    ForbiddenEqualsClause(cs.get_hyperparameter('lr_scheduler:__choice__'), cyclic_lr_name)
                ))
                cs.add_forbidden_clause(ForbiddenAndConjunction(
                    ForbiddenEqualsClause(cs.get_hyperparameter(
                        f'trainer:{trainer}:use_stochastic_weight_averaging'), True),
                    ForbiddenEqualsClause(cs.get_hyperparameter('lr_scheduler:__choice__'), cyclic_lr_name)
                ))

Can we make a function for this, since we do the same thing in both tabular_{classification, regression}? This method is also long.

ravinkohli (Author):

Sure.

Comment on lines +161 to +190
    updates = self._get_search_space_updates()
    if '__choice__' in updates.keys():
        choice_hyperparameter = updates['__choice__']
        if not set(choice_hyperparameter.value_range).issubset(available_embedding):
            raise ValueError("Expected given update for {} to have "
                             "choices in {} got {}".format(self.__class__.__name__,
                                                           available_embedding,
                                                           choice_hyperparameter.value_range))
        if len(categorical_columns) == 0:
            assert len(choice_hyperparameter.value_range) == 1
            if 'NoEmbedding' not in choice_hyperparameter.value_range:
                raise ValueError("Provided {} in choices, however, the dataset "
                                 "is incompatible with it".format(choice_hyperparameter.value_range))
        embedding = CSH.CategoricalHyperparameter('__choice__',
                                                  choice_hyperparameter.value_range,
                                                  default_value=choice_hyperparameter.default_value)
    else:
        if len(categorical_columns) == 0:
            default = 'NoEmbedding'
            if include is not None and default not in include:
                raise ValueError("Provided {} in include, however, the dataset "
                                 "is incompatible with it".format(include))
            embedding = CSH.CategoricalHyperparameter('__choice__',
                                                      ['NoEmbedding'],
                                                      default_value=default)
        else:
            embedding = CSH.CategoricalHyperparameter('__choice__',
                                                      list(available_embedding.keys()),
                                                      default_value=default)

Can we create a method for this?

    def _get_embedding_config(self, updates, avail_embeddings, default: str = "NoEmbedding"):
        if '__choice__' in updates.keys():
            choices, default = updates['__choice__'].value_range, updates['__choice__'].default_value
            if not set(choices).issubset(avail_embeddings):
                raise ValueError(
                    f"The choices for {self.__class__.__name__} must be in {avail_embeddings}, but got {choices}"
                )
            if len(categorical_columns) == 0:
                assert len(choices) == 1
                if 'NoEmbedding' not in choices:
                    raise ValueError(f"The choices must include `NoEmbedding`, but got {choices}")

            embedding = CSH.CategoricalHyperparameter('__choice__', choices, default_value=default)
        elif len(categorical_columns) == 0:
            default = 'NoEmbedding'
            if include is not None and default not in include:
                raise ValueError(f"default `{default}` must be in `include`: {include}")

            embedding = CSH.CategoricalHyperparameter('__choice__', ['NoEmbedding'], default_value=default)
        else:
            choices = list(available_embedding.keys())
            embedding = CSH.CategoricalHyperparameter('__choice__', choices, default_value=default)

        return embedding

ravinkohli (Author) commented Mar 2, 2022:

Let me do that after we are done with the experiments. There is an issue raised for this: #377.

Comment on lines -374 to -375
model_copy = deepcopy(self.swa_model) if self.use_stochastic_weight_averaging \
else deepcopy(self.model)

Is it ok to add the condition is_last_epoch in the new implementation?

ravinkohli (Author):

What do you mean?


He caught a bug: the initial implementation of SE and SWA was wrong when both were activated. That was fixed, but not for the case of a non-cyclic scheduler. Now that you have fixed it, the behaviour differs from the previous implementation: since there is no is_last_epoch check that adds the SWA model only at the end, three copies of it are added throughout training.


Basically, it is now ok and correct.
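The fixed behaviour described above can be illustrated with a small, hypothetical sketch: with a non-cyclic scheduler there is no cycle boundary at which to take a snapshot, so the trainer keeps a rolling buffer of the most recent model copies rather than a single end-of-training copy. The function and field names here are invented for illustration, not AutoPyTorch's actual API:

```python
# Hypothetical sketch of snapshot ensembling under a non-cyclic scheduler:
# keep a rolling buffer of the most recent model copies across epochs.
from copy import deepcopy

def update_snapshots(snapshots, model, max_snapshots=3):
    """Append a copy of the current model and keep only the newest max_snapshots."""
    snapshots.append(deepcopy(model))
    if len(snapshots) > max_snapshots:
        snapshots.pop(0)
    return snapshots

snapshots = []
for epoch in range(10):
    model = {'epoch': epoch, 'weights': [epoch * 0.1]}  # stand-in for a trained model
    update_snapshots(snapshots, model)

# Only the last three epochs survive in the ensemble buffer.
assert [s['epoch'] for s in snapshots] == [7, 8, 9]
```

The point of the discussion above is exactly this: the buffer is filled throughout training rather than populated once inside an is_last_epoch branch.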

ArlindKadra left a comment:

I had one small question; apart from that, everything looks good, especially the refactoring to remove the code duplication.

))
break
except ValueError:
# change the default and try again


Why would we get an error here, and why do we need to change the default?
It should not be the case that we have a default value that is not in the list of choices, right?

ravinkohli (Author):

We will get a ValueError in case LearnedEntityEmbedding was the default value for the __choice__ hyperparameter.
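The "change the default and try again" loop under discussion can be paraphrased with a hypothetical helper: try the preferred default first, and if it raises a ValueError because it is not among the valid choices, fall back to the next candidate. All names here are invented for illustration:

```python
# Hypothetical paraphrase of the "change the default and try again" retry loop.
def pick_default(candidates, choices):
    """Return the first candidate default that is a valid choice."""
    for default in candidates:
        try:
            if default not in choices:
                raise ValueError(f"default {default!r} not in choices {choices}")
            return default
        except ValueError:
            continue  # change the default and try again
    raise ValueError(f"no candidate default is valid for {choices}")

# LearnedEntityEmbedding is preferred but unavailable (e.g. the dataset has no
# categorical columns), so the loop falls back to NoEmbedding.
assert pick_default(['LearnedEntityEmbedding', 'NoEmbedding'], ['NoEmbedding']) == 'NoEmbedding'
```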

@ravinkohli ravinkohli merged commit ce89f92 into reg_cocktails Mar 9, 2022
ravinkohli added a commit that referenced this pull request Mar 9, 2022
* Enable learned embeddings, fix bug with non cyclic schedulers

* add forbidden condition cyclic lr

* refactor base_pipeline forbidden conditions

* Apply suggestions from code review

Co-authored-by: nabenabe0928 <[email protected]>
@ravinkohli ravinkohli deleted the fix_swa_se branch June 15, 2022 13:15
ravinkohli added a commit that referenced this pull request Jul 26, 2022
(same commit message as above)