Exclude bias and norm from weight decay #291

Merged: 4 commits, Aug 10, 2022
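For readers unfamiliar with the technique named in the title, here is a minimal sketch of excluding bias and normalization parameters from weight decay. It is not this PR's implementation, which is not visible in the diff below. The function name `split_weight_decay_groups` is hypothetical, and the rule "tensors with at most one dimension are biases or norm scales" is a common heuristic assumed here, not something the PR confirms.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_weight_decay_groups(
    named_params: Iterable[Tuple[str, Any]],
    weight_decay: float,
) -> List[Dict[str, Any]]:
    """Build two optimizer parameter groups: one decayed, one not.

    Hypothetical helper: `named_params` is assumed to yield (name, tensor)
    pairs in the style of PyTorch's `Module.named_parameters()`.
    """
    decay, no_decay = [], []
    for name, param in named_params:
        # Skip frozen parameters when the object exposes `requires_grad`.
        if not getattr(param, "requires_grad", True):
            continue
        # Assumed heuristic: biases and norm scales/shifts are 0-D or 1-D.
        if param.ndim <= 1 or name.endswith(".bias"):
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]
```

With PyTorch, the returned list can be passed directly as the first argument of an optimizer, for example `torch.optim.SGD(split_weight_decay_groups(model.named_parameters(), 1e-4), lr=0.3, momentum=0.9)`, since optimizers accept per-group options.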
2 changes: 1 addition & 1 deletion docs/source/tutorials/offline_linear_eval.rst
@@ -32,7 +32,7 @@ However, in this tutorial, we will simply define all the needed parameters to pe
"precision": 16,
"lars": False,
"lr": 0.1,
"exclude_bias_n_norm": False,
"exclude_bias_n_norm_lars": False,
"gpus": "0",
"weight_decay": 0,
"extra_optimizer_args": {"momentum": 0.9},
4 changes: 2 additions & 2 deletions docs/source/tutorials/overview.rst
@@ -68,7 +68,7 @@ However, for now, we won't rely on this, so let's just define all the needed par
"grad_clip_lars": True,
"weight_decay": 0.00001,
"classifier_lr": 0.5,
"exclude_bias_n_norm": True,
"exclude_bias_n_norm_lars": True,
"accumulate_grad_batches": 1,
"extra_optimizer_args": {"momentum": 0.9},
"scheduler": "warmup_cosine",
@@ -206,7 +206,7 @@ And that's it, we basically replicated a small version of ``main_pretrain.py``.
--lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \
2 changes: 1 addition & 1 deletion scripts/finetune/imagenet-100/mae.sh
@@ -18,7 +18,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format h5 \
--name mae-finetune-eval-imagenet100 \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $2 \
--project solo-learn \
--entity unitn-mhug \
--save_checkpoint \
2 changes: 1 addition & 1 deletion scripts/knn/imagenet-100/knn.sh
@@ -4,7 +4,7 @@ python3 main_knn.py \
--val_data_path /datasets/imagenet-100/val \
--batch_size 16 \
--num_workers 10 \
--pretrained_checkpoint_dir PATH \
--pretrained_checkpoint_dir $1 \
--k 1 2 5 10 20 50 100 200 \
--temperature 0.01 0.02 0.05 0.07 0.1 0.2 0.5 1 \
--feature_type backbone projector \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/barlow_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 4 \
--data_format dali \
--name barlow-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
4 changes: 2 additions & 2 deletions scripts/linear/imagenet-100/byol_linear.sh
@@ -16,9 +16,9 @@ python3 main_linear.py \
--num_workers 4 \
--data_format dali \
--name byol-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
--save_checkpoint \
--auto_resume
--auto_resume
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/deepclusterv2_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 5 \
--data_format dali \
--name deepclusterv2-imagenet100-linear-eval \
--pretrained_feature_extractor PATH --project solo-learn \
--pretrained_feature_extractor $1 --project solo-learn \
--entity unitn-mhug \
--wandb \
--save_checkpoint \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/dino_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 4 \
--data_format dali \
--name dino-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/general_linear.sh
@@ -18,7 +18,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name method-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/mocov2plus_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name mocov2plus-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/mocov3_linear.sh
@@ -18,7 +18,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name mocov3-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/mocov3_vit_linear.sh
@@ -18,7 +18,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name mocov3-vit-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--pretrain_method mocov3 \
--project solo-learn \
--entity unitn-mhug \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/nnclr_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name nnclr-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/ressl_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name ressl-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/simclr_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name simclr-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/simsiam_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 10 \
--data_format dali \
--name simsiam-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/swav_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 5 \
--data_format dali \
--name swav-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/vibcreg_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 5 \
--data_format dali \
--name vibcreg-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet-100/vicreg_linear.sh
@@ -16,7 +16,7 @@ python3 main_linear.py \
--num_workers 5 \
--data_format dali \
--name vicreg-imagenet100-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet/barlow.sh
@@ -17,7 +17,7 @@ python3 main_linear.py \
--batch_size 256 \
--num_workers 10 \
--data_format dali \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--name barlow-resnet50-imagenet-linear-eval \
--entity unitn-mhug \
--project solo-learn \
2 changes: 1 addition & 1 deletion scripts/linear/imagenet/byol.sh
@@ -17,7 +17,7 @@ python3 main_linear.py \
--batch_size 256 \
--num_workers 10 \
--data_format dali \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--name byol-resnet50-imagenet-linear-eval \
--entity unitn-mhug \
--project solo-learn \
4 changes: 2 additions & 2 deletions scripts/linear/imagenet/mocov2plus.sh
@@ -16,9 +16,9 @@ python3 main_linear.py \
--num_workers 5 \
--data_format dali \
--name mocov2plus-imagenet-linear-eval \
--pretrained_feature_extractor PATH \
--pretrained_feature_extractor $1 \
--project solo-learn \
--entity unitn-mhug \
--wandb \
--save_checkpoint \
--auto_resume
--auto_resume
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/barlow.sh
@@ -11,7 +11,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/byol.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 1.0 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/dino.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/mocov3.sh
@@ -9,7 +9,7 @@ python3 main_pretrain.py \
--precision 16 \
--optimizer lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--classifier_lr 0.3 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/nnbyol.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 1.0 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/nnclr.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.4 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/simclr.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.4 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/supcon.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.4 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/vibcreg.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \
2 changes: 1 addition & 1 deletion scripts/pretrain/cifar/vicreg.sh
@@ -10,7 +10,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \
4 changes: 2 additions & 2 deletions scripts/pretrain/custom/byol.sh
@@ -5,7 +5,7 @@
python3 main_pretrain.py \
--dataset custom \
--backbone resnet18 \
--train_data_path PATH_TO_TRAIN_DIR \
--train_data_path $1_TO_TRAIN_DIR \
--no_labels \
--max_epochs 400 \
--devices 0,1 \
@@ -16,7 +16,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 1.0 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/barlow.sh
@@ -13,7 +13,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/byol.sh
@@ -12,7 +12,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.5 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/deepclusterv2.sh
@@ -12,7 +12,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.6 \
--min_lr 0.0006 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/dino.sh
@@ -12,7 +12,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/mocov3.sh
@@ -11,7 +11,7 @@ python3 main_pretrain.py \
--precision 16 \
--optimizer lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--classifier_lr 0.3 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/multicrop/byol.sh
@@ -12,7 +12,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.5 \
--classifier_lr 0.1 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/multicrop/simclr.sh
@@ -12,7 +12,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \
2 changes: 1 addition & 1 deletion scripts/pretrain/imagenet-100/multicrop/supcon.sh
@@ -13,7 +13,7 @@ python3 main_pretrain.py \
--optimizer lars \
--grad_clip_lars \
--eta_lars 0.02 \
--exclude_bias_n_norm \
--exclude_bias_n_norm_lars \
--scheduler warmup_cosine \
--lr 0.3 \
--weight_decay 1e-4 \