FEAT: Add finetune_depth parameter #471

Merged: 24 commits merged into main on Oct 15, 2024
Conversation

marcopeix (Contributor)

Add the finetune_depth parameter to control how many layers are finetuned.
Adjust the tutorials and capabilities notebooks to cover the new parameter.
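
A minimal usage sketch of the new parameter (illustrative, not code from this PR; the file name and parameter values are assumptions, and an API key is expected in the environment):

```python
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient()  # reads NIXTLA_API_KEY from the environment

# Long-format data with columns unique_id, ds, y (illustrative path)
df = pd.read_csv("air_passengers.csv", parse_dates=["ds"])

# finetune_depth controls how many layers of TimeGPT are finetuned;
# finetune_steps controls how many finetuning iterations are run.
fcst = client.forecast(
    df=df,
    h=12,
    finetune_steps=10,
    finetune_depth=2,
)
```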


github-actions bot commented Sep 10, 2024

Experiment Results

Experiment 1: air-passengers

Description:

| variable      | experiment |
|---------------|------------|
| h             | 12         |
| season_length | 12         |
| freq          | MS         |
| level         | None       |
| n_windows     | 1          |

Results:

| metric     | timegpt-1 | timegpt-1-long-horizon | SeasonalNaive | Naive   |
|------------|-----------|------------------------|---------------|---------|
| mae        | 12.6793   | 11.0623                | 47.8333       | 76      |
| mape       | 0.027     | 0.0232                 | 0.0999        | 0.1425  |
| mse        | 213.936   | 199.132                | 2571.33       | 10604.2 |
| total_time | 1.8765    | 1.8137                 | 0.0055        | 0.004   |

Plot: (image omitted)

Experiment 2: air-passengers

Description:

| variable      | experiment |
|---------------|------------|
| h             | 24         |
| season_length | 12         |
| freq          | MS         |
| level         | None       |
| n_windows     | 1          |

Results:

| metric     | timegpt-1 | timegpt-1-long-horizon | SeasonalNaive | Naive   |
|------------|-----------|------------------------|---------------|---------|
| mae        | 58.1031   | 58.4587                | 71.25         | 115.25  |
| mape       | 0.1257    | 0.1267                 | 0.1552        | 0.2358  |
| mse        | 4040.21   | 4110.79                | 5928.17       | 18859.2 |
| total_time | 0.5784    | 1.0226                 | 0.0045        | 0.004   |

Plot: (image omitted)

Experiment 3: electricity-multiple-series

Description:

| variable      | experiment |
|---------------|------------|
| h             | 24         |
| season_length | 24         |
| freq          | H          |
| level         | None       |
| n_windows     | 1          |

Results:

| metric     | timegpt-1 | timegpt-1-long-horizon | SeasonalNaive | Naive       |
|------------|-----------|------------------------|---------------|-------------|
| mae        | 178.293   | 268.121                | 269.23        | 1331.02     |
| mape       | 0.0234    | 0.0311                 | 0.0304        | 0.1692      |
| mse        | 121588    | 219457                 | 213677        | 4.68961e+06 |
| total_time | 0.5359    | 3.2004                 | 0.0055        | 0.0051      |

Plot: (image omitted)

Experiment 4: electricity-multiple-series

Description:

| variable      | experiment |
|---------------|------------|
| h             | 168        |
| season_length | 24         |
| freq          | H          |
| level         | None       |
| n_windows     | 1          |

Results:

| metric     | timegpt-1 | timegpt-1-long-horizon | SeasonalNaive | Naive       |
|------------|-----------|------------------------|---------------|-------------|
| mae        | 465.532   | 346.984                | 398.956       | 1119.26     |
| mape       | 0.062     | 0.0437                 | 0.0512        | 0.1583      |
| mse        | 835120    | 403787                 | 656723        | 3.17316e+06 |
| total_time | 0.5502    | 1.1355                 | 0.0059        | 0.0053      |

Plot: (image omitted)

Experiment 5: electricity-multiple-series

Description:

| variable      | experiment |
|---------------|------------|
| h             | 336        |
| season_length | 24         |
| freq          | H          |
| level         | None       |
| n_windows     | 1          |

Results:

| metric     | timegpt-1   | timegpt-1-long-horizon | SeasonalNaive | Naive       |
|------------|-------------|------------------------|---------------|-------------|
| mae        | 558.649     | 459.769                | 602.926       | 1340.95     |
| mape       | 0.0697      | 0.0566                 | 0.0787        | 0.17        |
| mse        | 1.22721e+06 | 739135                 | 1.61572e+06   | 6.04619e+06 |
| total_time | 0.6212      | 0.7403                 | 0.006         | 0.0053      |

Plot: (image omitted)

marcopeix marked this pull request as ready for review on September 10, 2024 at 18:37.
elephaint (Contributor) left a comment

Nice! A few comments, and I think we should include a test in nixtla_client.ipynb.

Review threads on: nixtla/nixtla_client.py, nbs/docs/capabilities/forecast/07_finetuning.ipynb, nbs/docs/tutorials/06_finetuning.ipynb, nbs/src/nixtla_client.ipynb
elephaint (Contributor) left a comment

LGTM, I removed the last couple of mentions of layers.

jmoralez (Member) left a comment

Please also add a test to the nbs/docs/tutorials/06_finetuning.ipynb notebook verifying that the loss decreases as the finetune depth increases.

Review threads on: nixtla/nixtla_client.py
elephaint (Contributor)

> Please also add a test to the nbs/docs/tutorials/06_finetuning.ipynb notebook verifying that the loss decreases as the finetune depth increases.

Such a test doesn't always work; it's not always the case that finetuning improves the model, so I'm removing it again.

jmoralez (Member) commented Oct 3, 2024

What changed between now and 7bc1b5d? That one has very different results (nb link). This is supposed to be deterministic, isn't it? I'd expect to be able to reproduce the metrics from that commit every time, especially the monotonic part. Right now, depths 2 and 3 yield the same result, which is highly suspicious.

elephaint (Contributor)

> What changed between now and 7bc1b5d? That one has very different results (nb link). This is supposed to be deterministic, isn't it? I'd expect to be able to reproduce the metrics from that commit every time, especially the monotonic part. Right now, depths 2 and 3 yield the same result, which is highly suspicious.

Nothing really changed; the issue is that it doesn't hold in general that:

  • a higher finetune depth -> better performance
  • more finetune steps -> better performance

I tweaked the parameters so that the results, while not good, are at least monotonic (however, as said before, finetuning isn't guaranteed to provide results that are strictly better when increasing these parameters).

jmoralez (Member) commented Oct 3, 2024

Thanks! So the test would pass now? It'd be great to have loss_depth1 > loss_depth2 > loss_depth3 to detect possible regressions or the parameter not being passed through correctly.

elephaint (Contributor) commented Oct 3, 2024

> Thanks! So the test would pass now? It'd be great to have loss_depth1 > loss_depth2 > loss_depth3 to detect possible regressions or the parameter not being passed through correctly.

No, because:

  • a higher finetune depth does not lead to better performance in general
  • more finetune steps do not lead to better performance in general

so we shouldn't promote that view, either. And a test on that is useless too; if it fails, the results might still be better than before.

I've updated the example to explain this as well, so users see that increasing the depth can also worsen performance and that it's a bit of trial and error (see the sketch below).

The tutorial fails if the parameter isn't passed through correctly, so we're covered there anyway.
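
An illustrative sketch of that trial-and-error approach (not code from this PR; the data path, the holdout split, and the error metric are assumptions, and the forecast column is assumed to be named TimeGPT):

```python
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient()  # reads NIXTLA_API_KEY from the environment
df = pd.read_csv("series.csv", parse_dates=["ds"])  # columns: unique_id, ds, y

h = 24
valid = df.groupby("unique_id").tail(h)  # hold out the last h points per series
train = df.drop(valid.index)

errors = {}
for depth in (1, 2, 3, 4, 5):
    fcst = client.forecast(df=train, h=h, finetune_steps=10, finetune_depth=depth)
    merged = valid.merge(fcst, on=["unique_id", "ds"])
    errors[depth] = (merged["y"] - merged["TimeGPT"]).abs().mean()  # MAE per depth

# Pick the depth with the lowest holdout error; it is often not the largest one.
best_depth = min(errors, key=errors.get)
```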

cchallu (Contributor) previously requested changes on Oct 3, 2024

see notebook

marcopeix requested a review from cchallu on October 8, 2024 at 18:48.
elephaint (Contributor) left a comment

LGTM, I only fixed some typos in the latest revision.

elephaint dismissed cchallu’s stale review on October 15, 2024 at 16:15.

As discussed on Slack

elephaint merged commit 0359bea into main on Oct 15, 2024.
12 checks passed
elephaint deleted the feature/finetune_depth branch on October 15, 2024 at 16:44.