Shift of exogenous data #1254

Merged
merged 19 commits into from
May 10, 2023

Conversation

brsnw250
Collaborator

@brsnw250 brsnw250 commented May 3, 2023

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #1234

@brsnw250 brsnw250 self-assigned this May 3, 2023

codecov-commenter commented May 3, 2023

Codecov Report

Merging #1254 (6d4fcb7) into master (634a5c6) will increase coverage by 0.11%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1254      +/-   ##
==========================================
+ Coverage   87.88%   87.99%   +0.11%     
==========================================
  Files         186      186              
  Lines       10649    10749     +100     
==========================================
+ Hits         9359     9459     +100     
  Misses       1290     1290              

Impacted Files                      Coverage Δ
etna/transforms/__init__.py         100.00% <100.00%> (ø)
etna/transforms/math/__init__.py    100.00% <100.00%> (ø)
etna/transforms/math/lags.py        100.00% <100.00%> (ø)

@Mr-Geekman Mr-Geekman self-requested a review May 4, 2023 07:38
@brsnw250 brsnw250 requested a review from Mr-Geekman May 5, 2023 09:45
The fitted transform instance.
"""
df_exog = ts.df_exog
if df_exog is not None and isinstance(df_exog, pd.DataFrame):
Contributor

Why do we need isinstance check here?

Contributor

Can't we put all the necessary logic of checking isinstance inside of _save_exog_last_date? I think we should also set self._exog_last_date = {} if there are no exogs.

Collaborator Author

This check was needed for the base class. Now it seems irrelevant.
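
The suggestion in this thread, moving the None check into `_save_exog_last_date` and falling back to an empty dict when there are no exogs, might look roughly like the sketch below. The class name `ExogShiftSketch` and the flat-column frame are illustrative assumptions; the actual `df_exog` layout and method body in the PR may differ.

```python
from typing import Dict, Optional

import pandas as pd


class ExogShiftSketch:
    """Hypothetical stand-in for the transform discussed in this thread."""

    def __init__(self) -> None:
        # None means "not fitted"; an empty dict means "fitted, no exogs".
        self._exog_last_date: Optional[Dict[str, pd.Timestamp]] = None

    def _save_exog_last_date(self, df_exog: Optional[pd.DataFrame]) -> None:
        # The None check lives here, so the caller can pass ts.df_exog as is.
        self._exog_last_date = {}
        if df_exog is None:
            return
        for name in df_exog.columns:
            # Last timestamp at which this feature has a non-missing value.
            self._exog_last_date[name] = df_exog[name].last_valid_index()
```

With this shape, `fit` would not need its own `is not None` or `isinstance` guard before calling the helper.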

if not self._auto:
    return self.lag  # type: ignore

freq = pd.infer_freq(df.index)
Contributor

Can't we somehow get freq from ts.freq?

Collaborator Author

@brsnw250 brsnw250 May 5, 2023

We can save this frequency in the fit method and reuse it here.

Contributor

You are also using frequency in transform.

Collaborator Author

To work, this transform needs the same frequency on both fit and transform.
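
One way to read the outcome of this thread is to store the frequency once in `fit` and reuse it in `transform` instead of inferring it again. The sketch below is only an illustration: the class `FreqReuseSketch`, its `lag` argument, and the use of `DataFrame.shift` with a `freq` are assumptions, not the code merged in this PR.

```python
from typing import Optional

import pandas as pd


class FreqReuseSketch:
    """Hypothetical transform that remembers the frequency seen in fit."""

    def __init__(self, lag: int) -> None:
        self.lag = lag
        self._freq: Optional[str] = None

    def fit(self, df: pd.DataFrame) -> "FreqReuseSketch":
        # Inferred once here; a dataset-level frequency could be stored instead.
        self._freq = pd.infer_freq(df.index)
        return self

    def transform(self, df: pd.DataFrame) -> pd.DataFrame:
        if self._freq is None:
            raise ValueError("Transform is not fitted!")
        # Reuse the stored frequency: move the index forward by `lag` periods.
        return df.shift(periods=self.lag, freq=self._freq)
```

Because `transform` relies on the stored value, data passed to it is assumed to have the same frequency as the data seen in `fit`.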

raise ValueError("Transform is not fitted!")

result = df
freq = pd.infer_freq(df.index)
Contributor

Same about freq.

"""
df_exog = ts.df_exog
if df_exog is not None and isinstance(df_exog, pd.DataFrame):
    self._save_exog_last_date(df_exog=df_exog)
Contributor

Why are we passing df_exog here? Can we only shift something that was in df_exog?

Collaborator Author

Basically, the use case of this transform is to shift additional regressors when they are not available all the way down to the horizon. Is this transform necessary in case of other types of features?

Contributor

OK, I think we will save ourselves some trouble if we keep working only with df_exog.

The problem is that, in theory, we can have quantiles in df_exog. It shouldn't be possible, but it can work like this right now...
I think we can ignore this problem in this transform for now. It should be fixed at the level of quantile handling.
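
The use case described in this thread, shifting a regressor that is not known all the way to the end of the horizon, can be illustrated with the sketch below. The function `shift_to_cover_horizon` and its arguments are hypothetical, and the rule for choosing the lag is an assumption made for illustration rather than the logic implemented in the PR.

```python
import pandas as pd


def shift_to_cover_horizon(feature: pd.Series, future_end: pd.Timestamp, freq: str) -> pd.Series:
    """Shift a regressor so that it has values up to `future_end` (hypothetical helper)."""
    # Last timestamp at which the regressor is actually known.
    last_known = feature.last_valid_index()
    # Number of periods by which the regressor falls short of `future_end`.
    lag = max(len(pd.date_range(last_known, future_end, freq=freq)) - 1, 0)
    # Extend the index to the end of the horizon, then push the values forward.
    full_index = pd.date_range(feature.index[0], future_end, freq=freq)
    return feature.reindex(full_index).shift(lag)
```

For a daily regressor known through 2023-01-03 and a horizon ending on 2023-01-05, the computed lag is 2, so the last known value lands on the final horizon date and the first two rows become missing.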

Contributor

You could put logic with checking None inside of self._save_exog_last_date.

@brsnw250 brsnw250 requested a review from Mr-Geekman May 10, 2023 09:28

    feature_names = list(self._exog_last_date.keys())

else:
    feature_names = []
Contributor

Could we really reach this else clause?

Collaborator Author

Yes, for example when the dataset has no exog variables.

Contributor

Can we rewrite it like this?

        if self._exog_last_date is not None:
            feature_names = list(self._exog_last_date.keys())

        else:
            feature_names = []

Because we have a guarantee that if self._exog_last_date isn't None, it is a dict.

Also, it can't be called while self._exog_last_date is None unless it is called directly.

@brsnw250 brsnw250 requested a review from Mr-Geekman May 10, 2023 11:40
@brsnw250 brsnw250 merged commit e1f642f into master May 10, 2023
Development

Successfully merging this pull request may close these issues.

Shift of exogenous data
3 participants