Make trend transforms work with NaNs #456

alex-hse-repository · 2022-01-17T14:03:37Z

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

Did you read the contribution guide?
Did you update the docs? We use Numpy format for all the methods and classes.
Did you write any new necessary tests?
Did you update the CHANGELOG?

Type of Change

Examples / docs / tutorials / contributors update
Bug fix (non-breaking change which fixes an issue)
Improvement (non-breaking change which improves an existing feature)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

Related Issue

Closing issues

closes #417

codecov-commenter · 2022-01-18T05:30:23Z

Codecov Report

Merging #456 (09d1bec) into master (8b93c44) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #456      +/-   ##
==========================================
- Coverage   87.98%   87.97%   -0.01%     
==========================================
  Files         115      115              
  Lines        5435     5440       +5     
==========================================
+ Hits         4782     4786       +4     
- Misses        653      654       +1
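
The header line reports a 0.00% decrease while the diff block shows -0.01%. Both can be checked from the hits and lines listed in the diff. The sketch below assumes, without confirmation from the report itself, that displayed percentages are truncated to two decimals rather than rounded; `truncate_2` is a hypothetical helper introduced only for this check.

```python
import math


def truncate_2(value):
    """Hypothetical helper: cut a percentage to two decimals without rounding."""
    return math.floor(value * 100) / 100


# Hits and lines copied from the coverage diff above.
before_hits, before_lines = 4782, 5435
after_hits, after_lines = 4786, 5440

before = 100 * before_hits / before_lines
after = 100 * after_hits / after_lines
delta = after - before

# Misses are simply lines minus hits.
print(before_lines - before_hits, after_lines - after_hits)  # 653 654

# Assumption: the report shows truncated values.
print(truncate_2(before), truncate_2(after))  # 87.98 87.97

# The exact change is well below 0.01 percentage points in magnitude,
# which is consistent with a headline of 0.00%.
print(delta < 0, abs(delta) < 0.01)  # True True
```

Under that assumption, the -0.01% in the diff is the difference of the two displayed values, while the 0.00% in the headline reflects the much smaller exact change.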

Impacted Files	Coverage Δ
...na/transforms/decomposition/change_points_trend.py	`99.06% <100.00%> (+0.01%)`	⬆️
etna/transforms/decomposition/detrend.py	`98.33% <100.00%> (ø)`
etna/transforms/decomposition/stl.py	`94.28% <100.00%> (+0.25%)`	⬆️
etna/transforms/decomposition/trend.py	`100.00% <100.00%> (ø)`
etna/datasets/tsdataset.py	`89.52% <0.00%> (-0.34%)`	⬇️

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b93c44...09d1bec. Read the comment docs.

martins0n

It seems OK.

However, something like this still does not work, and we should think about it:

import numpy as np
import pandas as pd

from etna.datasets import TSDataset
from etna.models import NaiveModel
from etna.pipeline import Pipeline
from etna.transforms import LinearTrendTransform, TimeSeriesImputerTransform


def make_example_df():
    # Two hourly segments in long format: a linear ramp and the square root of a ramp with a cosine ripple.
    df1 = pd.DataFrame()
    df1["timestamp"] = pd.date_range(start="2020-01-01", end="2020-02-01", freq="H")
    df1["segment"] = "segment_1"
    df1["target"] = np.arange(len(df1))  # noise term left out: + 2 * np.random.normal(size=len(df1))

    df2 = pd.DataFrame()
    df2["timestamp"] = pd.date_range(start="2020-01-01", end="2020-02-01", freq="H")
    df2["segment"] = "segment_2"
    df2["target"] = np.sqrt(np.arange(len(df2)) + 2 * np.cos(np.arange(len(df2))))

    return pd.concat([df1, df2], ignore_index=True)


def add_nans_in_tails(example_df):
    # Convert to the wide format, then blank out the head and tail of segment_1.
    df = TSDataset.to_dataset(example_df)
    df.loc[:4, pd.IndexSlice["segment_1", "target"]] = None
    df.loc[-3:, pd.IndexSlice["segment_1", "target"]] = None
    return df


example_df = make_example_df()
df_with_nans_in_tails = add_nans_in_tails(example_df)

# The trend transform is applied before the imputer.
pipeline = Pipeline(
    model=NaiveModel(),
    transforms=[LinearTrendTransform("target"), TimeSeriesImputerTransform()],
    horizon=5,
)

# Reported as still not working:
pipeline.fit(TSDataset(df_with_nans_in_tails, freq="1H"))
print(pipeline.forecast())

martins0n · 2022-01-18T13:10:33Z

etna/transforms/decomposition/change_points_trend.py

-        series = df.loc[df[self.in_column].first_valid_index() :, self.in_column]
+        series = df.loc[df[self.in_column].first_valid_index() : df[self.in_column].last_valid_index(), self.in_column]
+        if series.isnull().values.any():
+            raise ValueError("The input column contains NaNs in the middle of the series! Try to use the imputer.")


Before this PR, did it work with nulls in the middle of a TSDataset?
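
For reference, the trimming and check in the hunk above can be reproduced with plain pandas. This is only a sketch of that behavior on a single series, not the transform's actual code path: `trim_and_check` is a hypothetical helper, and the toy daily series are made up for illustration.

```python
import numpy as np
import pandas as pd


def trim_and_check(series: pd.Series) -> pd.Series:
    """Hypothetical helper mirroring the hunk above: drop NaN tails, reject inner NaNs."""
    trimmed = series.loc[series.first_valid_index() : series.last_valid_index()]
    if trimmed.isnull().values.any():
        raise ValueError("The input column contains NaNs in the middle of the series! Try to use the imputer.")
    return trimmed


index = pd.date_range("2020-01-01", periods=10, freq="D")

# NaNs only at the head and tail: trimming removes them and the check passes.
tails = pd.Series(np.arange(10, dtype=float), index=index)
tails.iloc[:2] = np.nan
tails.iloc[-3:] = np.nan
print(len(trim_and_check(tails)))  # 5

# A NaN between valid values survives trimming, so the check raises.
middle = pd.Series(np.arange(10, dtype=float), index=index)
middle.iloc[4] = np.nan
try:
    trim_and_check(middle)
except ValueError as error:
    print(error)
```

With the old slice, which started at `first_valid_index()` but had no upper bound, trailing NaNs would have stayed in `series`.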

alex-hse-repository added 6 commits January 17, 2022 12:14

Fix detrend

da9bdb0

Fix stl

9ae5837

Fix binseg

38e6ee2

Fix trend

0c99493

Add nan fixtures

be64f7e

Fix linting

f281616

alex-hse-repository added the enhancement label Jan 17, 2022

alex-hse-repository self-assigned this Jan 17, 2022

alex-hse-repository marked this pull request as draft January 17, 2022 14:04

alex-hse-repository added 3 commits January 18, 2022 08:19

Fix tests

c64ebcc

Merge branch 'master' of https://github.com/tinkoff-ai/etna-ts into i…

3d2edaa

…ssue-417

Update Changelog

1c5bc36

alex-hse-repository marked this pull request as ready for review January 18, 2022 05:39

martins0n self-requested a review January 18, 2022 07:40

martins0n approved these changes Jan 18, 2022

Merge branch 'master' of https://github.com/tinkoff-ai/etna-ts into i…

09d1bec

…ssue-417

martins0n merged commit 75fd188 into master Jan 20, 2022

martins0n deleted the issue-417 branch January 20, 2022 07:38
