Skip to content

New data access methods in TSDataset #809

Merged
merged 15 commits into from
Jul 27, 2022
Merged

Conversation

alex-hse-repository
Copy link
Collaborator

@alex-hse-repository alex-hse-repository commented Jul 21, 2022

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Proposed Changes

Closing issues

closes #793

@alex-hse-repository alex-hse-repository added the enhancement New feature or request label Jul 21, 2022
@alex-hse-repository alex-hse-repository self-assigned this Jul 21, 2022
@alex-hse-repository alex-hse-repository changed the title Issue 793 New data access methods in TSDataset Jul 21, 2022
@github-actions
Copy link

github-actions bot commented Jul 21, 2022

🚀 Deployed on https://deploy-preview-809--etna-docs.netlify.app

@github-actions github-actions bot temporarily deployed to pull request July 21, 2022 12:33 Inactive
return self.to_flatten(self.df)
if columns is None:
return self.df.copy()
return self.df.loc[:, self.idx[:, columns]].copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should investigate is it a problem to pass ":" here in place of segments.

The problem is described here: #775.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that it is important here, as this method just return the copy of dataframe with the specified columns

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it leads to swapping values of columns (that was a mistake), then you will get a broken dataframe. Ask the author of that bug about it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it deterministic, add comment in #775 to fix this place either

@codecov-commenter
Copy link

codecov-commenter commented Jul 21, 2022

Codecov Report

Merging #809 (cb14b02) into tsdataset_2 (e475d5d) will decrease coverage by 34.58%.
The diff coverage is 36.66%.

@@               Coverage Diff                @@
##           tsdataset_2     #809       +/-   ##
================================================
- Coverage        84.01%   49.43%   -34.59%     
================================================
  Files              125      125               
  Lines             7193     7218       +25     
================================================
- Hits              6043     3568     -2475     
- Misses            1150     3650     +2500     
Impacted Files Coverage Δ
etna/datasets/tsdataset.py 64.32% <36.66%> (-26.38%) ⬇️
etna/commands/__init__.py 0.00% <0.00%> (-100.00%) ⬇️
etna/commands/backtest_command.py 0.00% <0.00%> (-97.06%) ⬇️
etna/commands/forecast_command.py 0.00% <0.00%> (-94.88%) ⬇️
etna/commands/__main__.py 0.00% <0.00%> (-87.50%) ⬇️
etna/commands/resolvers.py 0.00% <0.00%> (-80.00%) ⬇️
etna/analysis/outliers/density_outliers.py 22.44% <0.00%> (-75.52%) ⬇️
etna/datasets/datasets_generation.py 27.02% <0.00%> (-72.98%) ⬇️
etna/transforms/timestamp/time_flags.py 27.02% <0.00%> (-72.98%) ⬇️
etna/transforms/timestamp/fourier.py 28.00% <0.00%> (-72.00%) ⬇️
... and 75 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

"""Return pandas DataFrame with flatten index.

Parameters
----------
df:
DataFrame in ETNA format.

columns:
List of columns to return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we want to check that columns are present? Or we are content with error from pandas itself?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I do not get an error on pandas 1.3.5, anyway error from pandas won't be ambiguous, it will be clear that there is no such column in the dataset

@@ -595,7 +602,9 @@ def to_pandas(self, flatten: bool = False) -> pd.DataFrame:
* If False, return pd.DataFrame with multiindex

* If True, return with flatten index

columns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Don't we want to check that columns are present? Or we are content with error from pandas itself?

Copy link
Contributor

@Mr-Geekman Mr-Geekman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix documentation and look at comments above.

@alex-hse-repository alex-hse-repository linked an issue Jul 22, 2022 that may be closed by this pull request
@martins0n martins0n self-requested a review July 25, 2022 07:47
@github-actions github-actions bot temporarily deployed to pull request July 26, 2022 06:54 Inactive
@github-actions github-actions bot temporarily deployed to pull request July 27, 2022 09:58 Inactive
Copy link
Contributor

@martins0n martins0n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

CHANGELOG.md Outdated Show resolved Hide resolved
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create data access methods in TSDataset
4 participants