feat: added rnn layer and TimeSeries conversion #615

Merged
merged 150 commits into from
May 7, 2024
Commits
167796a
added rnn layer
Gerhardsa0 Apr 9, 2024
7ad84be
fixed linter and code cov
Gerhardsa0 Apr 10, 2024
ec62b5a
style: apply automated linter fixes
megalinter-bot Apr 10, 2024
c8208c7
havin trouble loading time seires properly into nn interface
Gerhardsa0 Apr 10, 2024
fd5fbcf
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 Apr 10, 2024
155fa90
added into dataloader to time_series class
Gerhardsa0 Apr 11, 2024
0e5fe76
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 Apr 11, 2024
2acf28f
added into dataloader to time_series class
Gerhardsa0 Apr 11, 2024
4f34d67
psuhed linter changes
Gerhardsa0 Apr 11, 2024
cfd15f0
psuhed linter changes
Gerhardsa0 Apr 11, 2024
cd0dd43
style: apply automated linter fixes
megalinter-bot Apr 11, 2024
1fc40ab
pushed code coverage
Gerhardsa0 Apr 12, 2024
7a328a7
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 Apr 12, 2024
d821cee
pushed linter changes
Gerhardsa0 Apr 12, 2024
123381d
style: apply automated linter fixes
megalinter-bot Apr 12, 2024
39303f5
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 Apr 15, 2024
6392be8
save changes
Gerhardsa0 Apr 17, 2024
4279c08
style: apply automated linter fixes
megalinter-bot Apr 17, 2024
616e230
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 Apr 19, 2024
3f1ff9b
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 Apr 19, 2024
ce58122
added lstm layer and added TS Conversion
Gerhardsa0 Apr 21, 2024
8c3a88f
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 Apr 23, 2024
f387b51
LSTM workflow runs through
Gerhardsa0 Apr 23, 2024
b218be1
linter changes
Gerhardsa0 Apr 23, 2024
5843674
linter changes
Gerhardsa0 Apr 23, 2024
c80d4f7
linter changes
Gerhardsa0 Apr 23, 2024
5ec464e
style: apply automated linter fixes
megalinter-bot Apr 23, 2024
54c6eef
style: apply automated linter fixes
megalinter-bot Apr 23, 2024
eb1237a
updated code coverage
Gerhardsa0 Apr 23, 2024
eb3d0d4
added tests, because nn only create 0 outputs
Gerhardsa0 Apr 23, 2024
9ff6c38
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 Apr 23, 2024
31a158e
TimeSEries LSTM run works
Gerhardsa0 Apr 24, 2024
0a9a436
updated tests
Gerhardsa0 Apr 25, 2024
3a20254
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 Apr 25, 2024
4463ee3
updated tests
Gerhardsa0 Apr 25, 2024
e65184a
updated tests
Gerhardsa0 Apr 25, 2024
a5f02fc
updated tests
Gerhardsa0 Apr 25, 2024
2300dbd
updated tests
Gerhardsa0 Apr 25, 2024
c89ca4f
style: apply automated linter fixes
megalinter-bot Apr 25, 2024
5ffa2ea
style: apply automated linter fixes
megalinter-bot Apr 25, 2024
0162413
updated tests
Gerhardsa0 Apr 25, 2024
2d082e8
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 Apr 25, 2024
e4b9569
style: apply automated linter fixes
megalinter-bot Apr 25, 2024
9f5edf9
style: apply automated linter fixes
megalinter-bot Apr 25, 2024
7e9afc7
code cob bugged
Gerhardsa0 Apr 25, 2024
af88159
style: apply automated linter fixes
megalinter-bot Apr 25, 2024
005bd86
updated test
Gerhardsa0 Apr 26, 2024
95bbc17
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 Apr 26, 2024
5869c7a
changed Input Conversion so the User does not to specify the features…
Gerhardsa0 Apr 26, 2024
5912219
refactoring
Gerhardsa0 Apr 26, 2024
1526e79
updated Linter
Gerhardsa0 Apr 29, 2024
7cc631f
updated Linter
Gerhardsa0 Apr 29, 2024
2e9be46
updated Linter
Gerhardsa0 Apr 29, 2024
e4c0f0d
updated Linter
Gerhardsa0 Apr 29, 2024
2e675ec
updated Linter
Gerhardsa0 Apr 29, 2024
2aa0757
updated Linter
Gerhardsa0 Apr 29, 2024
aeebcfb
updated Linter
Gerhardsa0 Apr 29, 2024
0188c72
updated Linter
Gerhardsa0 Apr 29, 2024
cce24ce
updated Linter
Gerhardsa0 Apr 29, 2024
2f6ca09
updated Linter
Gerhardsa0 Apr 29, 2024
d5daefc
updated Linter
Gerhardsa0 Apr 29, 2024
38e93f0
updated Linter
Gerhardsa0 Apr 29, 2024
c7fbfb0
style: apply automated linter fixes
megalinter-bot Apr 29, 2024
1bd6716
style: apply automated linter fixes
megalinter-bot Apr 29, 2024
9163f37
activate code cov
Gerhardsa0 Apr 29, 2024
97a3c82
style: apply automated linter fixes
megalinter-bot Apr 29, 2024
a37093c
added code cov
Gerhardsa0 May 2, 2024
fede995
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 2, 2024
ec1310e
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 May 2, 2024
6271d00
refactored complete TimeSeries Class and adapted it to the TimeSeries…
Gerhardsa0 May 2, 2024
35c838c
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 May 2, 2024
9e66cad
refactored complete TimeSeries Class and adapted it to the TimeSeries…
Gerhardsa0 May 2, 2024
6bde123
removed plotting from the timeseries dataset
Gerhardsa0 May 3, 2024
29c0547
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 May 3, 2024
2928f4a
merged and snapshots
Gerhardsa0 May 3, 2024
817ae3d
linter changes
Gerhardsa0 May 3, 2024
5a5d619
linter changes
Gerhardsa0 May 3, 2024
333fb84
linter changes
Gerhardsa0 May 3, 2024
44f7963
linter changes
Gerhardsa0 May 3, 2024
8dea72c
style: apply automated linter fixes
megalinter-bot May 3, 2024
17c8667
style: apply automated linter fixes
megalinter-bot May 3, 2024
1b443d3
linter changes
Gerhardsa0 May 3, 2024
0854fd8
added and removed code cov
Gerhardsa0 May 3, 2024
7b37c58
linter changes
Gerhardsa0 May 3, 2024
5d447d1
style: apply automated linter fixes
megalinter-bot May 3, 2024
27d45c0
code cov
Gerhardsa0 May 3, 2024
60db23b
code cov
Gerhardsa0 May 3, 2024
93bcfe8
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 3, 2024
37ba38b
linter changes
Gerhardsa0 May 3, 2024
b124053
style: apply automated linter fixes
megalinter-bot May 3, 2024
f5c43f2
code cov and moved lag plot into Column
Gerhardsa0 May 5, 2024
d6aa709
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 5, 2024
5918f2c
pushed image for test
Gerhardsa0 May 5, 2024
20d8b20
Merge branch 'main' into 614-feat-add-rnn-layer
Gerhardsa0 May 5, 2024
6dce1e1
style: apply automated linter fixes
megalinter-bot May 5, 2024
e5b1789
Merge branch 'main' into 614-feat-add-rnn-layer
Gerhardsa0 May 5, 2024
8dbc993
Merge branch 'main' into 614-feat-add-rnn-layer
Gerhardsa0 May 6, 2024
79d4494
linter changes
Gerhardsa0 May 6, 2024
2c594d3
style: apply automated linter fixes
megalinter-bot May 6, 2024
518a78f
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 May 6, 2024
77b5cbc
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 6, 2024
cfd445b
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 6, 2024
65e8b77
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 6, 2024
18a811e
linter changes
Gerhardsa0 May 6, 2024
9354a16
Merge branch 'main' of https://github.com/Safe-DS/Library into 614-fe…
Gerhardsa0 May 6, 2024
78984f5
merged
Gerhardsa0 May 6, 2024
60cd980
linter changes
Gerhardsa0 May 6, 2024
7a3409a
linter changes
Gerhardsa0 May 6, 2024
102a45f
linter changes
Gerhardsa0 May 6, 2024
5c57806
linter changes
Gerhardsa0 May 6, 2024
09d2285
style: apply automated linter fixes
megalinter-bot May 6, 2024
782eeb5
linter changes
Gerhardsa0 May 6, 2024
1525765
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 6, 2024
418d033
linter changes
Gerhardsa0 May 6, 2024
cf9e177
style: apply automated linter fixes
megalinter-bot May 6, 2024
6ae2787
style: apply automated linter fixes
megalinter-bot May 6, 2024
725d205
code cov
Gerhardsa0 May 6, 2024
d1c928c
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 6, 2024
272e9f9
style: apply automated linter fixes
megalinter-bot May 6, 2024
b99cd27
l
Gerhardsa0 May 6, 2024
1ceb77e
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 6, 2024
ab397f1
l
Gerhardsa0 May 6, 2024
c110f93
l
Gerhardsa0 May 6, 2024
0a8ff7e
test: re-enabled and changed assertions in cnn-workflow
Marsmaennchen221 May 7, 2024
dc73aa0
style: apply automated linter fixes
megalinter-bot May 7, 2024
38db0c9
dataloader only using torch now
Gerhardsa0 May 7, 2024
e9fd423
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 7, 2024
13c1c4d
Update src/safeds/ml/nn/_lstm_layer.py
Gerhardsa0 May 7, 2024
b45b0c6
Update src/safeds/ml/nn/_input_conversion_time_series.py
Gerhardsa0 May 7, 2024
d7903df
Update src/safeds/data/tabular/containers/_column.py
Gerhardsa0 May 7, 2024
e37456a
Update src/safeds/data/tabular/containers/_column.py
Gerhardsa0 May 7, 2024
a5d7149
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 7, 2024
a6dec43
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 7, 2024
5344a79
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 7, 2024
5a535a6
Update src/safeds/data/labeled/containers/_time_series_dataset.py
Gerhardsa0 May 7, 2024
0cb4cda
requested changes added
Gerhardsa0 May 7, 2024
6588fd7
linter changes
Gerhardsa0 May 7, 2024
1235f50
linter changes
Gerhardsa0 May 7, 2024
3e4409d
linter changes
Gerhardsa0 May 7, 2024
ef3619c
style: apply automated linter fixes
megalinter-bot May 7, 2024
666530a
Merge branch 'main' into 614-feat-add-rnn-layer
Gerhardsa0 May 7, 2024
2f9a8fd
Apply suggestions from code review
Gerhardsa0 May 7, 2024
2a580ae
style: apply automated linter fixes
megalinter-bot May 7, 2024
a5bc3a0
chaged on request
Gerhardsa0 May 7, 2024
9c90bca
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 7, 2024
0f87277
style: apply automated linter fixes
megalinter-bot May 7, 2024
22fa5a4
chaged on request
Gerhardsa0 May 7, 2024
0aa6e96
Merge remote-tracking branch 'origin/614-feat-add-rnn-layer' into 614…
Gerhardsa0 May 7, 2024
feb0a82
style: apply automated linter fixes
megalinter-bot May 7, 2024
e5c10c6
Update src/safeds/ml/nn/_input_conversion_time_series.py
Marsmaennchen221 May 7, 2024
3 changes: 3 additions & 0 deletions src/safeds/data/labeled/containers/__init__.py
@@ -7,16 +7,19 @@
if TYPE_CHECKING:
from ._image_dataset import ImageDataset
from ._tabular_dataset import TabularDataset
from ._time_series_dataset import TimeSeriesDataset

apipkg.initpkg(
__name__,
{
"ImageDataset": "._image_dataset:ImageDataset",
"TabularDataset": "._tabular_dataset:TabularDataset",
"TimeSeriesDataset": "._time_series_dataset:TimeSeriesDataset",
},
)

__all__ = [
"ImageDataset",
"TabularDataset",
"TimeSeriesDataset",
]
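The `TimeSeriesDataset` exported here gives every column exactly one role: target, time, extra, or feature. The role split can be sketched in plain Python as below. This is a minimal illustration, not the library code: `split_roles` is a hypothetical helper name, and it works on plain lists of column names rather than on a `Table`.

```python
def split_roles(column_names, target_name, time_name, extra_names=None):
    """Return the feature column names left after removing target, time and extras.

    Hypothetical helper: it only mirrors the validation and name derivation
    described for the dataset constructor, using plain Python lists.
    """
    extra_names = list(extra_names or [])

    # A column can have only one role.
    if time_name in extra_names:
        raise ValueError(f"Column '{time_name}' cannot be both time and extra.")
    if target_name in extra_names:
        raise ValueError(f"Column '{target_name}' cannot be both target and extra.")

    # Everything that is not target, time or extra is treated as a feature.
    reserved = {target_name, time_name, *extra_names}
    return [name for name in column_names if name not in reserved]


feature_names = split_roles(
    ["id", "feature", "target", "error"],
    target_name="target",
    time_name="id",
    extra_names=["error"],
)
print(feature_names)  # ['feature']
```

An empty result is allowed: a dataset with only a target and a time column has no features.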
328 changes: 328 additions & 0 deletions src/safeds/data/labeled/containers/_time_series_dataset.py
@@ -0,0 +1,328 @@
from __future__ import annotations

import sys
from typing import TYPE_CHECKING

from safeds._utils import _structural_hash
from safeds.data.tabular.containers import Column, Table

if TYPE_CHECKING:
from collections.abc import Mapping, Sequence
from typing import Any

import torch
from torch.utils.data import DataLoader, Dataset


class TimeSeriesDataset:
"""
A time series dataset maps feature and time columns to a target column. Unlike a TabularDataset, a TimeSeriesDataset must contain exactly one target column and one time column, but it may have no feature columns.

Create a time series dataset from a mapping of column names to their values.

Parameters
----------
data:
The data.
target_name:
Name of the target column.
time_name:
Name of the time column.
extra_names:
Names of the columns that are neither features, target, nor time. If None, no extra columns are used, i.e. all
columns except the target and time columns are used as features.

Raises
------
ColumnLengthMismatchError
If columns have different lengths.
ValueError
If the target column is also an extra column.
ValueError
If the time column is also an extra column.

Examples
--------
>>> from safeds.data.labeled.containers import TimeSeriesDataset
>>> dataset = TimeSeriesDataset(
... {"id": [1, 2, 3], "feature": [4, 5, 6], "target": [1, 2, 3], "error": [0, 0, 1]},
... target_name="target",
... time_name="id",
... extra_names=["error"],
... )
"""

# ------------------------------------------------------------------------------------------------------------------
# Dunder methods
# ------------------------------------------------------------------------------------------------------------------
def __init__(
self,
data: Table | Mapping[str, Sequence[Any]],
target_name: str,
time_name: str,
extra_names: list[str] | None = None,
):
# Preprocess inputs
if not isinstance(data, Table):
data = Table(data)
if extra_names is None:
extra_names = []

# Derive feature names
feature_names = [name for name in data.column_names if name not in {target_name, *extra_names, time_name}]

# Validate inputs
if time_name in extra_names:
raise ValueError(f"Column '{time_name}' cannot be both time and extra.")
if target_name in extra_names:
raise ValueError(f"Column '{target_name}' cannot be both target and extra.")
if len(feature_names) == 0:
feature_names = []

# Set attributes
self._table: Table = data
self._features: Table = data.keep_only_columns(feature_names)
self._target: Column = data.get_column(target_name)
self._time: Column = data.get_column(time_name)
self._extras: Table = data.keep_only_columns(extra_names)

def __eq__(self, other: object) -> bool:
"""
Compare two time series datasets.

Returns
-------
equals:
'True' if features, time, target and extras are equal, 'False' otherwise.
"""
if not isinstance(other, TimeSeriesDataset):
return NotImplemented
return (self is other) or (
self.target == other.target
and self.features == other.features
and self.extras == other.extras
and self.time == other.time
)

def __hash__(self) -> int:
"""
Return a deterministic hash value for this time series dataset.

Returns
-------
hash:
The hash value.
"""
return _structural_hash(self.target, self.features, self.extras, self.time)

def __sizeof__(self) -> int:
"""
Return the complete size of this object.

Returns
-------
size:
Size of this object in bytes.
"""
return (
sys.getsizeof(self._target)
+ sys.getsizeof(self._features)
+ sys.getsizeof(self._extras)
+ sys.getsizeof(self._time)
)

# ------------------------------------------------------------------------------------------------------------------
# Properties
# ------------------------------------------------------------------------------------------------------------------

@property
def features(self) -> Table:
"""The feature columns of the time series dataset."""
return self._features

@property
def target(self) -> Column:
"""The target column of the time series dataset."""
return self._target

@property
def time(self) -> Column:
"""The time column of the time series dataset."""
return self._time

@property
def extras(self) -> Table:
"""
Additional columns of the time series dataset that are neither features, target nor time.

These can be used to store additional information about instances, such as IDs.
"""
return self._extras

# ------------------------------------------------------------------------------------------------------------------
# Conversion
# ------------------------------------------------------------------------------------------------------------------

def to_table(self) -> Table:
"""
Return a new `Table` containing the feature columns, the target column, the time column and the extra columns.

The original `TimeSeriesDataset` is not modified.

Returns
-------
table:
A table containing the feature columns, the target column, the time column and the extra columns.
"""
return self._table

def _into_dataloader_with_window(self, window_size: int, forecast_horizon: int, batch_size: int) -> DataLoader:
"""
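Return a DataLoader for the data stored in this time series dataset, used for training neural networks.

The target column and every feature column are split into sliding windows of length `window_size`, which are
concatenated and used as model input. The window starting at index `i` is paired with the target value at index
`i + window_size + forecast_horizon`. The original time series dataset is not modified.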

Parameters
----------
window_size:
The size of the created windows.
forecast_horizon:
The lag between a window and its label: the window starting at index `i` is paired with the target value at index `i + window_size + forecast_horizon`.
batch_size:
The size of data batches that should be loaded at one time.

Raises
------
ValueError
If window_size or forecast_horizon is less than 1, or if the number of rows is less than or equal to forecast_horizon + window_size.

Returns
-------
result:
The DataLoader.
"""
import torch
from torch.utils.data import DataLoader

target_tensor = torch.tensor(self.target._data.values, dtype=torch.float32)

x_s = []
y_s = []

size = target_tensor.size(0)
if window_size < 1:
raise ValueError("window_size must be greater than or equal to 1")
if forecast_horizon < 1:
raise ValueError("forecast_horizon must be greater than or equal to 1")
if size <= forecast_horizon + window_size:
raise ValueError("Cannot create windows: the number of rows must be greater than forecast_horizon + window_size")
# Create feature windows and, for each window, a target lagged by the forecast horizon.
# Every feature column is windowed as well.
# -> [i, win_size], [target]
feature_cols = self.features.to_columns()
for i in range(size - (forecast_horizon + window_size)):
window = target_tensor[i : i + window_size]
label = target_tensor[i + window_size + forecast_horizon]
for col in feature_cols:
data = torch.tensor(col._data.values, dtype=torch.float32)
window = torch.cat((window, data[i : i + window_size]), dim=0)
x_s.append(window)
y_s.append(label)
x_s_tensor = torch.stack(x_s)
y_s_tensor = torch.stack(y_s)
dataset = _create_dataset(x_s_tensor, y_s_tensor)
return DataLoader(dataset=dataset, batch_size=batch_size)

def _into_dataloader_with_window_predict(
self,
window_size: int,
forecast_horizon: int,
batch_size: int,
) -> DataLoader:
"""
Return a DataLoader for the data stored in this time series dataset, used for predicting with neural networks.

The target column and every feature column are split into sliding windows of length `window_size`, which are
concatenated and used as model input. No labels are created. The original time series dataset is not modified.

Parameters
----------
window_size:
The size of the created windows.
forecast_horizon:
The length of the forecast horizon; it determines how many windows are created.
batch_size:
The size of data batches that should be loaded at one time.

Returns
-------
result:
The DataLoader.
"""
import torch
from torch.utils.data import DataLoader

target_tensor = torch.tensor(self.target._data.values, dtype=torch.float32)
x_s = []

size = target_tensor.size(0)
feature_cols = self.features.to_columns()
for i in range(size - (forecast_horizon + window_size)):
window = target_tensor[i : i + window_size]
for col in feature_cols:
data = torch.tensor(col._data.values, dtype=torch.float32)
window = torch.cat((window, data[i : i + window_size]), dim=-1)
x_s.append(window)

x_s_tensor = torch.stack(x_s)

dataset = _create_dataset_predict(x_s_tensor)
return DataLoader(dataset=dataset, batch_size=batch_size)

# ------------------------------------------------------------------------------------------------------------------
# IPython integration
# ------------------------------------------------------------------------------------------------------------------

def _repr_html_(self) -> str:
"""
Return an HTML representation of the time series dataset.

Returns
-------
output:
The generated HTML.
"""
return self._table._repr_html_()


def _create_dataset(features: torch.Tensor, target: torch.Tensor) -> Dataset:
from torch.utils.data import Dataset

class _CustomDataset(Dataset):
def __init__(self, features_dataset: torch.Tensor, target_dataset: torch.Tensor):
self.X = features_dataset
self.Y = target_dataset.unsqueeze(-1)
self.len = self.X.shape[0]

def __getitem__(self, item: int) -> tuple[torch.Tensor, torch.Tensor]:
return self.X[item], self.Y[item]

def __len__(self) -> int:
return self.len

return _CustomDataset(features, target)


def _create_dataset_predict(features: torch.Tensor) -> Dataset:
from torch.utils.data import Dataset

class _CustomDataset(Dataset):
def __init__(self, features: torch.Tensor):
self.X = features
self.len = self.X.shape[0]

def __getitem__(self, item: int) -> torch.Tensor:
return self.X[item]

def __len__(self) -> int:
return self.len

return _CustomDataset(features)
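The index arithmetic of `_into_dataloader_with_window` can be followed more easily on plain lists. The sketch below reproduces the loop bounds and the label index from the diff, but it is only an illustration under assumptions: `make_windows` is a hypothetical name, and Python lists stand in for the torch tensors and the `DataLoader`.

```python
def make_windows(target, feature_columns, window_size, forecast_horizon):
    """Build (window, label) pairs with the same indexing as the training loader.

    Hypothetical helper using plain lists instead of torch tensors.
    """
    if window_size < 1:
        raise ValueError("window_size must be greater than or equal to 1")
    if forecast_horizon < 1:
        raise ValueError("forecast_horizon must be greater than or equal to 1")
    size = len(target)
    if size <= forecast_horizon + window_size:
        raise ValueError("the number of rows must be greater than forecast_horizon + window_size")

    windows, labels = [], []
    for i in range(size - (forecast_horizon + window_size)):
        # Start with the target window, then append the same slice of every feature column.
        window = list(target[i : i + window_size])
        for column in feature_columns:
            window.extend(column[i : i + window_size])
        windows.append(window)
        # The label is the target value at index i + window_size + forecast_horizon.
        labels.append(target[i + window_size + forecast_horizon])
    return windows, labels


target = [10, 11, 12, 13, 14, 15]
feature = [0, 1, 2, 3, 4, 5]
windows, labels = make_windows(target, [feature], window_size=2, forecast_horizon=1)
print(windows)  # [[10, 11, 0, 1], [11, 12, 1, 2], [12, 13, 2, 3]]
print(labels)   # [13, 14, 15]
```

With six rows, a window size of 2 and a forecast horizon of 1, three windows are produced. Each window has `window_size * (1 + number of feature columns)` entries, because the feature slices are concatenated onto the target slice along the same dimension.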
3 changes: 0 additions & 3 deletions src/safeds/data/tabular/containers/__init__.py
@@ -12,7 +12,6 @@
from ._experimental_polars_table import ExperimentalPolarsTable
from ._row import Row
from ._table import Table
from ._time_series import TimeSeries

apipkg.initpkg(
__name__,
@@ -24,7 +23,6 @@
"ExperimentalPolarsTable": "._experimental_polars_table:ExperimentalPolarsTable",
"Row": "._row:Row",
"Table": "._table:Table",
"TimeSeries": "._time_series:TimeSeries",
},
)

@@ -36,5 +34,4 @@
"ExperimentalPolarsTable",
"Row",
"Table",
"TimeSeries",
]