Explore the option to develop a more general testbed than the OpenSTEF backtesting pipeline #17

Open
MartijnCa opened this issue Mar 23, 2023 · 4 comments

@MartijnCa
Contributor

MartijnCa commented Mar 23, 2023

As an AIFES researcher I want to be able to have a testbed setup which allows me to easily and quickly iterate over different forecasting pipelines*, so I can compare many forecasting pipelines* and discover which one performs best.

*We define the concept forecasting pipeline as: a combination of feature engineering steps and a(n) (ensemble of) predictive (timeseries) model(s)

Context:

  • Current testbed based on the OpenSTEF backtesting pipeline: https://github.com/alliander-opensource/AIFES/blob/8477a589c2dafc3f1d5684996fd5f343c59b0490/project/00.Evaluate_performance_using_Backtest_Pipeline.ipynb
  • Under the hood of the OpenSTEF backtesting pipeline:
    • OpenSTEF feature engineering is designed so that the models can treat the resulting data as cross-sectional data.
      • This design choice leads to complications when considering statistical timeseries models (e.g. ARIMA or (G)ARCH).
    • OpenSTEF requires feature engineering to be done before the train/test split.
      • This complicates experimenting with different feature engineering strategies. During development, extra care is needed to ensure that no test-set information leaks into the training set, because the absence of leakage is hard to verify without diving deep into the code.
    • Within the OpenSTEF backtesting pipeline's cross validation, days are randomly divided into folds without taking into account the following temporal limitation: it is impossible in an operational setting to train a model using future days.
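
To illustrate the temporal limitation described in the last bullet, here is a minimal sketch of a forward-chaining split over days, in which every training day precedes every test day. This is not the OpenSTEF implementation; the function names `forward_chaining_folds` and `uses_future_days`, as well as the example dates, are hypothetical and only serve the illustration.

```python
from datetime import date, timedelta


def forward_chaining_folds(days, n_folds, min_train_days=1):
    """Yield (train_days, test_days) pairs where all train days precede all test days."""
    days = sorted(days)
    fold_size = (len(days) - min_train_days) // n_folds
    if fold_size < 1:
        raise ValueError("not enough days for the requested number of folds")
    for k in range(n_folds):
        test_start = min_train_days + k * fold_size
        # The training set grows with every fold; the test block always follows it.
        yield days[:test_start], days[test_start:test_start + fold_size]


def uses_future_days(train_days, test_days):
    """Return True if any training day lies after the earliest test day."""
    return max(train_days) > min(test_days)


# Hypothetical example: two weeks of daily data.
days = [date(2023, 1, 1) + timedelta(days=i) for i in range(14)]
splits = list(forward_chaining_folds(days, n_folds=3, min_train_days=5))
for train, test in splits:
    print(f"train {train[0]}..{train[-1]} -> test {test[0]}..{test[-1]}")

# For contrast: a fold that takes every third day as test data, ignoring time order.
interleaved_test = days[0::3]
interleaved_train = [d for d in days if d not in interleaved_test]
print("interleaved fold trains on future days:",
      uses_future_days(interleaved_train, interleaved_test))
```

A check such as `uses_future_days` could also be run on the folds of any existing cross-validation scheme to flag splits that would not be possible in an operational setting.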

What/How:

  • The purpose of this work item is to explore ways to generalize and improve the testbed setup, addressing the above-mentioned limitations of the existing setup.
    • The most promising direction currently being explored is to use the sktime library and leverage its built-in sliding window functionality. If this works out, it will be easy to test any sktime-compatible model, while it remains easy to test any sklearn-compatible regression model.
  • To reduce complexity at this stage, it is okay/recommended to only consider the 24h ahead forecasting/training horizon.
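
As a rough illustration of the sliding-window idea with a 24h-ahead horizon, the sketch below evaluates interchangeable forecasting pipelines over consecutive windows. It is written in plain Python and does not use sktime's API; the names `sliding_windows`, `evaluate`, `persistence_24h` and `window_mean`, the window sizes, and the synthetic hourly series are all assumptions made for this example.

```python
from statistics import mean


def sliding_windows(n_obs, window_length, horizon, step):
    """Yield (train_idx, test_idx) ranges; the test range directly follows the train range."""
    start = 0
    while start + window_length + horizon <= n_obs:
        split = start + window_length
        yield range(start, split), range(split, split + horizon)
        start += step


def persistence_24h(history, horizon):
    """Baseline pipeline: repeat the last `horizon` observed values."""
    return list(history[-horizon:])


def window_mean(history, horizon):
    """Baseline pipeline: forecast the mean of the training window."""
    return [mean(history)] * horizon


def evaluate(pipeline, series, window_length=72, horizon=24, step=24):
    """Mean absolute error of `pipeline` over all sliding windows."""
    errors = []
    for train_idx, test_idx in sliding_windows(len(series), window_length, horizon, step):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        # The pipeline only ever sees data from before the test window.
        forecast = pipeline(history, horizon)
        errors.extend(abs(f - a) for f, a in zip(forecast, actual))
    return mean(errors)


# Synthetic hourly series with a perfectly repeating daily profile (one week).
series = [100 + 10 * (hour % 24) for hour in range(24 * 7)]
pipelines = {"persistence_24h": persistence_24h, "window_mean": window_mean}
scores = {name: evaluate(fn, series) for name, fn in pipelines.items()}
for name, score in scores.items():
    print(f"{name}: MAE = {score}")
```

Because each pipeline is just a callable taking the history and the horizon, adding another candidate only means adding an entry to `pipelines`. A real testbed would presumably replace these helpers with the library's own splitter and forecaster interfaces.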
@FrankKr
Collaborator

FrankKr commented Apr 3, 2023

@MartijnCa If you could update this description with the latest insights, that would be great!

@MartijnCa changed the title from "Create testbed which takes an sklearn model as input and simple metrics as an output" to "Develop more general testbed than the OpenSTEF backtesting pipeline" on Apr 5, 2023
@MartijnCa changed the title from "Develop more general testbed than the OpenSTEF backtesting pipeline" to "Explore the option to develop a more general testbed than the OpenSTEF backtesting pipeline" on Apr 5, 2023
@MartijnCa
Contributor Author

@FrankKr @wfstoel, I updated the title and the description of the work item. Let me know if you have any remarks!

@MartijnCa
Contributor Author

During the retrospective, we reflected on the project's progress, and we think that in the short term we are best off focusing on #55. Depending on the outcome of that issue, we can re-evaluate whether generalising the testbed is essential during the PoC phase of the project. For now, let's wrap up this exploratory issue on the generalisation of the testbed and document/summarize the results and recommendations.

@MartijnCa
Contributor Author

@wfstoel, quick question: have you finished wrapping this up and documenting the preliminary findings? Let me know, so I can update the status of this issue accordingly and add a link in this issue that points to the resulting documentation.

Projects
Status: 🏗 In progress

3 participants