ClimateFiller is a Python framework that implements data-driven methods to simplify working with in-situ climate time series. It offers services such as (1) automated gap-filling, (2) bias correction using machine learning and ERA5-Land, and (3) outlier detection and elimination using Isolation Forest, Local Outlier Factor, and quantiles. It was tested on several Automatic Weather Stations (AWS) installed in Morocco.
DISCLAIMER: This project is currently in the BETA stage and will remain experimental for the foreseeable future. As a result, class names, method names, and other functionality are likely to change.
- Obtain a Climate Data Store (CDS) API key:
To access the Climate Data Store API, register and obtain an API key from the CDS website: https://cds.climate.copernicus.eu/user/register
Go to your profile and copy your key from the API key section.
- Configure the API key:
Once you have the API key, create a file named .cdsapirc in the project's root directory and add the API key to it, replacing YOUR_CDS_API_KEY with your actual CDS API key:
{
"url": "https://cds.climate.copernicus.eu/api/v2",
"key": "YOUR_CDS_API_KEY"
}
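Before running the framework, it can help to confirm that the file parses and that the placeholder was replaced. The following is a minimal sketch, assuming the file uses the JSON layout shown above; the helper name `load_cds_config` is hypothetical and not part of ClimateFiller.

```python
import json
from pathlib import Path


def load_cds_config(path=".cdsapirc"):
    """Load and sanity-check a JSON-formatted .cdsapirc file.

    Hypothetical helper for illustration; not part of ClimateFiller.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(config, dict):
        raise ValueError(f"{path} must contain a JSON object")
    # Both fields shown in the example configuration are required
    for field in ("url", "key"):
        if not config.get(field):
            raise ValueError(f"Missing '{field}' in {path}")
    # Catch the common mistake of leaving the placeholder in place
    if config["key"] == "YOUR_CDS_API_KEY":
        raise ValueError("Replace YOUR_CDS_API_KEY with your actual CDS API key")
    return config
```

Calling `load_cds_config()` from the project's root directory returns the parsed dictionary, or raises `ValueError` if a field is missing or the placeholder key is still present.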
- Sign up for Google Earth Engine:
To use Google Earth Engine, sign up for an Earth Engine account at https://earthengine.google.com/signup/
- Download and install the Google Cloud SDK:
https://cloud.google.com/sdk/docs/install
- Create a new project.
- Authenticate the Earth Engine API by running the following in a terminal:
earthengine authenticate
Follow the instructions to authorize the Earth Engine API with your Google account. This method lets you access the API without having to request authorization every time.
- Clone this GitHub repository to your local machine using the following command:
git clone https://github.com/elhachimi-ch/climatefiller.git
- Install project dependencies using conda (preferred):
conda env create -f environment.yml
conda activate climate_filler
python kitchen.py
Alternatively, install project dependencies using pip (not recommended):
pip install -r requirements.txt
from data_science_toolkit.dataframe import DataFrame
from climatefiller import ClimateFiller

# Read the time series
data = DataFrame("data/los_angeles_sair_temperature.csv")

# Rename the target column
data.rename_columns({'air_temperature': 'Ta'})

# Initialize the ClimateFiller object
climate_filler = ClimateFiller(data.get_dataframe(), data_type='df', datetime_column_name='datetime')

# Replace missing values with 0
climate_filler.missing_data(filling_dict_colmn_val={'Ta': 0})

# Detect and eliminate outliers
climate_filler.eliminate_outliers('Ta')
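To illustrate the idea behind quantile-based outlier elimination, here is a minimal sketch in plain Python. It is not ClimateFiller's implementation: the function name `drop_quantile_outliers`, the 1.5 × IQR fence rule, and the sample readings are assumptions chosen for illustration.

```python
import statistics


def drop_quantile_outliers(values, k=1.5):
    """Keep only values inside the fences [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; not part of ClimateFiller.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Made-up air temperature readings (°C) containing one sensor spike
readings = [18.2, 18.9, 19.4, 20.1, 19.7, 85.0, 19.0, 18.5]
cleaned = drop_quantile_outliers(readings)
print(cleaned)
```

The sketch expects a list of numbers without gaps, so missing values should be filled or dropped first. A larger `k` widens the fences and removes fewer points.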
More information can be found on the ClimateFiller framework documentation site.
Contributions and suggestions are welcome via GitHub Pull Requests.
We're actively enhancing the repo with new algorithms.
The research paper for the project is under review.