Scripts for creating a replica of CSIRO's commercially licensed version of the AGCD dataset on NCI

AusClimateService/agcd-csiro

agcd-csiro

This repository contains scripts for creating a replica of CSIRO's commercially licensed version of the AGCD dataset on NCI.

The commercially licensed AGCD dataset is located at /datasets/work/af-cdp/work/agcd/ on Petrichor. (The full catalogue for the Digiscape Climate Data Portal is here.) The replica data is located at /g/data/xv83/agcd-csiro/ on NCI.

Daily data

Data transfer

The data transfer scripts/commands need to be run from Petrichor.

The daily historical data files were transferred using the transfer script:

$ ssh [email protected]
$ git clone https://github.com/AusClimateService/agcd-csiro.git
$ cd agcd-csiro
$ bash transfer_agcd-historical_daily.sh {variable} {nci_username} {nci_password}

The latest daily data (updated daily) can be transferred using scp, for example:

$ scp /datasets/work/af-cdp/work/agcd/climate/tmax.nc [email protected]:/g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20180209-20220402_daily.nc
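
The destination file name in the scp example encodes the variable and the date range covered by the file. Below is a minimal sketch of a helper that builds such a path, assuming the pattern `{variable}_AGCD-CSIRO_r005_{start}-{end}_daily.nc` seen in that example holds for every variable. `daily_replica_path` is a hypothetical helper, not part of this repository.

```python
from datetime import date
from pathlib import PurePosixPath

REPLICA_ROOT = PurePosixPath("/g/data/xv83/agcd-csiro")


def daily_replica_path(variable: str, start: date, end: date) -> str:
    """Build the replica path for a daily file covering start..end (inclusive)."""
    if end < start:
        raise ValueError("end date must not precede start date")
    # Naming pattern copied from the scp example; assumed to apply to all variables.
    name = f"{variable}_AGCD-CSIRO_r005_{start:%Y%m%d}-{end:%Y%m%d}_daily.nc"
    return str(REPLICA_ROOT / variable / "daily" / name)


print(daily_replica_path("tmax", date(2018, 2, 9), date(2022, 4, 2)))
```

The printed path can be used as the scp destination after the `user@host:` prefix.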

Data processing

The latest data has some issues (e.g. duplication of time steps) and overlaps in time with the historical data.

These issues can be addressed by running the processing script on NCI:

$ cd /g/data/xv83/agcd-csiro/replica-code
$ git pull origin main
$ python process_current.py /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20180209-20220402_daily.nc tmax /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20190101-20191231_daily.nc /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20200101-20201231_daily.nc /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20210101-20211231_daily.nc /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20220101-20220402_daily.nc
$ rm /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_20180209-20220402_daily.nc
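
The processing command takes the raw "latest" file and writes one file per calendar year from 2019 onwards. This suggests three steps: drop repeated time steps, discard the period already covered by the historical files, and split the remainder by year. The following is a sketch of that logic on plain dates, not the contents of `process_current.py`. The function name and the `first_new_date` cutoff are assumptions made for illustration.

```python
from datetime import date, timedelta


def clean_and_split(times, first_new_date):
    """Drop duplicate dates and dates before first_new_date, then group by year.

    Returns a dict mapping year -> sorted list of unique dates.
    """
    unique = sorted({t for t in times if t >= first_new_date})
    by_year = {}
    for t in unique:
        by_year.setdefault(t.year, []).append(t)
    return by_year


# Synthetic time axis: 2018-02-09 to 2022-04-02 with one repeated step.
start, end = date(2018, 2, 9), date(2022, 4, 2)
times = [start + timedelta(days=i) for i in range((end - start).days + 1)]
times.insert(500, times[499])  # simulate a duplicated time step

result = clean_and_split(times, date(2019, 1, 1))
for year, days in result.items():
    print(f"tmax_AGCD-CSIRO_r005_{days[0]:%Y%m%d}-{days[-1]:%Y%m%d}_daily.nc")
```

With this synthetic input the printed names match the four yearly output files passed to the processing command.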

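Rechunking

The daily files can then be combined and rechunked into time-chunked and space-chunked Zarr stores (the final argument is a temporary working store):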

$ python to_rechunked_zarr.py /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_*_daily.nc tmax /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_19100101-20220402_daily_time-chunked.zarr /g/data/xv83/agcd-csiro/tmax/daily/tmax_AGCD-CSIRO_r005_19100101-20220402_daily_space-chunked.zarr /g/data/xv83/agcd-csiro/tmax/daily/temporary.zarr
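
The rechunking command writes the same data twice, once as a time-chunked store and once as a space-chunked store. The sketch below gives a rough illustration of why both layouts can be useful by counting how many chunks a read touches under each one. It assumes "time-chunked" means the array is split along the time axis and "space-chunked" means it is split along the spatial axes. The array shape and chunk sizes are made-up round numbers, not the values used by `to_rechunked_zarr.py`.

```python
def chunks_touched(shape, chunks, selection):
    """Count chunks intersected by a rectangular selection.

    shape, chunks: per-dimension sizes; selection: per-dimension (start, stop).
    """
    count = 1
    for size, chunk, (start, stop) in zip(shape, chunks, selection):
        if not 0 <= start < stop <= size:
            raise ValueError("selection out of bounds")
        first = start // chunk
        last = (stop - 1) // chunk
        count *= last - first + 1
    return count


# Hypothetical (time, lat, lon) array and chunk shapes, chosen for illustration.
shape = (40000, 700, 900)
time_chunked = (50, 700, 900)    # split along time: each chunk holds whole maps
space_chunked = (40000, 10, 10)  # split along space: each chunk holds whole series

one_map = ((0, 1), (0, 700), (0, 900))            # a single day, whole domain
one_point = ((0, 40000), (350, 351), (450, 451))  # full series at one grid cell

for label, chunks in [("time-chunked", time_chunked), ("space-chunked", space_chunked)]:
    print(label,
          chunks_touched(shape, chunks, one_map),
          chunks_touched(shape, chunks, one_point))
```

Under these assumptions, reading one map is cheapest from the time-chunked layout and reading one grid-cell time series is cheapest from the space-chunked layout.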

Monthly data

The commercially licensed AGCD dataset does not include monthly historical data. A research-only copy of the monthly historical data (up to mid-2020) is available on NCI in project zv2 (/g/data/zv2/agcd/v2/precip/total/r005/01month).

The latest monthly data (from 2020 or late 2019 onwards) is available as part of the commercial dataset for precipitation, tmax and tmin (there's no monthly vapour pressure data).

Data transfer

The data transfer commands need to be run from Petrichor:

$ scp /datasets/work/af-cdp/work/agcd/tmax/data/IDCKZX1A90_tmax_mean_r005_*.nc [email protected]:/g/data/xv83/agcd-csiro/tmax/monthly
$ scp /datasets/work/af-cdp/work/agcd/tmin/data/IDCKZN1A90_tmin_mean_r005_*.nc [email protected]:/g/data/xv83/agcd-csiro/tmin/monthly
$ scp /datasets/work/af-cdp/work/agcd/precip/data/IDCK2R1AT0_precip_total_r005_*.nc [email protected]:/g/data/xv83/agcd-csiro/precip/monthly
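
The three transfer commands differ only in the variable name, product ID and statistic. A small sketch that generates them from a lookup table is shown here. The table is taken from the commands above. `monthly_scp_command` and the `user@nci-host` placeholder are hypothetical, and the real NCI login must be supplied by the caller.

```python
MONTHLY_PRODUCTS = {
    # variable: (product ID, statistic), copied from the scp commands
    "tmax": ("IDCKZX1A90", "mean"),
    "tmin": ("IDCKZN1A90", "mean"),
    "precip": ("IDCK2R1AT0", "total"),
}

SOURCE_ROOT = "/datasets/work/af-cdp/work/agcd"
REPLICA_ROOT = "/g/data/xv83/agcd-csiro"


def monthly_scp_command(variable, remote):
    """Return the scp command line (as a string) for one variable's monthly files.

    remote is the NCI login in user@host form.
    """
    product_id, statistic = MONTHLY_PRODUCTS[variable]
    source = f"{SOURCE_ROOT}/{variable}/data/{product_id}_{variable}_{statistic}_r005_*.nc"
    destination = f"{remote}:{REPLICA_ROOT}/{variable}/monthly"
    return f"scp {source} {destination}"


for variable in MONTHLY_PRODUCTS:
    print(monthly_scp_command(variable, "user@nci-host"))  # placeholder login
```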

For the precipitation data, IDCK2R1AT0 appears to be v2 and IDCKZR1AT0 appears to be v1, although the values do not exactly match the data in zv2. (See precip_comparison.ipynb for details.)

Data processing

The files then simply need to be merged using cdo. For example:

$ cdo mergetime /g/data/xv83/agcd-csiro/tmax/monthly/IDCKZX1A90_tmax_mean_r005_2021*.nc /g/data/xv83/agcd-csiro/tmax/monthly/agcd_v1_tmax_mean_r005_monthly_2021.nc
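
To repeat the merge for several years or variables, the cdo call can be assembled in Python. This is a sketch only: `build_mergetime_command` is a hypothetical helper, the output name follows the `agcd_v1_..._monthly_{year}.nc` pattern from the example above, and the demonstration runs on empty placeholder files whose date suffixes are invented. The wildcard is expanded with `glob` because a command passed as an argument list is not expanded by a shell.

```python
import glob
import os
import shlex
import tempfile


def build_mergetime_command(monthly_dir, product_id, variable, statistic, year):
    """Return the argument list for `cdo mergetime` over one year of monthly files."""
    pattern = os.path.join(
        monthly_dir, f"{product_id}_{variable}_{statistic}_r005_{year}*.nc"
    )
    inputs = sorted(glob.glob(pattern))
    if not inputs:
        raise FileNotFoundError(f"no files match {pattern}")
    # Output name pattern copied from the example command (assumed to generalise).
    output = os.path.join(
        monthly_dir, f"agcd_v1_{variable}_{statistic}_r005_monthly_{year}.nc"
    )
    return ["cdo", "mergetime", *inputs, output]


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for suffix in ("20210101", "20210201"):  # invented suffixes for the demo
        open(os.path.join(tmp, f"IDCKZX1A90_tmax_mean_r005_{suffix}.nc"), "w").close()
    cmd = build_mergetime_command(tmp, "IDCKZX1A90", "tmax", "mean", 2021)
    print(shlex.join(cmd))
```

On NCI the returned list could be passed to `subprocess.run(cmd, check=True)`, with `monthly_dir` set to the real monthly directory and cdo available on the PATH.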
