
Merge pull request #26 from climate-resource/mlo-ch4-data-sources
Mlo ch4 data sources
znichollscr authored Apr 16, 2024
2 parents c357b9f + 943f559 commit 37648ef
Showing 22 changed files with 1,638 additions and 20 deletions.
3 changes: 3 additions & 0 deletions Makefile
@@ -40,6 +40,9 @@ all-ci: $(CI_CONFIG_ABSOLUTE_YAML) ## compile all outputs using the CI run-id
all-dev: $(DEV_CONFIG_ABSOLUTE_YAML) ## compile all outputs using the dev run-id
DOIT_CONFIGURATION_FILE=$(DEV_CONFIG_ABSOLUTE_YAML) DOIT_RUN_ID=$(DEV_RUN_ID) DOIT_DB_BACKEND=$(DEV_BACKEND) DOIT_DB_FILE=$(DEV_BACKEND_FILE) poetry run doit run --verbosity=2

all-dev-parallel: $(DEV_CONFIG_ABSOLUTE_YAML) ## compile all outputs using the dev run-id, running tasks in parallel (up to 6 at once)
DOIT_CONFIGURATION_FILE=$(DEV_CONFIG_ABSOLUTE_YAML) DOIT_RUN_ID=$(DEV_RUN_ID) DOIT_DB_BACKEND=$(DEV_BACKEND) DOIT_DB_FILE=$(DEV_BACKEND_FILE) poetry run doit run --verbosity=2 -n 6

all-debug-dev: $(DEV_CONFIG_ABSOLUTE_YAML) ## compile all outputs using the dev run-id, dropping into the debugger on failure
DOIT_CONFIGURATION_FILE=$(DEV_CONFIG_ABSOLUTE_YAML) DOIT_RUN_ID=$(DEV_RUN_ID) DOIT_DB_BACKEND=$(DEV_BACKEND) DOIT_DB_FILE=$(DEV_BACKEND_FILE) poetry run doit run --pdb

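For context on the new target: the trailing `-n 6` is doit's standard flag for running up to six independent tasks in parallel. A minimal sketch of that behaviour with a hypothetical `dodo.py` (not this repository's real task definitions):

# dodo.py -- hypothetical example, not this repository's task file
# Two independent tasks; `doit run -n 2` may execute them concurrently,
# so the whole run takes roughly as long as the slowest task.


def task_download_a():
    return {"actions": ["sleep 2 && echo 'a done'"], "verbosity": 2}


def task_download_b():
    return {"actions": ["sleep 2 && echo 'b done'"], "verbosity": 2}

Tasks that declare `task_dep` or `file_dep` prerequisites still wait for those to finish, so parallelism only helps where the task graph allows it.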
34 changes: 14 additions & 20 deletions TODO.md
@@ -9,10 +9,19 @@ As the to-do's become concrete, take them out and turn them into [issues](https:

- Download NEEM into same style as NOAA and AGAGE

- Download CH4 budget into same style as NOAA and AGAGE
- https://www.icos-cp.eu/GCP-CH4-2019

- Download Scripps merged data into same style as NOAA and AGAGE
  - probably don't use, as I think we have all the raw data from elsewhere (although I am puzzled why we don't have MLO pre-1968 in other sources)
- https://scrippsco2.ucsd.edu/assets/data/atmospheric/merged_ice_core_mlo_spo/spline_merged_ice_core_yearly.csv
- get permissions etc. from here: https://keelingcurve.ucsd.edu/permissions-and-data-sources/
  - unclear exactly what this product is
- why don't measurements pre-1968 appear in other networks?
- are all their stations in e.g. NOAA, or only some?
  - should we be using Scripps' merged ice core product and their spline?
- need to ask Paul/Peter

- Add checking of data formats before trying to process everything

- reach out to AGAGE authors to ask if processing has made the right choice re polluted and unpolluted
- Malte's paper used polluted
@@ -196,27 +205,12 @@ Off the table improvements for now:
- may also be relevant: https://stackoverflow.com/questions/32418045/running-python-code-by-clicking-a-button-in-bokeh?rq=4
- probably also google 'bokeh update plot on click'

### Data sources

- NOAA: done
- AGAGE: done

- Law dome ice core
- https://data.csiro.au/collection/csiro:37077
- EPICA Dronning Maud Land ice core
- https://doi.pangaea.de/10.1594/PANGAEA.552232
- https://doi.pangaea.de/10.1594/PANGAEA.552232?format=textfile
- NEEM methane
- https://doi.pangaea.de/10.1594/PANGAEA.899040
- start with outliers removed (https://doi.pangaea.de/10.1594/PANGAEA.899037), then ask providers
- https://doi.pangaea.de/10.1594/PANGAEA.899037?format=textfile
- Scripps
  - unclear exactly what this product is
- why don't measurements pre-1968 appear in other networks?
- are all their stations in e.g. NOAA, or only some?
  - should we be using Scripps' merged ice core product and their spline?
### Data sources to follow up on

- UCI network? (I emailed them on 2024-03-27 asking about data access)

- All the data sources for gases other than CO2, CH4 and N2O (that aren't AGAGE)

- Full list of data sources is in Table 12 of M17
- https://gmd.copernicus.org/articles/10/2057/2017/gmd-10-2057-2017.pdf

99 changes: 99 additions & 0 deletions dev-config.yaml
@@ -110,6 +110,105 @@ retrieve_and_process_law_dome_data:
data/raw/law_dome/data/Law_Dome_GHG_2000years.xlsx: f7dd24e36565b2e213b20f90c88c990e
processed_data_with_loc_file: data/interim/law_dome/law_dome_with_location.csv

retrieve_and_process_scripps_data:
- step_config_id: "only"
merged_ice_core_data:
known_hash: 7a89e63bc92bd0666058e627d01be9698bf0ae0ae0f5c1764ea0ad32002b21aa
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/merged_ice_core_mlo_spo/spline_merged_ice_core_yearly.csv
merged_ice_core_data_processed_data_file: data/interim/scripps/merged_ice_core.csv
station_data:
- station_code: mlo
lat: 19.5 N
lon: 155.6 W
url_source:
known_hash: 3883b992cbef27e9346bce8e5f7a7fefd479b151dbba4b9471d0ed499a7d94bf
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/in_situ_co2/monthly/monthly_in_situ_co2_mlo.csv
- station_code: mlo
lat: 19.5 N
lon: 155.6 W
url_source:
known_hash: bf4b0cd8f80ab41d216cc09318a6fb65ac0450c0e4005740cf76a65451b2dab8
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_mlo.csv
- station_code: alt
lat: 82.3 N
lon: 62.3 W
url_source:
known_hash: 90f36c9ad1b9f853e0e05fa99a858206dbcdc449100d46ce87dc1dc32cd768fd
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_alt.csv
- station_code: ptb
lat: 71.3 N
lon: 156.6 W
url_source:
known_hash: 20293a1820fe896cc44d86d9fde1aa19562eccad08d216052ee2f0761de827e8
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/merged_in_situ_and_flask/monthly/monthly_merge_co2_ptb.csv
- station_code: ljo
lat: 32.9 N
lon: 117.3 W
url_source:
known_hash: c77ca1ef1f34b3b2193480c5b92857ff09acbbc95b4d717fc1499ec4eb311b8b
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/merged_in_situ_and_flask/monthly/monthly_merge_co2_ljo.csv
- station_code: kum
lat: 19.5 N
lon: 154.8 W
url_source:
known_hash: 81d3cfdbc04f6ec508d5fc3b44f505b5825f49b37bd045acd3aa406b66a198b0
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_kum.csv
- station_code: fan
lat: 3.9 N
lon: 159.4 W
url_source:
known_hash: 46f5b4def0f51ac0c355b3fc93f69df95c06fb06fd23705bb9e08f8ebb86cb46
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_fan.csv
- station_code: chr
lat: 2.0 N
lon: 157.3 W
url_source:
known_hash: 640918d8d8117a551d2530ad1f1559714a530fb4d727115a552e00c92fc24d83
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_chr.csv
- station_code: sam
lat: 14.2 S
lon: 170.6 W
url_source:
known_hash: 4155ff5a75df75b05f1c4883310be4a37ed878af67f819368cc8968ed1667c53
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_sam.csv
- station_code: ker
lat: 29.2 S
lon: 177.9 W
url_source:
known_hash: cc2e75642bc76538eb671ffe9cec55a34c467cfa2ba21ee22b2ce8b4636960e6
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_ker.csv
- station_code: spo
lat: 90.0 S
lon: 180.0 W
url_source:
known_hash: e2e893108f8806a94ba7e5cccd200b6f85b0b4bab2dade2e43798cb4f0059e7a
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/merged_in_situ_and_flask/monthly/monthly_merge_co2_spo.csv
- station_code: nzd
lat: 41.4 S
lon: 174.9 E
url_source:
known_hash: 7efbdbf808857d7c127747f921c7fccb8a474f3a77b76df5446006e2ede86c74
url: https://scrippsco2.ucsd.edu/assets/data/atmospheric/stations/flask_co2/monthly/monthly_flask_co2_nzd.csv

raw_dir: data/raw/scripps
processed_data_with_loc_file: data/interim/scripps/monthly.csv

retrieve_and_process_epica_data:
- step_config_id: "only"
raw_dir: data/raw/epica
download_url:
known_hash: 26c9259d69bfe390f521d1f651de8ea37ece5bbb95b43df749ba4e00f763e9fd
url: https://doi.pangaea.de/10.1594/PANGAEA.552232?format=textfile
processed_data_with_loc_file: data/interim/epica/epica_with_location.csv

retrieve_and_process_neem_data:
- step_config_id: "only"
raw_dir: data/raw/neem
download_url:
known_hash: 3b57ca16db32f729a414422347f9292f2083c8d602f1f13d47a7fe7709d63d2d
url: https://doi.pangaea.de/10.1594/PANGAEA.899039?format=textfile
processed_data_with_loc_file: data/interim/neem/neem_with_location.csv

plot_input_data_overviews:
- step_config_id: only
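One note on the `station_data` entries above: `lat`/`lon` are stored as strings with a hemisphere letter (e.g. `19.5 N`, `155.6 W`). A minimal sketch of turning those into signed decimal degrees, assuming that is what the location-aware output (`processed_data_with_loc_file`) needs; the helper below is hypothetical, not the repository's actual processing code:

def hemisphere_to_signed_degrees(value: str) -> float:
    """Convert e.g. '19.5 N' or '155.6 W' to signed decimal degrees."""
    magnitude, hemisphere = value.split()
    sign = -1.0 if hemisphere.upper() in ("S", "W") else 1.0
    return sign * float(magnitude)


# Mauna Loa, as configured above
assert hemisphere_to_signed_degrees("19.5 N") == 19.5
assert hemisphere_to_signed_degrees("155.6 W") == -155.6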

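The `retrieve_and_process_epica_data` and `retrieve_and_process_neem_data` entries both use PANGAEA's `?format=textfile` export, which is typically a tab-separated table preceded by a `/* ... */` metadata block. A rough sketch of taking a quick look at one of them (not the repository's processing code; column names differ per dataset, so check the header before relying on them):

from io import StringIO
from pathlib import Path

import pandas as pd
import pooch

# EPICA Dronning Maud Land URL from the config above
path = pooch.retrieve(
    url="https://doi.pangaea.de/10.1594/PANGAEA.552232?format=textfile",
    known_hash=None,  # skip hash verification for an ad-hoc look
)
raw = Path(path).read_text()
# Strip the leading /* ... */ metadata block if present, then parse the
# remaining tab-separated table.
table = raw.split("*/", 1)[-1].lstrip() if "*/" in raw else raw
df = pd.read_csv(StringIO(table), sep="\t")
print(df.head())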
89 changes: 89 additions & 0 deletions notebooks/004y_process-scripps-data/0040_download-scripps.py
@@ -0,0 +1,89 @@
# ---
# jupyter:
# jupytext:
# text_representation:
# extension: .py
# format_name: percent
# format_version: '1.3'
# jupytext_version: 1.16.1
# kernelspec:
# display_name: Python 3 (ipykernel)
# language: python
# name: python3
# ---

# %% [markdown]
# # Scripps - download
#
# Download data from the [Scripps CO$_2$ program](https://scrippsco2.ucsd.edu/).

# %% [markdown]
# ## Imports

# %% editable=true slideshow={"slide_type": ""}
import openscm_units
import pint
import pooch
from pydoit_nb.checklist import generate_directory_checklist
from pydoit_nb.config_handling import get_config_for_step_id

from local.config import load_config_from_file

# %%
pint.set_application_registry(openscm_units.unit_registry) # type: ignore

# %% [markdown] editable=true slideshow={"slide_type": ""}
# ## Define branch this notebook belongs to

# %% editable=true slideshow={"slide_type": ""}
step: str = "retrieve_and_process_scripps_data"

# %% [markdown] editable=true slideshow={"slide_type": ""}
# ## Parameters

# %% editable=true slideshow={"slide_type": ""} tags=["parameters"]
config_file: str = "../../dev-config-absolute.yaml" # config file
step_config_id: str = "only" # config ID to select for this branch

# %% [markdown] editable=true slideshow={"slide_type": ""}
# ## Load config

# %% editable=true slideshow={"slide_type": ""}
config = load_config_from_file(config_file)
config_step = get_config_for_step_id(
config=config, step=step, step_config_id=step_config_id
)

# %% [markdown]
# ## Action

# %% [markdown]
# ### Download merged ice core data
#
# We probably won't use this directly, but it is handy to have as a comparison point.

# %%
pooch.retrieve(
url=config_step.merged_ice_core_data.url,
known_hash=config_step.merged_ice_core_data.known_hash,
fname=config_step.merged_ice_core_data.url.split("/")[-1],
path=config_step.raw_dir,
progressbar=True,
)

# %% [markdown]
# ### Download station data

# %%
for scripps_source in config_step.station_data:
outfile = pooch.retrieve(
url=scripps_source.url_source.url,
known_hash=scripps_source.url_source.known_hash,
fname=scripps_source.url_source.url.split("/")[-1],
path=config_step.raw_dir,
progressbar=True,
)
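    # Sanity check: the Scripps filenames embed the station code, so the retrieved path should contain it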
assert scripps_source.station_code in outfile

# %%
generate_directory_checklist(config_step.raw_dir)
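If Scripps updates a file upstream, the `pooch.retrieve` calls above will fail with a hash mismatch. A small sketch for recomputing the sha256 digests so the `known_hash` values in `dev-config.yaml` can be refreshed; it assumes pooch's `file_hash` helper and is not part of this PR:

# %%
# Hypothetical maintenance cell, not part of this notebook: recompute digests
# for the already-downloaded files so dev-config.yaml can be updated.
from pathlib import Path

for csv_file in sorted(Path(config_step.raw_dir).glob("*.csv")):
    print(csv_file.name, pooch.file_hash(str(csv_file), alg="sha256"))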