AWS Data Clients #3000

Draft
wants to merge 4 commits into
base: main
38 changes: 38 additions & 0 deletions examples/remote/basic.py
@@ -0,0 +1,38 @@
# Copyright (c) 2023 MetPy Developers.
# Distributed under the terms of the BSD 3-Clause License.
# SPDX-License-Identifier: BSD-3-Clause
"""
==================
Remote Data Access
==================

Use MetPy to access data hosted in known AWS S3 buckets
"""
from datetime import datetime, timedelta

from metpy.remote import GOESArchive, NEXRADLevel2Archive, NEXRADLevel3Archive

###################
# NEXRAD Level 2

# Get the nearest product to a time
prod = NEXRADLevel2Archive().get_product('KTLX', datetime(2013, 5, 22, 21, 53))

# Open using MetPy's Level2File class
l2 = prod.parse()

###################
# NEXRAD Level 3
start = datetime(2022, 10, 30, 15)
end = start + timedelta(hours=2)
products = NEXRADLevel3Archive().get_range('FTG', 'N0B', start, end)

# Get all the file names--could also get a file-like object or open with MetPy Level3File
print([prod.name for prod in products])

################
# GOES Archives
prod = GOESArchive(16).get_product('ABI-L1b-RadC', datetime.utcnow(), channel=2)

# Retrieve using xarray + netcdf-c's S3 support
nc = prod.parse()
20 changes: 20 additions & 0 deletions src/metpy/remote/__init__.py
@@ -0,0 +1,20 @@
# Copyright (c) 2015,2016,2018,2021 MetPy Developers.
# Distributed under the terms of the BSD 3-Clause License.
# SPDX-License-Identifier: BSD-3-Clause
"""Tools for accessing data hosted remotely.

Clients in this module find products within known collections of data, such as the
NEXRAD Level 2, NEXRAD Level 3, and GOES archives hosted in Amazon Web Services (AWS) S3
buckets. Each product that is found can be downloaded to a local file, accessed as a
file-like object, or parsed using the appropriate reader.

Requests to the buckets are unsigned, so no AWS credentials are required; ``boto3``
must be installed.
"""

from .aws import * # noqa: F403

Check notice

Code scanning / CodeQL

'import *' may pollute namespace

Import pollutes the enclosing namespace, as the imported module [metpy.remote.aws](1) does not define '__all__'.
from ..package_tools import set_module

__all__ = aws.__all__[:] # pylint: disable=undefined-variable

set_module(globals())

232 changes: 232 additions & 0 deletions src/metpy/remote/aws.py
@@ -0,0 +1,232 @@
# Copyright (c) 2023 MetPy Developers.
# Distributed under the terms of the BSD 3-Clause License.
# SPDX-License-Identifier: BSD-3-Clause
"""Tools for reading known collections of data that are hosted on Amazon Web Services (AWS).

"""
import bisect
from datetime import datetime, timedelta
import itertools
from pathlib import Path
import shutil

import boto3
import botocore
from botocore.client import Config
import xarray as xr

from ..io import Level2File, Level3File
from ..package_tools import Exporter

exporter = Exporter(globals())


class Product:
    def __init__(self, obj, reader):
        self.path = obj.key
        self._obj = obj
        self._reader = reader

    @property
    def url(self):
        return f'https://{self._obj.Bucket().name}.s3.amazonaws.com/{self.path}'

    @property
    def name(self):
        return Path(self.path).name

    @property
    def file(self):
        return self._obj.get()['Body']

    def download(self, path=None):
        if path is None:
            path = Path() / self.name
        elif (path := Path(path)).is_dir():
            path = path / self.name
        else:
            path = Path(path)

        with open(path, 'wb') as outfile:
            shutil.copyfileobj(self.file, outfile)

    def parse(self):
        return self._reader(self)
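
The path handling in `Product.download` can be exercised on its own, without touching S3. The following is only a sketch: `resolve_download_path` is a hypothetical helper (not part of this PR) that mirrors the three branches above, and the file name is made up for illustration.

```python
from pathlib import Path
import tempfile


def resolve_download_path(name, path=None):
    """Mirror the branches in Product.download (hypothetical helper, not in the PR)."""
    if path is None:
        # No path given: write into the current directory under the product's name
        return Path() / name
    path = Path(path)
    if path.is_dir():
        # Existing directory: keep the product's own file name
        return path / name
    # Anything else is treated as the full destination file path
    return path


# Illustrative file name only
name = 'KTLX20130522_215300_V06.gz'

with tempfile.TemporaryDirectory() as tmp:
    in_dir = resolve_download_path(name, tmp)
    explicit = resolve_download_path(name, Path(tmp) / 'scan.gz')
default = resolve_download_path(name)
```

Passing a directory keeps the remote file name, while any other path is used verbatim as the destination.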


def date_iterator(start, end, **step_kw):
    while start < end:
        yield start
        start = start + timedelta(**step_kw)
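
One behavior that may be worth checking in review: `date_iterator` steps from `start` itself rather than from midnight, so with `days=1` a window that crosses midnight appears to visit only the first day. The sketch below copies `date_iterator` from the diff and compares it with `day_starts`, a hypothetical day-aligned variant that is not part of this PR.

```python
from datetime import datetime, timedelta


def date_iterator(start, end, **step_kw):
    """Yield datetimes from start (inclusive) up to end (exclusive)."""
    while start < end:
        yield start
        start = start + timedelta(**step_kw)


def day_starts(start, end):
    """Yield midnight of each calendar day touched by [start, end) (hypothetical)."""
    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
    yield from date_iterator(midnight, end, days=1)


start = datetime(2022, 10, 30, 23)
end = start + timedelta(hours=2)  # 01:00 on 31 October

stepped = list(date_iterator(start, end, days=1))  # only 30 October is visited
aligned = list(day_starts(start, end))  # 30 and 31 October are both visited
```

If the per-day prefixes in `get_range` are meant to cover every day in the window, something along the lines of `day_starts` might be needed; that is an assumption about intent, not something the PR states.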


class S3DataStore:
    s3 = boto3.resource('s3', config=Config(signature_version=botocore.UNSIGNED,
                                            user_agent_extra='Resource'))

    def __init__(self, bucket_name, delimiter):
        self.bucket_name = bucket_name
        self.delimiter = delimiter
        self._bucket = self.s3.Bucket(bucket_name)

    def common_prefixes(self, prefix, delim=None):
        delim = delim or self.delimiter
        try:
            return (p['Prefix'] for p in
                    self._bucket.meta.client.list_objects_v2(
                        Bucket=self.bucket_name, Prefix=prefix,
                        Delimiter=delim)['CommonPrefixes'])
        except KeyError:
            return []

    def objects(self, prefix):
        return self._bucket.objects.filter(Prefix=prefix)

    def _build_result(self, obj):
        return Product(obj, lambda s: None)
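
For readers less familiar with S3 listings, `common_prefixes` relies on the service rolling keys up to the first delimiter that follows the requested prefix. A minimal in-memory emulation of that grouping is sketched below; the function and the keys are illustrative stand-ins, not the boto3 API and not real bucket contents.

```python
def common_prefixes(keys, prefix, delim):
    """Emulate S3 'CommonPrefixes' grouping over an in-memory list of keys."""
    found = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        # Look for the first delimiter after the requested prefix
        end = key.find(delim, len(prefix))
        if end == -1:
            continue  # no delimiter left: S3 would list this key under 'Contents'
        rolled_up = key[:end + len(delim)]
        if rolled_up not in found:
            found.append(rolled_up)
    return found


# Hypothetical keys laid out like the Level 2 bucket (year/month/day/site/file)
keys = ['2013/05/22/KTLX/KTLX20130522_215300_V06.gz',
        '2013/05/22/KTLX/KTLX20130522_215800_V06.gz',
        '2013/05/22/KINX/KINX20130522_215500_V06.gz',
        '2013/05/23/KTLX/KTLX20130523_000100_V06.gz']

# Same extraction as NEXRADLevel2Archive.sites: second-to-last path component
sites = [p.split('/')[-2] for p in common_prefixes(keys, '2013/05/22/', '/')]
```

When nothing matches, the real response has no `CommonPrefixes` entry, which is the case the `except KeyError` branch covers.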


@exporter.export
class NEXRADLevel3Archive(S3DataStore):
    def __init__(self):
        super().__init__('unidata-nexrad-level3', '_')

    def sites(self):
        """Return sites available."""
        return (item.rstrip(self.delimiter) for item in self.common_prefixes(''))

    def product_ids(self, site='TLX'):
        """Return product_ids available.

        Takes a site, defaults to TLX.
        """
        return (item.split(self.delimiter)[-2] for item in
                self.common_prefixes(f'{site}{self.delimiter}'))

    def build_key(self, site, prod_id, dt, depth=None):
        parts = [site, prod_id, f'{dt:%Y}', f'{dt:%m}', f'{dt:%d}', f'{dt:%H}', f'{dt:%M}',
                 f'{dt:%S}']
        return self.delimiter.join(parts[slice(0, depth)])

Check failure

Code scanning / CodeQL

Unhashable object hashed

This [instance](1) of [slice](2) is unhashable.

    def dt_from_key(self, key):
        return datetime.strptime(key.split(self.delimiter, maxsplit=2)[-1],
                                 '%Y_%m_%d_%H_%M_%S')

    def get_range(self, site, prod_id, start, end):
        for dt in date_iterator(start, end, days=1):
            for obj in self.objects(self.build_key(site, prod_id, dt, depth=5)):
                if start <= self.dt_from_key(obj.key) < end:
                    yield self._build_result(obj)

    def get_product(self, site, prod_id, dt):
        search_key = self.build_key(site, prod_id, dt)
        bounding_keys = [self.build_key(site, prod_id, dt, 2) + self.delimiter]
        for depth in range(3, 8):
            prefixes = list(itertools.chain(*(self.common_prefixes(b) for b in bounding_keys)))
            loc = bisect.bisect_left(prefixes, search_key)
            rng = slice(loc - 1, loc + 1) if loc else slice(0, 1)
            bounding_keys = prefixes[rng]

Check failure

Code scanning / CodeQL

Unhashable object hashed

This [instance](1) of [slice](2) is unhashable. This [instance](3) of [slice](2) is unhashable.

        min_obj = min(itertools.chain(*(self.objects(p) for p in bounding_keys)),
                      key=lambda o: abs((self.dt_from_key(o.key) - dt).total_seconds()))

        return self._build_result(min_obj)

    def _build_result(self, obj):
        return Product(obj, lambda s: Level3File(s.file))
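
The Level 3 key layout can be checked with a small round trip. This sketch lifts `build_key` and `dt_from_key` out of the class as plain functions (an adaptation for illustration) and assumes that site and product IDs contain no underscores.

```python
from datetime import datetime

DELIM = '_'


def build_key(site, prod_id, dt, depth=None):
    """Join site, product, and time fields; depth truncates to a shorter prefix."""
    parts = [site, prod_id, f'{dt:%Y}', f'{dt:%m}', f'{dt:%d}', f'{dt:%H}', f'{dt:%M}',
             f'{dt:%S}']
    return DELIM.join(parts[:depth])


def dt_from_key(key):
    """Parse everything after the site and product fields as the timestamp."""
    return datetime.strptime(key.split(DELIM, maxsplit=2)[-1], '%Y_%m_%d_%H_%M_%S')


dt = datetime(2022, 10, 30, 15, 4, 9)
full_key = build_key('FTG', 'N0B', dt)  # all eight fields
day_prefix = build_key('FTG', 'N0B', dt, depth=5)  # site, product, year, month, day
round_trip = dt_from_key(full_key)
```

The `depth=5` prefix is the one `get_range` uses to list a single day of products.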


@exporter.export
class NEXRADLevel2Archive(S3DataStore):
    def __init__(self):
        super().__init__('noaa-nexrad-level2', '/')

    def sites(self, dt=None):
        """Return sites available for a date."""
        if dt is None:
            dt = datetime.utcnow()
        prefix = self.build_key('', dt, depth=3) + self.delimiter
        return (item.split('/')[-2] for item in self.common_prefixes(prefix))

    def build_key(self, site, dt, depth=None):
        parts = [f'{dt:%Y}', f'{dt:%m}', f'{dt:%d}', site, f'{site}{dt:%Y%m%d_%H%M%S}']
        return self.delimiter.join(parts[slice(0, depth)])

Check failure

Code scanning / CodeQL

Unhashable object hashed

This [instance](1) of [slice](2) is unhashable.

    def dt_from_key(self, key):
        return datetime.strptime(key.rsplit(self.delimiter, maxsplit=1)[-1][4:19],
                                 '%Y%m%d_%H%M%S')

    def get_range(self, site, start, end):
        for dt in date_iterator(start, end, days=1):
            for obj in self.objects(self.build_key(site, dt, depth=4)):
                try:
                    if start <= self.dt_from_key(obj.key) < end:
                        yield self._build_result(obj)
                except ValueError:
                    continue

    def get_product(self, site, dt):
        search_key = self.build_key(site, dt)
        prefix = search_key.split('_')[0]
        min_obj = min(self.objects(prefix),
                      key=lambda o: abs((self.dt_from_key(o.key) - dt).total_seconds()))

        return self._build_result(min_obj)

    def _build_result(self, obj):
        return Product(obj, lambda s: Level2File(s.file))
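
The nearest-in-time selection in `get_product` comes down to a `min` over the absolute time difference. A standalone sketch, using made-up keys in the Level 2 naming style and plain strings in place of S3 objects:

```python
from datetime import datetime


def dt_from_key(key):
    # File names look like <site><YYYYmmdd>_<HHMMSS>..., so skip the 4-character site ID
    return datetime.strptime(key.rsplit('/', maxsplit=1)[-1][4:19], '%Y%m%d_%H%M%S')


def nearest_key(keys, dt):
    """Return the key whose timestamp is closest to dt (hypothetical helper)."""
    return min(keys, key=lambda k: abs((dt_from_key(k) - dt).total_seconds()))


# Illustrative keys only; real listings come from the bucket
keys = ['2013/05/22/KTLX/KTLX20130522_214612_V06.gz',
        '2013/05/22/KTLX/KTLX20130522_215036_V06.gz',
        '2013/05/22/KTLX/KTLX20130522_215501_V06.gz']

best = nearest_key(keys, datetime(2013, 5, 22, 21, 53))
```

Note that `min` raises `ValueError` on an empty sequence, so a prefix that matches no objects would surface as that exception rather than as a "not found" result.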


@exporter.export
class GOESArchive(S3DataStore):
    def __init__(self, satellite):
        super().__init__(f'noaa-goes{satellite}', delimiter='/')

    def product_ids(self):
        """Return product_ids available."""
        return (item.rstrip(self.delimiter) for item in self.common_prefixes(''))

    def build_key(self, product, dt, depth=None):
        parts = [product, f'{dt:%Y}', f'{dt:%j}', f'{dt:%H}', f'OR_{product}']
        return self.delimiter.join(parts[slice(0, depth)])

Check failure

Code scanning / CodeQL

Unhashable object hashed

This [instance](1) of [slice](2) is unhashable.

    def _subprod_prefix(self, prefix, mode, channel):
        subprods = set(item.rstrip('_').rsplit('-', maxsplit=1)[-1] for item in
                       self.common_prefixes(prefix + '-', '_'))
        if len(subprods) > 1:
            if modes := set(item[1] for item in subprods):
                if len(modes) == 1:
                    mode = next(iter(modes))
                if str(mode) in modes:
                    prefix += f'-M{mode}'
                else:
                    raise ValueError(
                        f'Need to specify a valid operating mode. Available options are '
                        f'{", ".join(sorted(modes))}')
            if channels := set(item[-2:] for item in subprods):
                if len(channels) == 1:
                    channel = next(iter(channels))
                if str(channel) in channels:
                    prefix += f'C{channel}'
                elif isinstance(channel, int) and f'{channel:02d}' in channels:
                    prefix += f'C{channel:02d}'
                else:
                    raise ValueError(
                        f'Need to specify a valid channel. Available options are '
                        f'{", ".join(sorted(channels))}')
        return prefix

    def dt_from_key(self, key):
        start_time = key.split('_')[-3]
        return datetime.strptime(start_time[:-1], 's%Y%j%H%M%S')

    def get_product(self, product, dt, mode=None, channel=None):

I think the kwarg should be "band" instead of "channel". To my knowledge, since the 90s the word "channel" has increasingly been replaced by the synonymous "band"; e.g. AVHRR has "channels", but MODIS and VIIRS have "bands". "Channel" is still sometimes used in official documents today, mostly when discussing the hardware side of the instruments. A search of the 726-page GOES-R SERIES PRODUCT DEFINITION AND USERS’ GUIDE yields 25 results for "channel" but several hundred for "band". Oddly, though, the GOES-ABI L2 filenames use C13, with "C" for channel.

Contributor

Glad I'm not the only one scratching my head over when to say "band" or "channel" 😂

It does seem "band" is the preferred term in the GOES NetCDF files; some examples...

band_id_C01:long_name = "ABI channel 1" ;
band_id_C01:standard_name = "sensor_band_identifier" ;

band_wavelength_C01 = 0.47
band_wavelength_C01:long_name = "ABI band 1 central wavelength" ;

Member Author

Yeah, the "C" prefix for band/channel is the only inconsistency from the GOES side with regard to what we call it.

I'm happy to just use "band", but I think that's why "channel" comes to my mind first.
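
If both spellings were to be accepted during a transition, one option is to normalize them up front. This is purely a sketch of that idea: `normalize_band` is a hypothetical helper, not something in this PR, and the accepted input forms are assumptions.

```python
def normalize_band(band=None, channel=None):
    """Return a two-digit band string, accepting either keyword (hypothetical helper)."""
    if band is not None and channel is not None:
        raise TypeError("Specify only one of 'band' or 'channel'")
    value = band if band is not None else channel
    if value is None:
        return None
    # Accept 2, '2', '02', or 'C02' and normalize to '02'
    text = str(value).upper().lstrip('C')
    return f'{int(text):02d}'


from_int = normalize_band(band=2)
from_label = normalize_band(channel='C13')
unset = normalize_band()
```

The two-digit form matches what appears after the "C" in the file names, so the result could be compared directly against the available sub-products.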

        prefix = self.build_key(product, dt)
        prefix = self._subprod_prefix(prefix, mode, channel)
        min_obj = min(self.objects(prefix),
                      key=lambda o: abs((self.dt_from_key(o.key) - dt).total_seconds()))

        return self._build_result(min_obj)

    def _build_result(self, obj):
        return Product(obj, lambda s: xr.open_dataset(s.url + '#mode=bytes', engine='netcdf4'))
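
To illustrate how `GOESArchive.dt_from_key` reads the scan start time, here is the same parsing applied to a made-up key that follows the GOES-R file naming pattern. The key is an assumption for illustration and was not taken from the bucket.

```python
from datetime import datetime


def dt_from_key(key):
    # The third underscore-separated field from the end is the scan start time
    start_time = key.split('_')[-3]
    # Drop the trailing tenths-of-a-second digit before parsing
    return datetime.strptime(start_time[:-1], 's%Y%j%H%M%S')


# Hypothetical key: product/year/day-of-year/hour/file name
key = ('ABI-L1b-RadC/2023/123/12/'
       'OR_ABI-L1b-RadC-M6C02_G16_s20231231201174_e20231231203547_c20231231203580.nc')

scan_start = dt_from_key(key)
day_of_year = f'{scan_start:%j}'
```

The `%j` directive handles the day-of-year field, which is also how `build_key` forms the third component of the prefix.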