SpatioTemporal Asset Catalogs (STAC)
The SpatioTemporal Asset Catalog (STAC) specification was designed to establish a standard, unified language for describing geospatial data, with the purpose of making data easier to discover and share across the web. STAC uses JSON and GeoJSON files to hold the metadata that describes geospatial assets of nearly any type, including imagery, point clouds, datacubes, and vector data.
Developing STAC has been an open, collaborative effort across the geospatial community with the goal of standardizing the way geospatial assets are described and shared across the web. Groups like NASA, the European Space Agency, USGS, Google Earth Engine, OpenTopography, Maxar, Planet, Radiant Earth, and Microsoft Planetary Computer are describing their geospatial data using the STAC specification.
Using the STAC model, geospatial assets can be stored anywhere in cloud storage (such as Amazon S3, Azure Blob Storage, Google Cloud Storage, or CyVerse). The JSON and GeoJSON files that describe and locate the assets can also be located anywhere, and can be indexed in central places such as the STAC Browser and the STAC Index. This model essentially creates one giant open catalog of geospatial data.
The Radiant Earth STAC Browser provides a graphical interface for browsing content and a map for previewing the data. Entries typically include descriptions of the data, such as the data provider, the data license, and the data format. There are also links to download or stream the actual data from its location in cloud storage.
Fig: Browsing Geospatial Data with STAC Browser
A static STAC catalog consists of the four STAC components (catalogs, collections, items, and assets) hosted as files on a web server or in cloud storage. A dynamic STAC catalog exposes the same components through a data API. The API enables programmatic spatiotemporal queries of the STAC, whereas static catalogs do not. The STAC API is a RESTful endpoint for searching STAC Items; it is specified in OpenAPI and follows OGC API - Features (formerly known as Web Feature Service 3.0).
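As a rough sketch of what a spatiotemporal query against a dynamic catalog looks like, the snippet below assembles the JSON body that a client would POST to a STAC API `/search` endpoint. The helper name `build_search_body`, the collection ID, the bounding box, and the dates are all made up for illustration, and no request is actually sent.

```python
import json


def build_search_body(collections, bbox, start, end, limit=10):
    """Assemble the JSON body for a POST to a STAC API /search endpoint."""
    return {
        "collections": list(collections),
        "bbox": list(bbox),            # [west, south, east, north] in degrees
        "datetime": f"{start}/{end}",  # closed interval of RFC 3339 timestamps
        "limit": limit,                # maximum number of items per page
    }


body = build_search_body(
    collections=["example-collection"],  # hypothetical collection ID
    bbox=(-111.1, 32.0, -110.7, 32.4),   # illustrative box near Tucson, Arizona
    start="2023-01-01T00:00:00Z",
    end="2023-12-31T23:59:59Z",
)
print(json.dumps(body, indent=2))
```

A static catalog has no equivalent of this request: a client has to follow links from file to file and filter the items itself.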
STAC consists of nested, hierarchical JSON and GeoJSON files that link to each other. A STAC is built from four components. They can be used independently of one another, but most often they are all used together:
Component | Definition | Format |
---|---|---|
catalogs | a file of links that provides a structure to organize and browse STAC Items and Collections | JSON |
collections | additional information such as the extents, license, keywords, providers, etc. that describe STAC Items | JSON |
items | core atomic unit, representing a single spatiotemporal item | GeoJSON |
assets | the actual datasets presented through STAC | GeoTIFFs, point clouds, Zarr, etc. |
A catalog is a simple, flexible JSON file with links that provides the structure to organize and browse STAC collections and items.
catalog specification
STAC Catalog Relation and Media Types
Collections include important descriptive metadata (such as extents, license, keywords, and providers) about a group of related items.
An item is a GeoJSON feature whose metadata describes one or more assets and links to the actual data hosted on the internet.
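To make the linking between components concrete, here is a minimal sketch of a catalog, a collection, and an item held as Python dictionaries that stand in for three JSON files. All IDs, file names, coordinates, and URLs are hypothetical, and the helper `asset_hrefs` is an illustrative function, not part of any STAC library. It follows `child` and `item` links from the root catalog down to the asset locations.

```python
# In-memory stand-ins for three JSON files, keyed by file name.
files = {
    "catalog.json": {
        "type": "Catalog",
        "stac_version": "1.0.0",
        "id": "example-catalog",
        "description": "A hypothetical root catalog",
        "links": [{"rel": "child", "href": "collection.json"}],
    },
    "collection.json": {
        "type": "Collection",
        "stac_version": "1.0.0",
        "id": "example-collection",
        "description": "A hypothetical group of related scenes",
        "license": "CC-BY-4.0",
        "extent": {
            "spatial": {"bbox": [[-111.1, 32.0, -110.7, 32.4]]},
            "temporal": {"interval": [["2023-06-01T00:00:00Z", None]]},
        },
        "links": [{"rel": "item", "href": "item.json"}],
    },
    "item.json": {
        "type": "Feature",  # an item is a GeoJSON Feature
        "stac_version": "1.0.0",
        "id": "example-scene",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-111.1, 32.0], [-110.7, 32.0], [-110.7, 32.4],
                [-111.1, 32.4], [-111.1, 32.0],
            ]],
        },
        "bbox": [-111.1, 32.0, -110.7, 32.4],
        "properties": {"datetime": "2023-06-01T00:00:00Z"},
        "links": [{"rel": "parent", "href": "collection.json"}],
        "assets": {
            "image": {
                "href": "https://example.com/data/example-scene.tif",
                "type": "image/tiff; application=geotiff",
            }
        },
    },
}


def asset_hrefs(files, name="catalog.json"):
    """Yield (document id, asset key, href) for every asset reachable from `name`."""
    doc = files[name]
    for key, asset in doc.get("assets", {}).items():
        yield doc["id"], key, asset["href"]
    for link in doc["links"]:
        # Only descend; "parent" and "root" links point back up the tree.
        if link["rel"] in ("child", "item"):
            yield from asset_hrefs(files, link["href"])


found = list(asset_hrefs(files))
print(found)
```

In a real static catalog each dictionary would be a separate file and the `href` values would be relative paths or URLs that need resolving before they are opened.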
You can query and interact with STAC catalogs programmatically by using Python libraries such as pystac and pystac_client.
Learn to access STAC using Python tools with the Jupyter Notebook tutorial
Generating your own STACs can be done manually, programmatically, or using a templated editor.
Create a Catalog with the Python library PySTAC.
Tutorials to read/write STAC using the Python library PySTAC
Official STAC Learning Examples
An example Python script to create a STAC can be found here.
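For the programmatic route, the following is a minimal sketch that writes a small static catalog to disk using only the Python standard library rather than PySTAC, which automates these steps in practice. The function names (`bbox_to_polygon`, `make_item`, `write_static_catalog`), the directory layout, the scene ID, and the asset URL are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def bbox_to_polygon(bbox):
    """Turn [west, south, east, north] into a closed GeoJSON polygon."""
    west, south, east, north = bbox
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


def make_item(item_id, bbox, datetime_utc, asset_href):
    """Build a minimal STAC item dictionary describing one GeoTIFF asset."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": bbox_to_polygon(bbox),
        "bbox": list(bbox),
        "properties": {"datetime": datetime_utc},
        "links": [{"rel": "parent", "href": "../catalog.json"}],
        "assets": {
            "data": {"href": asset_href, "type": "image/tiff; application=geotiff"}
        },
    }


def write_static_catalog(out_dir, items):
    """Write each item to items/<id>.json plus a catalog.json that links to them."""
    out_dir = Path(out_dir)
    (out_dir / "items").mkdir(parents=True, exist_ok=True)
    links = []
    for item in items:
        rel_path = f"items/{item['id']}.json"
        (out_dir / rel_path).write_text(json.dumps(item, indent=2))
        links.append({"rel": "item", "href": f"./{rel_path}"})
    catalog = {
        "type": "Catalog",
        "stac_version": "1.0.0",
        "id": "example-catalog",  # hypothetical catalog ID
        "description": "A hypothetical static catalog",
        "links": links,
    }
    catalog_path = out_dir / "catalog.json"
    catalog_path.write_text(json.dumps(catalog, indent=2))
    return catalog_path


items = [
    make_item(
        "scene-001",                               # hypothetical scene ID
        [-111.1, 32.0, -110.7, 32.4],              # illustrative bounding box
        "2023-06-01T00:00:00Z",
        "https://example.com/data/scene-001.tif",  # placeholder asset URL
    )
]
out_dir = tempfile.mkdtemp()
catalog_path = write_static_catalog(out_dir, items)
print(catalog_path.read_text())
```

Copying the resulting directory to a web server or a cloud storage bucket would make it a static catalog in the sense described above.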
UArizona DataLab, Data Science Institute, University of Arizona, 2024.