Download data from a STAC API using GDAL and the command line
This tutorial walks through how to use the STACIT GDAL Driver to retrieve data from a STAC catalog or collection using GDAL’s command line interface (CLI). We’ll be using data from Planetary Computer’s USGS Land Change Monitoring, Assessment, and Projection (LCMAP) collection as an example.
We’re going to assume that you’ve got GDAL installed (version 3.4 or newer), and are working in a Bash-like shell (one that lets you define variables with VAR= and call them with $VAR) with gdalwarp and gdalinfo available. We’ll also use curl, head and jq once to look at the results of an API query, but you can follow the rest of the tutorial without those installed.
To download the 2021 LCMAP primary land cover classification (lcpri) for the New York State bounding box, we can use gdalwarp with the STACIT driver like so:
gdalwarp "STACIT:\"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013\":asset=lcpri" output.tif
Processing STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013":asset=lcpri [1/1] : 0Using internal nodata values (e.g. 0) for image STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013":asset=lcpri.
...10...20...30...40...50...60...70...80...90...100 - done.
That one-liner finds all the lcpri rasters for our spatiotemporal area of interest on Planetary Computer and downloads them, merging them into a single output file as it does so. Depending on how familiar you are with HTTP queries, the one-liner probably either makes perfect sense or looks like complete gibberish. The rest of this tutorial walks through the components of that one-liner so that it makes sense to everyone.
Most of that one-liner is taken up by a single URL, which we use to find out where the rasters we’re after are stored on the Planetary Computer. The base of that URL points to the Item Search API endpoint, a standardized endpoint that STAC APIs like Planetary Computer’s provide to let users search through the collections and items available in the API. Let’s put the URL for that endpoint in a variable called QUERY_URL:
QUERY_URL="https://planetarycomputer.microsoft.com/api/stac/v1/search"
echo $QUERY_URL
https://planetarycomputer.microsoft.com/api/stac/v1/search
Another long chunk of the URL in our one-liner is made up of query parameters, which we use to filter the items provided by the API down to just the data products and spatiotemporal range that we want. For instance, we’ll set the collections parameter to restrict our search to LCMAP data:
COLLECTION="usgs-lcmap-conus-v13"
QUERY_URL="$QUERY_URL?&collections=$COLLECTION"
echo $QUERY_URL
https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13
We’ll also want to filter our results to only return items that fall within our spatiotemporal area of interest. We can set the temporal range of our query using the datetime parameter, providing a start and an end separated by a slash, each formatted following RFC 3339, Section 5.6. We’ll limit our query to only return data for 2021:
DATETIME="2021-01-01/2021-12-31"
QUERY_URL="$QUERY_URL&datetime=$DATETIME"
echo $QUERY_URL
https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31
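If you expect to rerun this query for other years, the range can be derived from a single variable. The sketch below is illustrative only: YEAR and DATETIME_FULL are names introduced here and are not used elsewhere in the tutorial. The second form spells out roughly the same interval with full RFC 3339 timestamps.

```shell
# YEAR is a hypothetical helper variable, not part of the original tutorial.
YEAR=2021

# Date-only range, in the same form as the tutorial's DATETIME value.
DATETIME="${YEAR}-01-01/${YEAR}-12-31"
echo "$DATETIME"
# prints: 2021-01-01/2021-12-31

# The same year written with full RFC 3339 timestamps (UTC).
DATETIME_FULL="${YEAR}-01-01T00:00:00Z/${YEAR}-12-31T23:59:59Z"
echo "$DATETIME_FULL"
# prints: 2021-01-01T00:00:00Z/2021-12-31T23:59:59Z
```

Either value can then be appended to QUERY_URL with &datetime= exactly as shown above.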
We also need to limit the spatial range of our results, using a bounding box in the WGS 84 coordinate reference system, given as minimum longitude, minimum latitude, maximum longitude, maximum latitude:
WGS84_BBOX="-79.762,40.496,-71.856,45.013"
QUERY_URL="$QUERY_URL&bbox=$WGS84_BBOX"
echo $QUERY_URL
https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013
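Swapping the order of the bounding box values is an easy mistake to make, so it can be worth checking the string before sending it. The helper below is a minimal sketch: check_bbox is a hypothetical name introduced here, it relies on awk being available, and it deliberately rejects boxes that cross the antimeridian (where the minimum longitude can legitimately exceed the maximum).

```shell
# check_bbox is a hypothetical helper: it succeeds only if its argument looks
# like "min_lon,min_lat,max_lon,max_lat" in degrees.
check_bbox() {
  printf '%s\n' "$1" | awk -F, '
    # A bounding box needs exactly four comma-separated values.
    NF != 4 { exit 1 }
    # Each minimum must be strictly below its maximum.
    ($1 + 0) >= ($3 + 0) || ($2 + 0) >= ($4 + 0) { exit 1 }
    # Longitudes must fall in [-180, 180] and latitudes in [-90, 90].
    ($1 + 0) < -180 || ($3 + 0) > 180 { exit 1 }
    ($2 + 0) < -90 || ($4 + 0) > 90 { exit 1 }
  '
}

WGS84_BBOX="-79.762,40.496,-71.856,45.013"
if check_bbox "$WGS84_BBOX"; then
  echo "bbox looks valid"
else
  echo "bbox looks wrong"
fi
# prints: bbox looks valid
```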
This is a complete item search URL! If we visited this URL, or accessed it via curl or another utility, we’d see a feature collection listing metadata about all the available LCMAP rasters falling within our area of interest. We could optionally use curl and jq, if they’re installed, to take a peek at what this JSON document looks like:
curl -s "$QUERY_URL" | head -n 1 | jq . > query.txt
head -n 18 query.txt
{
"type": "FeatureCollection",
"features": [
{
"id": "LCMAP_CU_030007_2021_V13_CCDC",
"bbox": [
-72.93784998930636,
39.594812916149024,
-70.76420061777068,
41.227407332200016
],
"type": "Feature",
"links": [
{
"rel": "collection",
"type": "application/json",
"href": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/usgs-lcmap-conus-v13"
},
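Rather than reading the whole document, jq can pull out just the fields of interest. The sketch below lists the id of every item in a feature collection. list_item_ids is a hypothetical helper name, and SAMPLE is a hand-trimmed stub based on the output above rather than a real API response. The call is guarded so the snippet still runs where jq is not installed.

```shell
# list_item_ids is a hypothetical helper: it reads a STAC feature collection
# on standard input and prints one item id per line.
list_item_ids() {
  jq -r '.features[].id'
}

# A trimmed stand-in for the API response, containing only what the filter needs.
SAMPLE='{"type":"FeatureCollection","features":[{"id":"LCMAP_CU_030007_2021_V13_CCDC"}]}'

# Only run the demonstration if jq is available.
if command -v jq >/dev/null 2>&1; then
  printf '%s\n' "$SAMPLE" | list_item_ids
fi
```

Against the live API, the same helper could be used as curl -s "$QUERY_URL" | list_item_ids, with QUERY_URL being the plain search URL built so far.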
Now that our query URL is constructed, we need to add a few configuration options to tell GDAL that we want to download data from this query using the STACIT driver. We’ll start by prepending STACIT: to our query URL, which we’ll also wrap in quotes (using \" to make sure those quotes are preserved):
QUERY_URL="STACIT:\"$QUERY_URL\""
echo $QUERY_URL
STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013"
And last but not least, we’ll specify that we only want to download the lcpri asset from each of the items returned by our query, by appending :asset=lcpri to the end of this URL:
ASSET="lcpri"
QUERY_URL="$QUERY_URL:asset=$ASSET"
echo $QUERY_URL
STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013":asset=lcpri
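The two wrapping steps above can also be folded into one small function, which avoids juggling escaped quotes by hand. This is only a sketch: stacit_source, SEARCH_URL and SOURCE are hypothetical names introduced here, not something provided by GDAL.

```shell
# stacit_source is a hypothetical helper.
#   $1: a STAC item search URL
#   $2: the name of the asset to read from each item
# It prints the string GDAL expects: STACIT:"<url>":asset=<name>
stacit_source() {
  printf 'STACIT:"%s":asset=%s\n' "$1" "$2"
}

SEARCH_URL="https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013"
SOURCE=$(stacit_source "$SEARCH_URL" "lcpri")
echo "$SOURCE"
```

The printed value should match the QUERY_URL shown above, and can be passed to GDAL utilities in the same way.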
We’ve now constructed the URL that we used in the one-liner at the start of this tutorial! Let’s look at what all of those elements look like when combined in a single chunk:
QUERY_URL="https://planetarycomputer.microsoft.com/api/stac/v1/search"
COLLECTION="usgs-lcmap-conus-v13"
DATETIME="2021-01-01/2021-12-31"
WGS84_BBOX="-79.762,40.496,-71.856,45.013"
ASSET="lcpri"
QUERY_URL="$QUERY_URL?&collections=$COLLECTION"
QUERY_URL="$QUERY_URL&datetime=$DATETIME"
QUERY_URL="$QUERY_URL&bbox=$WGS84_BBOX"
QUERY_URL="STACIT:\"$QUERY_URL\""
QUERY_URL="$QUERY_URL:asset=$ASSET"
echo $QUERY_URL
STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013":asset=lcpri
We’re now able to use this query URL with any GDAL utility, letting us work with this remote data as if it were local. For instance, we can use gdalinfo to get information about the URLs we’d use to download the data we requested, as well as the extent, resolution, and CRS of this dataset:
gdalinfo "$QUERY_URL"
Driver: VRT/Virtual Raster
Files: /vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/030007/2021/LCMAP_CU_030007_2021_20220721_V13_CCDC/LCMAP_CU_030007_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/030006/2021/LCMAP_CU_030006_2021_20220721_V13_CCDC/LCMAP_CU_030006_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/030005/2021/LCMAP_CU_030005_2021_20220721_V13_CCDC/LCMAP_CU_030005_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/029007/2021/LCMAP_CU_029007_2021_20220721_V13_CCDC/LCMAP_CU_029007_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/029006/2021/LCMAP_CU_029006_2021_20220721_V13_CCDC/LCMAP_CU_029006_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/029005/2021/LCMAP_CU_029005_2021_20220721_V13_CCDC/LCMAP_CU_029005_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/029004/2021/LCMAP_CU_029004_2021_20220721_V13_CCDC/LCMAP_CU_029004_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/028008/2021/LCMAP_CU_028008_2021_20220721_V13_CCDC/LCMAP_CU_028008_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/028007/2021/LCMAP_CU_028007_2021_20220721_V13_CCDC/LCMAP_CU_028007_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/028006/2021/LCMAP_CU_028006_2021_20220721_V13_CCDC/LCMAP_CU_028006_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/028005/2021/LCMAP_CU_028005_2021_20220721_V13_CCDC/LCMAP_CU_028005_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/028004/2021/LCMAP_CU_028004_2021_20220721_V13_CCDC/LCMAP_CU_028004_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/027008/2021/LCMAP_CU_027008_2021_20220721_V13_CCDC/LCMAP_CU_027008_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/027007/2021/LCMAP_CU_027007_2021_20220721_V13_CCDC/LCMAP_CU_027007_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/027006/2021/LCMAP_CU_027006_2021_20220721_V13_CCDC/LCMAP_CU_027006_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/027005/2021/LCMAP_CU_027005_2021_20220721_V13_CCDC/LCMAP_CU_027005_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/027004/2021/LCMAP_CU_027004_2021_20220721_V13_CCDC/LCMAP_CU_027004_2021_20220628_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/026008/2021/LCMAP_CU_026008_2021_20220721_V13_CCDC/LCMAP_CU_026008_2021_20220629_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/026007/2021/LCMAP_CU_026007_2021_20220721_V13_CCDC/LCMAP_CU_026007_2021_20220629_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/026006/2021/LCMAP_CU_026006_2021_20220721_V13_CCDC/LCMAP_CU_026006_2021_20220629_V13_LCPRI.tif
/vsicurl?pc_url_signing=yes&pc_collection=usgs-lcmap-conus-v13&url=https%3A//landcoverdata.blob.core.windows.net/lcmap/CU/V13/025007/2021/LCMAP_CU_025007_2021_20220721_V13_CCDC/LCMAP_CU_025007_2021_20220629_V13_LCPRI.tif
Size is 30000, 25000
Coordinate System is:
PROJCRS["AEA WGS84",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["unnamed",
METHOD["Albers Equal Area",
ID["EPSG",9822]],
PARAMETER["Latitude of false origin",23,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8821]],
PARAMETER["Longitude of false origin",-96,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8822]],
PARAMETER["Latitude of 1st standard parallel",29.5,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8823]],
PARAMETER["Latitude of 2nd standard parallel",45.5,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8824]],
PARAMETER["Easting at false origin",0,
LENGTHUNIT["metre",1],
ID["EPSG",8826]],
PARAMETER["Northing at false origin",0,
LENGTHUNIT["metre",1],
ID["EPSG",8827]]],
CS[Cartesian,2],
AXIS["easting",east,
ORDER[1],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]],
AXIS["northing",north,
ORDER[2],
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]]
Data axis to CRS axis mapping: 1,2
Origin = (1184415.000000000000000,2714805.000000000000000)
Pixel Size = (30.000000000000000,-30.000000000000000)
Corner Coordinates:
Upper Left ( 1184415.000, 2714805.000) ( 80d32' 7.55"W, 46d33'11.84"N)
Lower Left ( 1184415.000, 1964805.000) ( 81d58'10.65"W, 39d54'46.32"N)
Upper Right ( 2084415.000, 2714805.000) ( 69d16'10.42"W, 44d45'58.75"N)
Lower Right ( 2084415.000, 1964805.000) ( 71d40'23.06"W, 38d18' 3.23"N)
Center ( 1634415.000, 2339805.000) ( 75d50'28.69"W, 42d29'27.07"N)
Band 1 Block=128x128 Type=Byte, ColorInterp=Palette
NoData Value=0
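The numbers in this report can be cross-checked with a little shell arithmetic, which also gives a rough idea of how much data a download involves. The values below are copied from the gdalinfo output above. The size estimate assumes a single Byte band written without compression, so treat it as a ballpark figure rather than an exact file size.

```shell
# Values copied from the gdalinfo report above.
WIDTH=30000     # pixels, from "Size is 30000, 25000"
HEIGHT=25000    # pixels
PIXEL=30        # metres, from "Pixel Size"

# Width and height of the mosaic in metres.
echo $((WIDTH * PIXEL))
# prints: 900000
echo $((HEIGHT * PIXEL))
# prints: 750000

# The same distances computed from the corner coordinates.
echo $((2084415 - 1184415))
# prints: 900000
echo $((2714805 - 1964805))
# prints: 750000

# One Byte band means one byte per pixel: approximate uncompressed size.
BYTES=$((WIDTH * HEIGHT))
echo "$BYTES"
# prints: 750000000
echo $((BYTES / 1024 / 1024))
# prints: 715
```

In other words, the mosaic spans 900 km by 750 km, and an uncompressed copy would be on the order of 715 MiB.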
Or we can use gdalwarp to download these assets and merge them into a single file:
OUTPUT_FILE="lcpri_nys.tif"
gdalwarp "$QUERY_URL" "$OUTPUT_FILE"
Processing STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013":asset=lcpri [1/1] : 0Using internal nodata values (e.g. 0) for image STACIT:"https://planetarycomputer.microsoft.com/api/stac/v1/search?&collections=usgs-lcmap-conus-v13&datetime=2021-01-01/2021-12-31&bbox=-79.762,40.496,-71.856,45.013":asset=lcpri.
...10...20...30...40...50...60...70...80...90...100 - done.
The STACIT driver knows how to follow the URLs provided by this feature collection to find and download our desired assets, and will even automatically handle authorizing our requests to the Planetary Computer (note the pc_url_signing=yes option in the /vsicurl paths that gdalinfo listed above).
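As the corner coordinates reported by gdalinfo show, the mosaic covers whole tiles and so extends beyond the bounding box we searched with. If you would rather have the output clipped to that box, gdalwarp's -te and -te_srs options can do it. The sketch below only builds and prints the extra arguments. The clipping step, the TE_ARGS name and the output filename are additions for illustration, not part of the workflow above.

```shell
WGS84_BBOX="-79.762,40.496,-71.856,45.013"

# Split the comma-separated bbox into its four values.
IFS=, read -r XMIN YMIN XMAX YMAX <<EOF
$WGS84_BBOX
EOF

# -te sets the output extent (xmin ymin xmax ymax); -te_srs states that
# those numbers are in WGS 84 rather than in the output CRS.
TE_ARGS="-te $XMIN $YMIN $XMAX $YMAX -te_srs EPSG:4326"
printf '%s\n' "$TE_ARGS"
# prints: -te -79.762 40.496 -71.856 45.013 -te_srs EPSG:4326

# With GDAL installed, the warp itself would look like the line below. It is
# left commented out because it downloads data. TE_ARGS is unquoted on
# purpose so that Bash splits it into separate arguments.
# gdalwarp $TE_ARGS -co COMPRESS=DEFLATE "$QUERY_URL" "lcpri_nys_clipped.tif"
```

The -co COMPRESS=DEFLATE creation option is optional. It asks the GeoTIFF writer to compress the output, which usually helps a lot with categorical rasters like land cover.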