Skip to article frontmatterSkip to article content

ESA Project Results Repository (PRR) Data Access and Collections Preview

This notebook supports users of EarthCODE and APEX who would like to exploit available products and project results stored in the ESA Project Results Repository (PRR). The PRR provides access to data, workflows, experiments and documentation from ESA EOP-S Projects, organised across Collections and accessible via OGC Records and the STAC API.

Each collection contains STAC Items, with their related assets stored within the PRR storage.

Scientists/commercial companies can access the PRR via the EarthCODE and APEx projects.

Use the following notebook cells to preview the content of the ESA PRR and request the download of selected products.

Load libraries and set up the logging level

import os
import logging
import pprint
import shutil
# urljoin resolves repository-relative asset hrefs; urlretrieve downloads them.
from urllib.parse import urljoin
from urllib.request import urlretrieve

#Make sure you have installed pystac_client before running this
from pystac_client import Client

# set pystac_client logger to DEBUG to see API calls
logging.basicConfig()
logger = logging.getLogger("pystac_client")
logger.setLevel(logging.DEBUG)

Connect to ESA PRR Catalog and display the list of collections available

# URL of the STAC Catalog to query
catalog_url = "https://eoresults.esa.int/stac"

# Custom HTTP headers sent with every API request.
# pystac_client expects a mapping of header name -> value, so this must be a
# dict (the original empty list only worked because it was empty).
headers = {}

cat = Client.open(catalog_url, headers=headers)
cat  # display the basic information about the PRR Catalog in STAC format

Use the cell below to access entire list of collections available in ESA PRR.

# List every collection currently published in the ESA PRR.
collection_search = cat.collection_search(limit=150)
print(f"Total number of collections found in ESA PRR is {collection_search.matched()}")
# Print each collection id; these ids are used below to open a specific collection.
for record in collection_search.collections_as_dicts():
    print(record.get("id", "Unnamed Collection"))


Alternatively, you can display the metadata of all STAC Collections available


# Or they can be displayed with their full metadata
collection_search = cat.collection_search(
    datetime='2023-04-02T00:00:00Z/2024-08-10T23:59:59Z',  # additional filter: only collections overlapping this date range
    limit=10
)
print(f"{collection_search.matched()} collections found")
print("PRR available Collections\n")

# Create the printer once: it is loop-invariant, so there is no reason to
# rebuild it for every collection as the original did.
pp = pprint.PrettyPrinter(depth=4)
for results in collection_search.collections_as_dicts():
    pp.pprint(results)

Open Sentinel-3 AMPLI Ice Sheet Elevation collection

To access a specific collection, we will use the collection id from the cell above. Type sentinel3-ampli-ice-sheet-elevation to connect to the selected collection and display its metadata.

# Fetch the collection object for the chosen id (taken from the listing above).
collection = cat.get_collection("sentinel3-ampli-ice-sheet-elevation")

print("PRR Sentinel-3 AMPLI Collection\n")

# A depth-limited pretty-printer keeps the nested STAC metadata readable.
pp = pprint.PrettyPrinter(depth=4)
pp.pprint(collection.to_dict())

# Evaluating the collection last lets the notebook render its rich STAC view.
collection


From the cell below, we will retrieve and explore queryable fields from a STAC API, which allows us to understand what parameters we can use for filtering our searches.

# Ask the STAC API which fields of this collection can be used as search
# filters, then pretty-print the resulting schema.
queryable = collection.get_queryables()
pprint.PrettyPrinter(depth=4).pprint(queryable)

Display STAC Items from Sentinel-3 AMPLI Ice Sheet Elevation collection

By executing the cell below you will get the ids of items that can be found in the specific collection (requested above).
First five items from the list are printed out.

# Lazy iterator over the collection's STAC Items; pages are fetched on demand.
items = collection.get_items()

# flush stdout so we can see the exact order that things happen
def get_five_items(items, count=5):
    """Print the first *count* items from an iterable, then stop.

    The iterator is only partially consumed, so calling this again on the
    same iterator continues where the previous call left off. `count`
    defaults to 5 to preserve the original behaviour.
    """
    for i, item in enumerate(items):
        print(f"{i}: {item}", flush=True)
        if i >= count - 1:
            return
        
# Pull two consecutive "pages" of five items from the same iterator;
# the second call resumes where the first one stopped.
for page_label in ("First page", "Second page"):
    print(page_label, flush=True)
    get_five_items(items)

Now execute a search with a set of parameters. In this case it returns just one item because we filter on one queryable parameter (id)

#Search for items based on spatio-temporal properties

# Area of interest: a closed polygon ring covering the whole globe.
world_aoi = {
    "type": "Polygon",
    "coordinates": [[
        [-180, -90],
        [-180, 90],
        [180, 90],
        [180, -90],
        [-180, -90],
    ]],
}

# limit controls the page size (so several pages may be fetched); the extra
# `query` clause narrows the results to a single known item id, which is one
# of the queryable parameters reported for this collection.
search = cat.search(
    max_items=7,
    limit=5,
    collections="sentinel3-ampli-ice-sheet-elevation",
    intersects=world_aoi,
    query={"id": {"eq": "sentinel-3a-antarctica-cycle107"}},
    datetime="2023-04-02T00:00:00Z/2024-08-10T23:59:59Z",
)

# Materialise the lazy search results so we can count and display them.
items = list(search.items())
print(len(items))

pp = pprint.PrettyPrinter(depth=4)
pp.pprint([i.to_dict() for i in items])


If you do not know the item id, search through available satellite instrument name, region, number of the cycle and the datetime range of the products of interest.

You can specify them by filtering based on following possible values:

  • missions: 3a or 3b
  • regions: antarctica or greenland
  • cycle range: for sentinel-3a possible cycle range is from 005 to 112; while sentinel-3b has range from 011-093
  • datetime: specify the time frame of the products from the range between: 2016-06-01 00:00:00 UTC – 2024-05-09 00:00:00 UTC
# Configuration for the id-based filtering below: mission, region and a range
# of cycle numbers, all matched as substrings of each item id.
cycle_range = ["%03d" % cycle for cycle in range(90, 111)]  # zero-padded "090".."110"; sentinel-3a supports 005-111, sentinel-3b 011-092
missions = ["3b"]  # mission identifier(s): "3a" or "3b"
regions = ["antarctica"]  # region(s): "antarctica" or "greenland"

# Area of interest: the entire globe as a single closed polygon ring.
geom = {
    "type": "Polygon",
    "coordinates": [[
        [-180, -90],
        [-180, 90],
        [180, 90],
        [180, -90],
        [-180, -90],
    ]],
}

# limit sets the page size, so multiple pages may be fetched; the datetime
# argument restricts results to the given start/end window.
search = cat.search(
    max_items=7,
    limit=5,
    collections="sentinel3-ampli-ice-sheet-elevation",
    intersects=geom,
    datetime="2021-04-02T00:00:00Z/2024-08-10T23:59:59Z",
)

# Materialise the results, then report how many items the window matched.
items = list(search.items())
print(f"Number of items found: {len(items)}")
print(items)

def _matches(item_id):
    """True when *item_id* contains a configured mission, region and cycle token."""
    return (
        any(m in item_id for m in missions)
        and any(r in item_id for r in regions)
        and any(f"cycle{c}" in item_id for c in cycle_range)
    )

# Keep only the items whose lowercased id satisfies all three criteria.
# (The id is lowercased once per item instead of once per criterion.)
filtered = [item for item in items if _matches(item.id.lower())]

## Print number of filtered items
print(f"Number of filtered items: {len(filtered)}")
# Number the entries from 1 — the original started at 2, mislabelling them all.
for i, item in enumerate(filtered, 1):
    print(f"{i}. {item.id} @ {item.datetime}")

Download all assets from the selected item

Based on the selection done in the previous cell, download the products to the downloads folder in your workspace

base_url = "https://eoresults.esa.int"

# Index (into `items`) of the product whose assets should be downloaded.
item_to_be_downloaded = 3
target = items[item_to_be_downloaded]

# All assets are written to downloads/<item-id>/ inside the workspace.
output_dir = f"downloads/{target.id}"
os.makedirs(output_dir, exist_ok=True)

assets_total = len(target.assets)
for assets_current, asset in enumerate(target.assets.values(), 1):
    filename = os.path.basename(asset.href)
    full_href = urljoin(base_url, asset.href)  # asset hrefs are repository-relative
    local_path = os.path.join(output_dir, filename)
    # Report the real file name (the original printed a literal "(unknown)").
    print(f"[{assets_current}/{assets_total}] Downloading {filename}...")
    try:
        urlretrieve(full_href, local_path)
    except Exception as e:
        # Best-effort: report the failure and continue with the remaining assets.
        print(f"Failed to download {full_href}. {e}")

Download filtered items

Based on the selection done in the previous cell, download the products to the downloads folder in your workspace. You will download here the items which result from further filtering options (by mission type, cycle number, region etc.)

# Defined up front: the original only (re)assigned it after its first use,
# relying on the earlier cell having run.
base_url = "https://eoresults.esa.int"


def _download_item_assets(item, output_dir, label=""):
    """Download every asset of *item* into *output_dir*, reporting progress.

    Failures are printed and skipped so one bad asset does not abort the rest.
    *label* is prepended to each progress line.
    """
    os.makedirs(output_dir, exist_ok=True)
    assets_total = len(item.assets)
    for assets_current, asset in enumerate(item.assets.values(), 1):
        filename = os.path.basename(asset.href)
        full_href = urljoin(base_url, asset.href)  # asset hrefs are repository-relative
        local_path = os.path.join(output_dir, filename)
        print(f"{label}[{assets_current}/{assets_total}] Downloading {filename}...")
        try:
            urlretrieve(full_href, local_path)
        except Exception as e:
            print(f"Failed to download {full_href}. {e}")


# Guard against an empty filter result: the original assigned None and then
# immediately crashed on `None.id`.
target = filtered[0] if filtered else None
if target is not None:
    _download_item_assets(target, f"downloads/{target.id}")
else:
    print("No filtered items to download.")

# Also download every filtered item into its own folder under filtered/
# (numbered from 1; the original numbered from 2).
for index, item in enumerate(filtered, 1):
    _download_item_assets(item, f"filtered/{item.id}", label=f"[{index}] ")

print(f"Downloaded assets for {len(filtered)} items.")

(Optional) Read some data to ensure all items are downloaded properly

import xarray as xr
import numpy as np

# Point this at one of the files downloaded above.
example_filepath = f'./downloads/{target.id}/S3A_SR_2_TDP_LI_20240403T201315_20240403T201615_20250416T191921_0180_111_014______CNE_GRE_V001.nc'

# Open the AMPLI_processing group of the product; other readable groups
# include satellite_and_altimeter and ESA_L2_processing.
dataset = xr.open_dataset(example_filepath, group='AMPLI_processing')
elevation = dataset['elevation_radar_ampli'].values
elevation[~np.isnan(elevation)]  # show only the non-NaN elevation samples

(Optional) Create an archive of products downloaded

Create an archive of the products downloaded to your workspace and save them in .zip format to make them compressed

# Compress the most recent download folder into a sibling .zip archive.
archive_path = shutil.make_archive(output_dir, 'zip', root_dir=output_dir)
print(f"Created ZIP archive: {archive_path}")