For Products

FAIR Data Best Practices EarthCODE

The final data outputs of your research in EarthCODE are referred to as Products (i.e. data products). A product in EarthCODE typically includes:

  • A dataset representing the measured or derived values of one or more environmental or geoscience variables, that is, your research.
  • Documentation that describes the methodology or related publications.
  • Rich STAC Metadata, including EO satellite mission(s), project affiliation, and classification tags.

Each product is described using STAC (SpatioTemporal Asset Catalog) metadata, specifically through a Collection that captures key attributes like the spatial and temporal extent, scientific context, provenance, and more. To ensure FAIRness, the catalog uses a shared dictionary and metadata standard. This structure enables users to explore products across diverse sources by theme, variable, and mission.

The data itself is typically uploaded and stored on the ESA Project Results Repository (PRR), or alternatively on external long-term storage repositories. The PRR is ESA’s dedicated long-term storage service for project results. For detailed instructions about the PRR, refer to the ESA Project Results Repository (PRR) section.

Data Products

Variables
  • Product: A geoscience dataset that captures specific variables over a spatial and temporal extent. Products are distinguished by factors such as the processing method, validation status, and EO mission used.
  • Project: The ESA-funded research project under which the product was generated.
  • Variables: Scientific or environmental variables measured or estimated in the dataset.
  • Themes: Top-level science topics from ESA’s strategic challenges (e.g. climate, biodiversity, atmosphere).
  • Keywords: Hierarchical tags for product discovery, often derived from the variables and broader scientific terms.
  • EO Mission: The satellite mission or sensor used to generate the data, referenced in the product’s metadata.
  • Documentation: A link to related materials or publications explaining how the product was created.
  • Data link: A pointer to the actual location of the dataset (e.g. PRR, institutional archive, or external repository).

FAIR Checklist

Use this checklist to prepare your Product for publication in EarthCODE.

FAIR EarthCODE Products Standards

As discussed in FAIR and Open Science Best Practices, standards need to be specified in detail to describe what constitutes "FAIR" for each community. EarthCODE uses commonly adopted standards for EO and Earth Sciences. For Products, we define them as follows:

Findable (F)

F1: Use globally unique, persistent IDs (DOIs) for datasets. Example: 10.57780/s3d-83ad619. These may be assigned automatically by EarthCODE or brought by the user.

F2: Metadata follows the STAC specification and is richly described using the OSC extension. Example: EarthCODE OSC STAC Item.

F3: Metadata explicitly includes dataset identifiers under a `via` link titled `Access`. Example: STAC links.Access.

F4: Metadata are indexed and searchable in the ESA Open Science Catalog.

Accessible (A)

A1: Metadata are accessible over HTTPS via STAC API, OGC CSW (2.0.2/3.0.0), OpenSearch, OAI-PMH, or SRU.

A1.1: Protocols are open, free, and universally implementable.

A1.2: The protocols allow for authentication and authorization (e.g., OpenID Connect) where needed.

A2: Metadata remain accessible even if the data are deleted, as an EarthCODE policy.

Interoperable (I)

I1: Use a formal, accessible representation language. EarthCODE uses JSON in STAC.

I2: EarthCODE adopts controlled FAIR vocabularies for (meta)data. Variables defined in the variables catalog serve as the canonical hierarchy, which aligns terms and links to CF Standard Names, GCMD Keywords, and others, encoded via the STAC OSC extension.

I3: EarthCODE items add qualified links to related projects, the experiments that created the data, documentation, relevant themes, and others via the STAC OSC extension, and to related datasets via the STAC processing extension.

Reusable (R)

R1: Provide rich, domain-appropriate descriptions of the variables modelled, themes described, spatial and temporal extent, and related missions and instrumentation.

R1.1: EarthCODE products are published with clear, standardized licenses (for example, CC BY 4.0).

R1.2: Record the provenance of processing via workflow and experiment links and via STAC/OGC API - Records links.

R1.3: EarthCODE aligns with community standards for Earth Science and EO, adopts STAC, and prefers cloud-native formats (e.g., Zarr, COG, GeoParquet) alongside widely used Earth-science formats (e.g., netCDF).

EarthCODE FAIR Product Example

For example, the Sentinel-3 AMPLI Ice Sheet Elevation product is published as a STAC Collection (F2, I1) enriched with EarthCODE taxonomy elements including Themes, Variables, and EO Missions (R1). Metadata contains a persistent DOI (https://doi.org/10.57780/s3d-83ad619) (F1) assigned by EarthCODE and explicit via links for direct dataset access and supporting documentation (F3) under "Access" and "Documentation". The actual data is stored on the ESA Project Results Repository (PRR), ensuring long-term preservation and stable access.

The collection is indexed in the ESA Open Science Catalog, making it discoverable through standard search and API queries (F4, A1, A1.1). Links to the generating project, theme, and mission provide qualified relationships to related resources (I3), and variables are aligned with the OSC Variables Catalog and CF Standard Names for semantic interoperability (I2). The licence is clearly stated as CC-BY-4.0 (R1.1) and the dataset is in a well documented NetCDF format that adheres to widely used community standards (R1.3), ensuring that it is fully Findable, Accessible, Interoperable, and Reusable within the Earth observation community.

STAC Example
json
{
    "type": "Collection",
    "id": "sentinel3-ampli-ice-sheet-elevation",
    "stac_version": "1.0.0",
    "description": "Ice sheet elevation estimated along the Sentinel-3 satellite track, as retrieved with the Altimeter data Modelling and Processing for Land Ice (AMPLI). The products cover Antarctica and Greenland.",
    "links": [
      {
        "rel": "root",
        "href": "../../catalog.json",
        "type": "application/json",
        "title": "Open Science Catalog"
      },
      {
        "rel": "via",
        "href": "https://eoresults.esa.int/browser/#/external/eoresults.esa.int/stac/collections/sentinel3-ampli-ice-sheet-elevation",
        "title": "Access"
      },
      {
        "rel": "via",
        "href": "https://eoresults.esa.int/d/sentinel3-ampli-ice-sheet-elevation/2025/05/07/sentinel-3-ampli-user-handbook/S3_AMPLI_User_Handbook.pdf",
        "title": "Documentation"
      },
      {
        "rel": "child",
        "href": "https://eoresults.esa.int/stac/collections/sentinel3-ampli-ice-sheet-elevation",
        "type": "application/json",
        "title": "Sentinel-3 AMPLI Ice Sheet Elevation"
      },
      {
        "rel": "parent",
        "href": "../catalog.json",
        "type": "application/json",
        "title": "Products"
      },
      {
        "rel": "related",
        "href": "../../projects/sral-processing-landice/collection.json",
        "type": "application/json",
        "title": "Project: SRAL Processing over Land Ice"
      },
      {
        "rel": "related",
        "href": "../../themes/cryosphere/catalog.json",
        "type": "application/json",
        "title": "Theme: Cryosphere"
      },
      {
        "rel": "related",
        "href": "../../variables/ice-sheet-topography/catalog.json",
        "type": "application/json",
        "title": "Variable: Ice sheet topography"
      },
      {
        "rel": "related",
        "href": "../../eo-missions/sentinel-3/catalog.json",
        "type": "application/json",
        "title": "EO Mission: Sentinel-3"
      },
      {
        "rel": "self",
        "href": "https://esa-earthcode.github.io/open-science-catalog-metadata/products/sentinel3-ampli-ice-sheet-elevation/collection.json",
        "type": "application/json"
      }
    ],
    "stac_extensions": [
      "https://stac-extensions.github.io/osc/v1.0.0/schema.json",
      "https://stac-extensions.github.io/themes/v1.0.0/schema.json",
      "https://stac-extensions.github.io/cf/v0.2.0/schema.json"
    ],
    "osc:project": "sral-processing-landice",
    "osc:status": "completed",
    "osc:region": "Antarctica and Greenland",
    "osc:type": "product",
    "created": "2025-04-04T00:00:00Z",
    "version": "1",
    "sci:doi": "https://doi.org/10.57780/s3d-83ad619",
    "cf:parameter": [
      {
        "name": "ice_sheet_topography"
      }
    ],
    "themes": [
      {
        "scheme": "https://github.com/stac-extensions/osc#theme",
        "concepts": [
          {
            "id": "cryosphere"
          }
        ]
      }
    ],
    "osc:variables": [
      "ice-sheet-topography"
    ],
    "osc:missions": [
        "sentinel-3"
    ],
    "updated": "2025-05-07T20:32:22.960110Z",
    "title": "Sentinel-3 AMPLI Ice Sheet Elevation",
    "extent": {
      "spatial": {
        "bbox": [
          [
            -180.0,
            -90.0,
            180.0,
            90.0
          ]
        ]
      },
      "temporal": {
        "interval": [
          [
            "2016-06-01T00:00:00Z",
            "2024-05-09T23:59:59Z"
          ]
        ]
      }
    },
    "license": "CC-BY-4.0",
    "keywords": [
        "Topography",
        "Glaciers/Ice Sheets",
        "Glacier Elevation/Ice Sheet Elevation",
        "Cryospheric Indicators",
        "Glacial Measurements"
    ]
  }
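Some of the checklist items can be verified mechanically before a Collection is submitted. The following is a minimal sketch, not part of any EarthCODE tooling: the function name `check_product_collection` and the DOI pattern are assumptions made for illustration, and the checks cover only F1, F3, I2, R1 and R1.1 as they appear in the example above.

```python
import re

# Hypothetical pattern: accepts a bare DOI name or the https://doi.org/ URL form.
DOI_PATTERN = re.compile(r"^(https://doi\.org/)?10\.\d{4,9}/\S+$")


def check_product_collection(collection: dict) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # F1: a persistent identifier is recorded in sci:doi.
    if not DOI_PATTERN.match(collection.get("sci:doi", "")):
        problems.append("F1: missing or malformed sci:doi")

    # F3: a `via` link titled "Access" points at the data.
    links = collection.get("links", [])
    if not any(l.get("rel") == "via" and l.get("title") == "Access" for l in links):
        problems.append("F3: no via link titled 'Access'")

    # I2: at least one variable from the variables catalog is referenced.
    if not collection.get("osc:variables"):
        problems.append("I2: osc:variables is empty")

    # R1: both spatial and temporal extent are described.
    extent = collection.get("extent", {})
    if "spatial" not in extent or "temporal" not in extent:
        problems.append("R1: extent needs spatial and temporal parts")

    # R1.1: a license is declared.
    if not collection.get("license"):
        problems.append("R1.1: no license declared")

    return problems


# Excerpt of the Sentinel-3 AMPLI collection shown above.
example = {
    "sci:doi": "https://doi.org/10.57780/s3d-83ad619",
    "links": [
        {
            "rel": "via",
            "href": "https://eoresults.esa.int/browser/#/external/eoresults.esa.int/stac/collections/sentinel3-ampli-ice-sheet-elevation",
            "title": "Access",
        }
    ],
    "osc:variables": ["ice-sheet-topography"],
    "extent": {
        "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
        "temporal": {"interval": [["2016-06-01T00:00:00Z", "2024-05-09T23:59:59Z"]]},
    },
    "license": "CC-BY-4.0",
}

print(check_product_collection(example))
```

A check like this complements schema validation against the STAC extensions rather than replacing it, since it only tests for the presence of fields.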

Open Data & Licensing

Open data is data that anyone can access, use, modify, and share (including commercially), provided it is published in a common, machine-readable format with an open license. Restrictions typically reduce the reuse value and the potential for innovation. Consider the following when making your data open:

  • License is mandatory. Without an explicit license, reuse rights are ambiguous.
  • Costs: Use is free; access may carry minimal delivery/hosting fees. If you use EarthCODE's PRR, ESA will host and distribute your data free of charge and store it for the long term.
  • Usability: Openness is about rights; still provide standard formats and rich metadata to maximize reuse.
  • FAIR ≠ Open: FAIR can include controlled access — just document access conditions transparently.

A licence is the permission slip for reuse. In Open Science, no licence means no clear rights: even publicly available materials can’t be reused without explicit permission. For scientific works, the Creative Commons licences are the most widely used and best known.

Upstream constraints matter

Always check upstream licences before you reuse. If you used proprietary inputs or “share-alike” sources, you must honour their terms.

EarthCODE data products are typically published under Creative Commons (CC) licences. The most common licences are:

| Licence | Description & Typical Use in EarthCODE | Notes |
|---|---|---|
| CC BY 4.0 | Default for most EarthCODE datasets, documentation, and metadata. Allows any reuse (including commercial) with attribution to the creator. | Simple, funder-friendly, and interoperable. Encourages reuse while ensuring credit. |
| CC0 1.0 | Public-domain dedication: no restrictions on reuse. Often used for foundational or reference datasets where attribution is not essential. | Maximises reuse; some funders require attribution, in which case CC BY 4.0 is preferred. |
| ODC-By 1.0 | Attribution licence for databases where sui generis database rights apply (especially in EU/UK contexts). | Similar to CC BY but tailored for database rights; use when the dataset qualifies as a protected database. |
| ODbL 1.0 | Share-alike database licence ensuring that derivatives of the database remain open under the same terms. | Used in collaborative database projects (e.g., OpenStreetMap). Can complicate integration with differently licensed data. |
| PDDL 1.0 | Public-domain dedication for databases, equivalent to CC0 but for database rights. | Best for maximising reuse of databases; attribution is not required. |
| CC BY-SA | Reuse with attribution; adaptations must be shared under the same terms (Share Alike). Often used for collaborative community datasets. | Ensures openness of derivatives but can complicate integration with datasets under different licences. |
| CC BY-NC | Reuse with attribution; non-commercial use only (Non-Commercial). Rarely used in EarthCODE. | Not Open Data under most definitions; restricts commercial reuse and some scientific applications. |
| CC BY-ND | Reuse with attribution; no derivatives (No Derivatives). Sometimes used for fixed-format outputs (e.g., final PDFs). | Not Open Data; blocks translations, subsets, and corrections. |

In practice, for EarthCODE datasets, prefer CC BY 4.0 (the default), or CC0 1.0 if maximal reuse is desired. Avoid NC and ND clauses on datasets, as they significantly limit reuse.

In STAC you can declare the license of the data by including a short code and a resolvable URL:

json
{
  "license": "CC-BY-4.0",
  "links": [
    { "rel": "license", "href": "https://creativecommons.org/licenses/by/4.0/" }
  ]
}
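When metadata is generated by script, the short code and the URL can be kept in sync with a small helper. This is an illustrative sketch only: `LICENSE_URLS` and `apply_license` are hypothetical names, and the lookup table covers just the two licences recommended above.

```python
# Hypothetical lookup table; extend it with the licences you actually use.
LICENSE_URLS = {
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
}


def apply_license(collection: dict, license_id: str) -> dict:
    """Return a copy of `collection` with the license code and a matching link."""
    if license_id not in LICENSE_URLS:
        raise ValueError(f"no URL known for license {license_id!r}")

    # Drop any previous license link so the code and the URL cannot disagree.
    links = [l for l in collection.get("links", []) if l.get("rel") != "license"]
    links.append({"rel": "license", "href": LICENSE_URLS[license_id]})

    return {**collection, "license": license_id, "links": links}


updated = apply_license({"id": "my-product", "links": []}, "CC-BY-4.0")
print(updated["license"], updated["links"])
```

Raising an error for unknown codes is a deliberate choice here: it is safer than silently publishing a license code without a resolvable URL.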

DOI Assignment

In Development

A Digital Object Identifier (DOI) provides a persistent, citable link to your EarthCODE Product, ensuring it can be reliably referenced in publications, metadata catalogs, and other research outputs. DOIs are a core part of making data Findable under the FAIR principles.

EarthCODE can mint and assign a DOI to your Product for free as part of the publishing process. This DOI will be recorded in your STAC Collection or STAC Item metadata and made visible in the ESA Open Science Catalog. To get one, request a DOI in the pull request description or by email to earth-code@esa.int. When your product is published, the DOI will be minted and added to your metadata automatically.

If your dataset already has a DOI (e.g., from another repository), you can include it in your STAC metadata using the Scientific Citation Extension:

json
"sci:doi": "https://doi.org/10.57780/s3d-83ad619"

See full example: https://doi.org/10.57780/s3d-83ad619

This will preserve the original DOI and make it searchable in the EarthCODE ecosystem.
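DOIs brought from other repositories arrive in several spellings (bare name, `doi:` prefix, resolver URL). A minimal sketch for normalising them before writing metadata follows; `normalise_doi` is a hypothetical helper and the pattern is a common heuristic rather than a complete DOI grammar. It returns both the bare name and the URL form used in the example above, so you can pick whichever form your metadata convention expects.

```python
import re

# Heuristic pattern for a DOI name: "10.", a registrant code, "/", and a suffix.
_DOI_NAME = re.compile(r"10\.\d{4,9}/\S+")


def normalise_doi(value: str) -> dict:
    """Extract the DOI name from `value` and return it with its resolver URL."""
    match = _DOI_NAME.search(value.strip())
    if match is None:
        raise ValueError(f"no DOI found in {value!r}")
    name = match.group(0)
    return {"name": name, "url": f"https://doi.org/{name}"}


for raw in ("10.57780/s3d-83ad619",
            "doi:10.57780/s3d-83ad619",
            "https://doi.org/10.57780/s3d-83ad619"):
    print(normalise_doi(raw)["url"])
```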

Interoperability

EarthCODE prioritises cloud-native geospatial formats so data can be streamed over HTTP/object storage. Wherever possible, publish Data Cubes (n-D arrays with explicit chunking) rather than directories of files. If you already have many NetCDF/GeoTIFFs, consolidate to a single cube (e.g., Zarr) or provide kerchunk references to avoid file sprawl. Always set CRS, nodata, units, and variable semantics. See the Code and Data Quality guide for more details.

| Data type | Preferred format(s) & notes |
|---|---|
| Raster scenes & mosaics | COG (GeoTIFF) with internal tiling and overviews for fast partial reads; lossless compression by default. |
| n-D data cubes (time/lat/lon/level) | Zarr with sensible chunking and consolidated metadata; if legacy NetCDF4/HDF5, add kerchunk references. |
| Vector analytics | GeoParquet (columnar, scalable); include a dataset _metadata file for multi-shard collections. |
| Vector delivery/streaming | FlatGeobuf with spatial index; suitable for HTTP streaming and subsetting. |
| Point clouds | COPC for efficient partial reads. |
| Web visualisation tiles | PMTiles (single archive, serverless); for visual delivery, not analysis. |
| Tabular in-situ/model outputs | Parquet (preferred) or CSV for small datasets; define schema, units, and time zones. |
| Documentation | PDF/Markdown linked from STAC describedby; keep methods and citation text versioned. |

Minimum best practices: avoid thousands of small files; align chunk/tiling to dominant access patterns.
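The effect of chunk alignment can be estimated with simple arithmetic before any data is written. The sketch below is illustrative only: the cube dimensions, the two chunk layouts, and the function names are assumptions, and the chunk count is an upper-bound estimate for a request that starts on a chunk boundary.

```python
from math import ceil, prod


def chunk_size_bytes(chunk_shape: tuple, itemsize: int) -> int:
    """Uncompressed size of one chunk in bytes."""
    return prod(chunk_shape) * itemsize


def chunks_touched(chunk_shape: tuple, request_shape: tuple) -> int:
    """Number of chunks read for a chunk-aligned request of `request_shape`."""
    return prod(ceil(r / c) for r, c in zip(request_shape, chunk_shape))


# Hypothetical float32 cube with dimensions (time, lat, lon) = (3650, 1800, 3600).
map_friendly = (1, 1800, 3600)      # one full map per chunk
series_friendly = (365, 100, 100)   # long time runs over small tiles

pixel_time_series = (3650, 1, 1)    # every time step at one pixel
single_map = (1, 1800, 3600)        # one time step over the whole grid

for name, layout in (("map-friendly", map_friendly),
                     ("series-friendly", series_friendly)):
    print(name,
          chunk_size_bytes(layout, itemsize=4),
          chunks_touched(layout, pixel_time_series),
          chunks_touched(layout, single_map))
```

Under these assumptions, the map-friendly layout reads 3650 chunks for a pixel time series but only 1 for a full map, while the series-friendly layout reads 10 and 648 respectively, which is why the layout should follow the dominant access pattern.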

Choosing the Right Variable Name

Always first search for and re-use existing variables from the ESA Open Science Catalog Variables list before creating a new one. This ensures semantic consistency across datasets and improves discoverability in the EarthCODE ecosystem.

If no existing variable fits your data, propose a new one via the OSC GitHub repository by opening a pull request.

Each Product should also be mapped to an appropriate CF Standard Name to ensure interoperability. Use the CF Standard Name Table search tool to identify the correct term.
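The pairing of an OSC variable with a CF Standard Name can be assembled programmatically. This is a minimal sketch under stated assumptions: `variable_fields` is a hypothetical helper, the output mirrors the `osc:variables` and `cf:parameter` fields of the Sentinel-3 AMPLI collection shown earlier, and the two patterns only check spelling conventions (hyphenated lower case for OSC identifiers, underscored lower case for CF names). They do not confirm that the term exists in either catalogue.

```python
import re

# Spelling heuristics only; neither pattern looks the term up in a catalogue.
_OSC_ID = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
_CF_NAME = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def variable_fields(osc_variable_id: str, cf_standard_name: str) -> dict:
    """Build the variable-related STAC fields for one variable."""
    if not _OSC_ID.match(osc_variable_id):
        raise ValueError(f"unexpected OSC variable id: {osc_variable_id!r}")
    if not _CF_NAME.match(cf_standard_name):
        raise ValueError(f"unexpected CF standard name: {cf_standard_name!r}")
    return {
        "osc:variables": [osc_variable_id],
        "cf:parameter": [{"name": cf_standard_name}],
    }


print(variable_fields("ice-sheet-topography", "ice_sheet_topography"))
```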

Example in STAC metadata, using the same list-of-objects form as the Sentinel-3 AMPLI collection above:

json
"cf:parameter": [
  { "name": "sea_surface_temperature" }
]

Storage Repositories

EarthCODE Products must be stored in trusted, long-term repositories that provide stable, resolvable links to the actual data. The primary repository is the ESA Project Results Repository (PRR), which ensures curation, preservation, and integration with the ESA Open Science Catalog.

Alternatively, you can store your data on other accepted repository domains. The accepted repositories are:

  • PRR (preferred)
  • Zenodo
  • Figshare
  • Other recognised, domain-specific repositories that meet open data requirements

Datasets must be accessible, and all provided links must be functional. For open data, links must allow programmatic access without registration or authentication tokens. If your dataset is proprietary or access-controlled, document the restrictions in metadata.

Accepted link types:

  1. Metadata records — links to repository landing pages (e.g., Zenodo, PRR, Figshare)
  2. Direct file links — e.g., NetCDF, GeoTIFF, Zarr store root
  3. Direct service links — e.g., S3 object storage paths, HTTPS download URLs, FTP servers

Linking rules

  • DO NOT link only to a website where the user must manually navigate to download the data.
  • DO link directly to the dataset or service endpoint so it can be retrieved programmatically.

Providing direct, machine-accessible links ensures that OSC Items are FAIR and that open data can be indexed, validated, and reused automatically.
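A rough first pass over candidate links can be automated. The sketch below is a heuristic under stated assumptions, not an EarthCODE validation rule: the scheme list follows the link types named above, while the suffix list, the function name `classify_link`, and the example URLs are hypothetical. It cannot tell a repository landing page from a manual-navigation website, so anything it does not recognise is flagged for human review.

```python
from urllib.parse import urlparse

# Schemes taken from the accepted link types above (HTTPS, S3, FTP).
ALLOWED_SCHEMES = {"https", "s3", "ftp"}
# Hypothetical list of suffixes that suggest a direct file or store root.
DATA_SUFFIXES = (".nc", ".tif", ".tiff", ".zarr", ".parquet")


def classify_link(href: str) -> str:
    """Classify a link as a direct file link, unsupported, or needing review."""
    parsed = urlparse(href)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return "unsupported scheme"
    # Ignore a trailing slash so a Zarr store root is still recognised.
    path = parsed.path.rstrip("/").lower()
    if path.endswith(DATA_SUFFIXES):
        return "direct file link"
    return "needs manual review"


for href in ("https://example.org/data/elevation_2020.nc",
             "s3://example-bucket/cube.zarr/",
             "https://example.org/project/downloads"):
    print(href, "->", classify_link(href))
```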

ESA – European Space Agency © 2020-2025