Contribute to Open Science Catalogue with Pull Request (GitHub)#
This notebook shows how to add a new entry (a product, project, workflow or experiment) to the Open Science Catalogue (OSC) via a GitHub Pull Request.
There are several ways to do this: through the Graphical User Interface within the EarthCODE workspace, by creating the entries manually on GitHub in a web browser, or by using platform-specific tools.
This notebook covers how publishing can be done locally, without the need for a specific platform or the installation of additional tools.
This document covers the following steps:
Forking a local copy of the OSC
Embedding new OSC entries into the Catalog
Validation
Open Pull Request to add new entries
What is next?
If you already have your OSC entries prepared, go directly to Step 1 to see how to add them to the Open Science Catalogue.
To proceed with this notebook, you need an active GitHub account. If you do not have one, please create an account to get started first.
0. Load dummy OSC entries#
In this notebook we use a set of dummy project, product, workflow and experiment entries for demonstration purposes. You can reuse the code below with real OSC entries.
NOTE If you do not have any entries ready, you can use the following test example to check whether the proposed publishing procedure works in your environment. Once you are familiar with these steps, you can generate valid OSC entries for your specific project!
Run cells 1-5 to preview the OSC dummy entries prepared only for demonstration purposes!
from earthcode.static import generate_OSC_dummy_entries
project_collection, product_collection, workflow_record, experiment_record = generate_OSC_dummy_entries()
project_collection
product_collection
workflow_record
{'id': '4datlantic-wf+123',
'type': 'Feature',
'geometry': None,
'conformsTo': ['http://www.opengis.net/spec/ogcapi-records-1/1.0/req/record-core'],
'properties': {'title': '4D-Atlantic-Workflow',
'description': 'This describes the OHC workflow',
'osc:type': 'workflow',
'osc:project': '4datlantic-ohc',
'osc:status': 'completed',
'formats': [{'name': 'netcdf64'}],
'updated': '2026-01-27T16:38:01Z',
'created': '2026-01-27T16:38:01Z',
'keywords': ['ocean', 'heat', 'çontent'],
'license': 'CC-BY-4.0',
'version': '1',
'themes': [{'scheme': 'https://github.com/stac-extensions/osc#theme',
'concepts': [{'id': 'oceans'}]}]},
'linkTemplates': [],
'links': [{'rel': 'root',
'href': '../../catalog.json',
'type': 'application/json',
'title': 'Open Science Catalog'},
{'rel': 'parent',
'href': '../catalog.json',
'type': 'application/json',
'title': 'Workflows'},
{'rel': 'related',
'href': '../../projects/4D Atlantic OHC/collection.json',
'type': 'application/json',
'title': 'Project: 4D Atlantic OHC'},
{'rel': 'git',
'href': 'https://github.com/ESA-EarthCODE/open-science-catalog-metadata',
'type': 'application/json',
'title': 'Git source repository'},
{'rel': 'related',
'href': '../../oceans/land/catalog.json',
'type': 'application/json',
'title': 'Theme: Oceans'}]}
experiment_record
{'id': '4datlantic-experiment+123',
'type': 'Feature',
'conformsTo': ['http://www.opengis.net/spec/ogcapi-records-1/1.0/req/record-core'],
'geometry': None,
'properties': {'created': '2026-01-27T16:38:01Z',
'updated': '2026-01-27T16:38:01Z',
'type': 'experiment',
'title': '4D-Atlantic-Experiment',
'description': 'This describes the OHC experiment',
'keywords': ['ocean', 'heat', 'content'],
'contacts': [{'name': 'EarthCODE Demo',
'organization': 'EarthCODE',
'links': [{'rel': 'about',
'type': 'text/html',
'href': 'https://opensciencedata.esa.int/'}],
'contactInstructions': 'Contact via EarthCODE',
'roles': ['host']}],
'themes': [{'scheme': 'https://github.com/stac-extensions/osc#theme',
'concepts': [{'id': 'oceans'}]}],
'formats': [{'name': 'GeoTIFF'}],
'license': 'CC-BY-SA-4.0',
'osc:workflow': '4datlantic-wf+123'},
'linkTemplates': [],
'links': [{'rel': 'root',
'href': '../../catalog.json',
'type': 'application/json',
'title': 'Open Science Catalog'},
{'rel': 'parent',
'href': '../catalog.json',
'type': 'application/json',
'title': 'Experiments'},
{'rel': 'related',
'href': '../../products/4d-atlantic-ohc-global+123/collection.json',
'type': 'application/json',
'title': 'Global Ocean Heat Content'},
{'rel': 'related',
'href': '../../workflows/4datlantic-wf+123/record.json',
'type': 'application/json',
'title': 'Workflow: 4D-Atlantic-Workflow'},
{'rel': 'input',
'href': 'https://github.com/deepesdl/cube-gen',
'type': 'application/yaml',
'title': 'Input parameters'},
{'rel': 'environment',
'href': 'https://github.com/deepesdl/cube-gen',
'type': 'application/yaml',
'title': 'Execution environment'},
{'rel': 'related',
'href': '../../themes/oceans/catalog.json',
'type': 'application/json',
'title': 'Theme: Oceans'}]}
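To see at a glance how a record points to the rest of the catalog, its links list can be summarised by relation type. The following is a minimal sketch that uses a shortened stand-in for the experiment record printed above; the helper name summarise_links is hypothetical and is not part of the earthcode library.

```python
from collections import defaultdict

# Shortened stand-in for the experiment record shown above.
record = {
    "id": "4datlantic-experiment+123",
    "links": [
        {"rel": "root", "href": "../../catalog.json"},
        {"rel": "parent", "href": "../catalog.json"},
        {"rel": "related", "href": "../../workflows/4datlantic-wf+123/record.json"},
        {"rel": "related", "href": "../../themes/oceans/catalog.json"},
    ],
}


def summarise_links(rec):
    """Group the link targets of a record by their 'rel' value."""
    grouped = defaultdict(list)
    for link in rec.get("links", []):
        grouped[link["rel"]].append(link["href"])
    return dict(grouped)


summary = summarise_links(record)
for rel, hrefs in summary.items():
    print(rel, hrefs)
```

The same helper should work on workflow_record, since both records are plain Python dictionaries with a links list.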
1. Setup a local Copy of the OSC#
You can add new content to the OSC via a GitHub Pull Request. To do this, you need to fork the OSC repository, embed the new information into the existing catalog, and request that your changes be merged. The steps below describe the process.
(if needed) Install git & create a GitHub account
Fork the open science catalog repository on GitHub - ESA-EarthCODE/open-science-catalog-metadata
Clone your forked repository
git clone https://github.com/your-gh-username/open-science-catalog-metadata.git
Set the current workspace to your local clone of the open science catalog metadata repository.
cd ./open-science-catalog-metadata/
Create a new branch in the local clone
git checkout -b project_branch
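Before saving entries from Python, it can help to confirm that the clone is on the new branch. The sketch below reads the .git/HEAD file directly, which in a standard clone contains a line such as "ref: refs/heads/project_branch". The function name current_branch is hypothetical, and the sketch assumes an ordinary clone (not a worktree or submodule, where .git is a file rather than a directory). The demo runs on a throwaway directory, not on a real repository.

```python
import tempfile
from pathlib import Path


def current_branch(repo_root):
    """Return the checked-out branch name, or None for a detached HEAD."""
    head = (Path(repo_root) / ".git" / "HEAD").read_text().strip()
    prefix = "ref: refs/heads/"
    return head[len(prefix):] if head.startswith(prefix) else None


# Demo: a temporary directory that mimics a clone with project_branch checked out.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    (git_dir / "HEAD").write_text("ref: refs/heads/project_branch\n")
    branch = current_branch(tmp)

print(branch)
```

On a real clone you would pass the path to your local open-science-catalog-metadata directory instead of the temporary one.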
2. Embedding your newly created entries to the local copy of the open-science-catalog-metadata repository#
All OSC entries are interlinked to enable efficient search and analysis. For example, projects have associated products, themes, missions and in turn products link back to their projects, etc. Most of these can be automatically generated using the existing information in an OSC Entry and the associated earthcode library function.
To use these functions you need a local copy of the OSC, preferably a fork, so that later, you can easily open a PR. The functions will save your newly created OSC entries and make changes to existing OSC entries, in order to conform to the required structure.
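To illustrate what interlinking means in practice, the sketch below adds reciprocal links between a project and a product held as plain dictionaries. It is not the earthcode implementation: the helper add_link, the relation type "related" used in both directions, and the href layout are assumptions made for illustration only.

```python
def add_link(entry, rel, href, title):
    """Append a link to entry['links'] unless the same rel/href pair is already there."""
    links = entry.setdefault("links", [])
    if not any(l["rel"] == rel and l["href"] == href for l in links):
        links.append(
            {"rel": rel, "href": href, "type": "application/json", "title": title}
        )


# Minimal stand-ins for a project and a product entry.
project = {"id": "4datlantic-ohc", "title": "4D Atlantic OHC", "links": []}
product = {
    "id": "4d-atlantic-ohc-global+123",
    "title": "Global Ocean Heat Content",
    "links": [],
}

product_href = f"../../products/{product['id']}/collection.json"
project_href = f"../../projects/{project['id']}/collection.json"

# Link in both directions; the second call per entry is a no-op (idempotent).
for _ in range(2):
    add_link(product, "related", project_href, f"Project: {project['title']}")
    add_link(project, "related", product_href, product["title"])

print(product["links"])
print(project["links"])
```

Making the helper idempotent means that re-running a notebook cell does not duplicate links, which is useful when entries are saved more than once.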
Import Python libraries#
# Import all necessary Python libraries to run the code
from pathlib import Path
from earthcode.git_add import (
save_product_collection_to_catalog,
save_project_collection_to_osc,
save_workflow_record_to_osc,
save_experiment_record_to_osc
)
Specify the local path to your forked repository#
# Specify the absolute path to the local OSC fork
catalog_root = Path('C:/Users/ewelina.dobrowolska/Documents/open-science-catalog-metadata/')
# save the project entry and add the required links
save_project_collection_to_osc(project_collection, catalog_root)
# save the product and add the required links
save_product_collection_to_catalog(product_collection, catalog_root)
# save the workflow and add the required links
save_workflow_record_to_osc(workflow_record, catalog_root)
# save the experiment and add the required links
save_experiment_record_to_osc(experiment_record, catalog_root)
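If the save functions fail, a common cause is that catalog_root does not point at the top of the cloned repository. The sketch below checks a path for the pieces the dummy records refer to in their relative links (a top-level catalog.json and the projects, products and workflows folders). The experiments folder name and the helper missing_catalog_parts are assumptions; compare them with your own clone. The demo uses a temporary directory rather than a real fork.

```python
import tempfile
from pathlib import Path

# Folder names inferred from the relative links in the dummy records;
# "experiments" is an assumption.
EXPECTED_DIRS = ("projects", "products", "workflows", "experiments")


def missing_catalog_parts(root):
    """Return the names of expected files or folders that are absent under root."""
    root = Path(root)
    missing = []
    if not (root / "catalog.json").is_file():
        missing.append("catalog.json")
    missing += [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


# Demo: a temporary directory containing only part of the expected layout.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "catalog.json").write_text("{}")
    for name in ("projects", "products"):
        (demo_root / name).mkdir()
    demo_missing = missing_catalog_parts(demo_root)

print(demo_missing)  # ['workflows', 'experiments']
```

Calling missing_catalog_parts(catalog_root) on a correct path should return an empty list under these assumptions.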
3. Validation#
There will be two types of checks before accepting your entry into the main OSC:
Automatic verification
Semantic validation
Before doing any of the checks you need to store your entries on disk in the OSC directory. This is required in order to check that all links are generated correctly. You can see the results of the automatic checks and any potential error using the library.
# you can validate individual entries
from earthcode.validator import validateOSCEntry
validateOSCEntry(project_collection.to_dict(), catalog_root)
[]
validateOSCEntry(product_collection.to_dict(), catalog_root)
[]
validateOSCEntry(workflow_record, catalog_root)
validateOSCEntry(experiment_record, catalog_root)
# you can also validate the whole catalog - this step is optional and may take some time
#from earthcode.validator import validate_catalog
#validate_catalog(catalog_root)
([], [])
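The outputs above suggest that the validator returns an empty list when no problems are found. Assuming that a non-empty list holds one message per problem (an assumption, not something the notebook confirms), the results for several entries can be collected and summarised as in the sketch below. The helper summarise_validation and the example message are hypothetical, and stand-in lists are used instead of real validator calls.

```python
def summarise_validation(results):
    """Print one status line per entry and return only the entries with issues.

    `results` maps an entry name to the list returned by the validator.
    """
    failed = {name: errs for name, errs in results.items() if errs}
    for name, errs in results.items():
        status = "OK" if not errs else f"{len(errs)} issue(s)"
        print(f"{name}: {status}")
    return failed


# Stand-in results; in the notebook these would come from validateOSCEntry(...).
results = {
    "project": [],
    "product": [],
    "workflow": ["example message: missing link"],
}
failed = summarise_validation(results)
```

An empty failed dictionary would then indicate that all entries passed the automatic checks.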
4. Open a PR to add new entries#
After the validation passes, you are ready to request that your changes be merged into the existing open-science-catalog-metadata repository, so that your datasets and project can be published. Using the terminal:
Commit the changes to the branch in your local copy of the repository (git checkout -b is only needed if you did not already create the branch in Step 1):
cd ./open-science-catalog-metadata
git checkout -b <branch-name>
git add .
git commit -m "Adding new product_v2.0"
Push the changes to your fork:
git push --set-upstream origin <branch-name>
Open a pull request against the main open science catalog repository (this command requires the GitHub CLI, gh):
gh pr create -f
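If you prefer to stay inside the notebook, the terminal steps can be assembled in Python and reviewed before anything is executed. The sketch below only builds and prints the commands; it does not run them. The helper publish_commands is hypothetical, and the git add step is an assumption added here so that the new files are staged before the commit.

```python
import shlex


def publish_commands(branch, message):
    """Return the publishing steps as argument lists, without executing them."""
    return [
        ["git", "add", "--all"],  # assumption: stage all new and changed files
        ["git", "commit", "-m", message],
        ["git", "push", "--set-upstream", "origin", branch],
        ["gh", "pr", "create", "-f"],  # requires the GitHub CLI
    ]


commands = publish_commands("project_branch", "Adding new product_v2.0")
for argv in commands:
    # shlex.join quotes arguments so each line can be pasted into a shell.
    print(shlex.join(argv))
```

To execute the steps, each argument list could be passed to subprocess.run with check=True and cwd set to the local clone. Keeping the commands as lists avoids shell quoting problems with the commit message.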
5. Check the status of your PR directly in GitHub#
After the Pull Request is created, it should appear in the list of pull requests: ESA-EarthCODE/open-science-catalog-metadata
You can check the status of your PR under: ESA-EarthCODE/open-science-catalog-metadata
Changes to the OSC content will be reviewed by the OSC Data Steward team. In case of any changes needed to your inputs, you will be contacted by the team.