Context
The Harmonized Data Access (HDA) API allows uniform access to the whole WEkEO catalogue, including subsetting and downloading functionalities (to learn more: How to download WEkEO data?).
The HDA API is REST-based and is published at this base URL: https://wekeo-broker.prod.wekeo2.eu/databroker
Note: all endpoints defined in this article are relative to this URL.
This article presents usage examples of this API in WEkEO's JupyterHub service (accessible from Sign in > Dashboard > JupyterHub).
⚠️ We recommend that you use cURL instead of Wget because Wget is outside the scope of our support.
How to use Wget?
GNU Wget is a free software package for retrieving files using the most widely used internet protocols.
You can access Wget in several ways:
In the WEkEO JupyterHub: open a terminal and run the command conda activate miniwekeolab. You should then see (miniwekeolab) appear on the left, which means you are now using the miniwekeolab environment, where Wget is installed.
In a local terminal: chances are you have Wget installed by default on Linux. If you are on Windows, you can get it here.
⚠️ Commands have only been tested in Linux.
Step by step Guide
Step 1. Authentication
All calls to the HDA API require an access token, which can be obtained through a call to GET /gettoken. You'll need to provide your credentials, where <credentials> is the base64-encoded version of the string <username>:<password>.
For instance, if john is your username and 123123 your password, credentials would be am9objoxMjMxMjM= (you can use the online tool base64encode to perform this conversion):
$ wget --header="Authorization: Basic <credentials>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/gettoken
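The same conversion can also be done locally in a terminal, which avoids pasting a password into a website. The following is a minimal sketch, assuming the base64 utility (part of GNU coreutils) is available; encode_credentials is a name chosen for this example and is not part of the HDA API.

```shell
#!/bin/sh
# Encode "<username>:<password>" as base64 for the Basic Authorization header.
# printf is used instead of echo so that no trailing newline gets encoded.
# encode_credentials is a hypothetical helper name, not an HDA API feature.
encode_credentials() {
    printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Example with the sample account used in this article:
encode_credentials john 123123   # prints am9objoxMjMxMjM=
echo
```

The printed string can be used in place of <credentials> in the command above.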
The HDA API will respond with a token, valid for 1 hour:
{ "access_token": "xxxxxxxx-yyyy-zzzz-xxxx-yyyyyyyyyyyy" }
All other calls to the HDA API must include the access token. Whenever you see <access_token> in a command, replace it with your token.
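To avoid copying the token by hand, it can be stored in a shell variable. The sketch below pulls the access_token value out of the JSON response with sed; it relies on simple text matching rather than a real JSON parser, and extract_token is a name chosen for this example.

```shell
#!/bin/sh
# Pull the value of "access_token" out of the JSON returned by GET /gettoken.
# This is plain text matching with sed, not a full JSON parser.
# extract_token is a hypothetical helper name.
extract_token() {
    sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Offline demonstration using the sample response from this article:
sample='{ "access_token": "xxxxxxxx-yyyy-zzzz-xxxx-yyyyyyyyyyyy" }'
ACCESS_TOKEN=$(printf '%s\n' "$sample" | extract_token)
echo "$ACCESS_TOKEN"

# With a real call, the output of the gettoken command would be piped instead:
# ACCESS_TOKEN=$(wget --header="Authorization: Basic <credentials>" -qO - \
#     https://wekeo-broker.prod.wekeo2.eu/databroker/gettoken | extract_token)
```

Since the token is valid for 1 hour, the variable needs to be refreshed with a new call once that time has passed.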
Step 2. Accepting the terms and conditions
Before data can be accessed, the Copernicus Terms and Conditions must be accepted. This needs to be done only once:
$ wget --method=PUT --header="accept: application/json" --header="authorization: <access_token>" --body-data="accepted=true" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/termsaccepted/Copernicus_General_License
Step 3. Subsetting datasets
Every dataset in the WEkEO catalogue can be subset by multiple attributes. You can obtain the full list of subsetting attributes for a particular dataset through a call to GET /querymetadata/{datasetId}.
For instance, to get the attributes for EEA's HR-VPP dataset on Vegetation Phenology and Productivity:
$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/querymetadata/EO:EEA:DAT:CLMS_HRVPP_VPP
Result of the command:
{
"constraints": [],
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"parameters": {
"boundingBoxes": [
{
"comment": "Bounding Box",
"details": {
"crs": "EPSG:4326",
"extent": []
},
"isRequired": false,
"label": "Bounding Box",
"name": "bbox"
}
],
"dateRangeSelects": [
{
"comment": "Temporal interval to search",
"details": {
"defaultEnd": null,
"defaultStart": null,
"end": null,
"start": null
},
"isRequired": false,
"label": "Temporal interval to search",
"name": "temporal_interval"
},
{
"comment": "The dateTime when the resource described by the entry was created.",
"details": {
"defaultEnd": null,
"defaultStart": null,
"end": null,
"start": null
},
"isRequired": false,
"label": "processingDate",
"name": "processingDate"
}
],
"multiStringSelects": null,
"stringChoices": [
{
"comment": "String identifying the entry type.",
"details": {
"valuesLabels": {
"AMPL": "AMPL",
"EOSD": "EOSD",
"EOSV": "EOSV",
"LENGTH": "LENGTH",
"LSLOPE": "LSLOPE",
"MAXD": "MAXD",
"MAXV": "MAXV",
"MINV": "MINV",
"QFLAG": "QFLAG",
"RSLOPE": "RSLOPE",
"SOSD": "SOSD",
"SOSV": "SOSV",
"SPROD": "SPROD",
"TPROD": "TPROD"
}
},
"isRequired": false,
"label": "productType",
"name": "productType"
},
{
"comment": "String identifying the particular group to which a product belongs.",
"details": {
"valuesLabels": {
"s1": "s1",
"s2": "s2"
}
},
"isRequired": false,
"label": "productGroupId",
"name": "productGroupId"
}
],
"stringInputs": [
{
"comment": "Local identifier of the record in the repository context.",
"details": {
"pattern": "[\\w-]+"
},
"isRequired": false,
"label": "Product identifier",
"name": "uid"
},
{
"comment": "Identification of the second part of an MGRS coordinate (square identification).",
"details": {
"pattern": "[\\w-]+"
},
"isRequired": false,
"label": "tileId",
"name": "tileId"
},
{
"comment": "String identifying the version of the Product.",
"details": {
"pattern": "[\\w-]+"
},
"isRequired": false,
"label": "productVersion",
"name": "productVersion"
},
{
"comment": "A location criteria (Googleplace name) to perform the search. Example : Paris, Belgium",
"details": {
"pattern": "[\\pL\\pN\\pZs\\pS\\pP]+"
},
"isRequired": false,
"label": "Place name",
"name": "name"
}
]
},
"rendering": null,
"userTerms": {
"accepted": true,
"termsId": "Copernicus_General_License"
}
}
The HDA API responds with definitions for multiple variables you can use to subset (e.g. productType). From these definitions, you can build a subsetting query such as:
{
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"boundingBoxValues": [
{ "name": "bbox",
"bbox": [ -0.5871259395913329, 45.91075967748305, 3.255819949392474, 48.65066374786067 ] }
],
"dateRangeSelectValues": [
{ "name": "temporal_interval",
"start": "2018-01-01T00:00:00.000Z",
"end": "2019-01-01T00:00:00.000Z" }
],
"stringChoiceValues": [
{ "name": "productType", "value": "TPROD" },
{ "name": "productGroupId", "value": "s1" }
]
}
Note: another way to create an API request is via the WEkEO Data Viewer. You can follow the article How to download WEkEO data? to learn how.
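A query like the one above can be kept in a file, which makes it easier to edit and reuse than a long inline string. The sketch below writes the query with a here-document; query.json is a file name chosen for this example. The commented command shows how the file could then be sent with Wget's --post-file option for the POST /datarequest call described in Step 4.

```shell
#!/bin/sh
# Save the subsetting query to a file so it can be edited and reused.
# QUERY_FILE is a hypothetical file name chosen for this example.
QUERY_FILE="query.json"

cat > "$QUERY_FILE" <<'EOF'
{
  "datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
  "boundingBoxValues": [
    { "name": "bbox",
      "bbox": [ -0.5871259395913329, 45.91075967748305, 3.255819949392474, 48.65066374786067 ] }
  ],
  "dateRangeSelectValues": [
    { "name": "temporal_interval",
      "start": "2018-01-01T00:00:00.000Z",
      "end": "2019-01-01T00:00:00.000Z" }
  ],
  "stringChoiceValues": [
    { "name": "productType", "value": "TPROD" },
    { "name": "productGroupId", "value": "s1" }
  ]
}
EOF

# The file can be sent with --post-file instead of --post-data
# (uncomment and replace <access_token> to run it):
# wget --header="Content-Type: application/json" \
#      --header="Accept: application/json" \
#      --header="Authorization: <access_token>" \
#      --post-file="$QUERY_FILE" -qO - \
#      https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest
```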
Step 4. Requesting data
Based on this response, you can create a subsetting job by sending a request to POST /datarequest:
$ wget \
--header="Content-Type: application/json" \
--header="Accept: application/json" \
--header="Authorization: <access_token>" \
--post-data='{
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"boundingBoxValues": [
{ "name": "bbox",
"bbox": [ -0.5871259395913329, 45.91075967748305, 3.255819949392474, 48.65066374786067 ] }
],
"dateRangeSelectValues": [
{
"name": "temporal_interval",
"start": "2018-01-01T00:00:00.000Z",
"end": "2019-01-01T00:00:00.000Z"
}
],
"stringChoiceValues": [
{ "name": "productType", "value": "TPROD"},
{"name": "productGroupId", "value": "s1"}
]
}' -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest
This returns an initial response, indicating that the new job has started:
{
"jobId": "0e6zJrcIrvpyHPCb9pMfquxZllc",
"status": "started",
"results": [],
"message": null
}
Use the jobId to poll the GET /datarequest/status/{jobId} endpoint:
$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/status/0e6zJrcIrvpyHPCb9pMfquxZllc
until the job has finished:
{
"status": "completed",
"message": "Done!"
}
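The polling described above can be automated with a small loop. The sketch below is one possible approach, not an official client: it repeats a status command until the response contains "status": "completed" or a maximum number of attempts is reached. The function names, the delay, and the attempt limit are all assumptions made for this example, and a stand-in status command is used so the loop can be tried without calling the API.

```shell
#!/bin/sh
# Poll a status command until the JSON it prints reports "completed".
# $1: command that prints the status JSON
# $2: delay between attempts, in seconds
# $3: maximum number of attempts
# Returns 0 once the status is "completed", 1 if the attempts run out.
wait_until_completed() {
    fetch_cmd=$1
    delay=$2
    max_attempts=$3
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if $fetch_cmd | grep -q '"status"[[:space:]]*:[[:space:]]*"completed"'; then
            return 0
        fi
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 1
}

# Real status call (replace <access_token> and <jobId> before using it).
fetch_job_status() {
    wget --header="Authorization: <access_token>" -qO - \
        "https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/status/<jobId>"
}

# Offline stand-in that mimics a finished job, so the loop can be tried safely.
fake_status() {
    printf '%s\n' '{ "status": "completed", "message": "Done!" }'
}

wait_until_completed fake_status 0 3 && echo "job completed"
```

For a real job, the call would look like wait_until_completed fetch_job_status 10 60, which checks every 10 seconds for up to 60 attempts. The loop only recognises the "completed" status; any other status simply leads to another attempt.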
At this point you can retrieve the list of results by calling the GET /datarequest/jobs/{jobId}/result endpoint:
$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/jobs/0e6zJrcIrvpyHPCb9pMfquxZllc/result
Sample response:
{ "content": [
{
"downloadUri": null,
"extraInformation": null,
"filename": "VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif",
"order": null,
"productInfo": {
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"product": "VPP_2018_S2_T30TXR-010m_V101_s1_TPROD",
"productEndDate": "2018-12-31T23:59:59.999000Z",
"productStartDate": "2018-01-01T00:00:00Z"
},
"size": 119048027,
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
},
[...],
{
"downloadUri": null,
"extraInformation": null,
"filename": "VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif",
"order": null,
"productInfo": {
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"product": "VPP_2018_S2_T30UYV-010m_V101_s1_TPROD",
"productEndDate": "2018-12-31T23:59:59.999000Z",
"productStartDate": "2018-01-01T00:00:00Z"
},
"size": 195119088,
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif"
}
],
"itemsInPage": 10,
"nextPage": "https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/jobs/FQ6_njWjE9RD7GCeYXyy6ejJ-Kc/result?page=1&size=10",
"page": 0,
"pages": 5,
"previousPage": null,
"totItems": 46
}
From this response you can read the url of each item. You will need it, together with the jobId, to order data in the next step.
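When a page contains many items, the url values can be listed automatically instead of being read by eye. The sketch below uses grep and sed text matching, assuming each item carries a single "url" key as in the sample response; list_result_urls is a name chosen for this example, and the input is a shortened copy of the sample.

```shell
#!/bin/sh
# List the "url" value of every item in a results page read from stdin.
# Text matching only: the key must be exactly "url", so "downloadUri" and
# "nextPage" are not picked up. list_result_urls is a hypothetical helper name.
list_result_urls() {
    grep -o '"url"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed 's/^"url"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Offline demonstration on a shortened copy of the sample response:
sample='{ "content": [
  { "downloadUri": null,
    "filename": "VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif",
    "url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif" },
  { "downloadUri": null,
    "filename": "VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif",
    "url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif" }
] }'
printf '%s\n' "$sample" | list_result_urls
```

With a real job, the output of the result command would be piped into list_result_urls in the same way.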
As you can see, WEkEO has found 46 items and is returning only the first page with 10 results. You can add size and page query parameters to your call to retrieve other results:
$ wget --header="Authorization: <access_token>" -qO- "https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/jobs/0e6zJrcIrvpyHPCb9pMfquxZllc/result?size=15&page=3"
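To walk through every page, the number of pages follows from totItems and the page size by ceiling division: 46 items at 10 per page gives 5 pages, numbered from 0 as in the sample response. The sketch below prints one results URL per page; the function names are chosen for this example, and each printed URL would still need to be fetched with the Authorization header shown above.

```shell
#!/bin/sh
# Number of pages needed to list total_items results at page_size per page
# (integer ceiling division). page_count is a hypothetical helper name.
page_count() {
    total_items=$1
    page_size=$2
    echo $(( (total_items + page_size - 1) / page_size ))
}

# Base URL as given in this article.
BASE_URL="https://wekeo-broker.prod.wekeo2.eu/databroker"

# Print one results URL per page; pages are numbered from 0, as in the sample.
# $1: job ID, $2: total number of items, $3: page size.
result_page_urls() {
    job_id=$1
    total_items=$2
    page_size=$3
    pages=$(page_count "$total_items" "$page_size")
    page=0
    while [ "$page" -lt "$pages" ]; do
        echo "$BASE_URL/datarequest/jobs/$job_id/result?size=$page_size&page=$page"
        page=$((page + 1))
    done
}

page_count 46 10    # prints 5, matching "pages": 5 in the sample response
result_page_urls 0e6zJrcIrvpyHPCb9pMfquxZllc 46 10
```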
Step 5. Ordering data
Once a subsetting job has been completed, you can issue data orders for any of the job results. To do this, create an order job by sending a request to POST /dataorder, including the jobId and the url of the result to be downloaded (passed in the uri field of the request body):
$ wget \
--header="Content-Type: application/json" \
--header="Accept: application/json" \
--header="Authorization: <access_token>" \
--post-data='{
"jobId": "0e6zJrcIrvpyHPCb9pMfquxZllc",
"uri": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"}' \
-qO - \
https://wekeo-broker.prod.wekeo2.eu/databroker/dataorder
This returns an initial response, indicating that the new job has started:
{
"orderId": "0fW0ZecwsR-kKxRd13vsnjJpXOQ",
"status": "running",
"message": null
}
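When ordering several results, the request body can be generated from the jobId and each url rather than typed by hand. This is a minimal sketch, assuming neither value contains a double quote or a backslash (true for the sample values in this article); order_body is a name chosen for this example.

```shell
#!/bin/sh
# Build the JSON body for POST /dataorder from a job ID and a result url.
# Assumes the values contain no double quotes or backslashes, so no JSON
# escaping is performed. order_body is a hypothetical helper name.
order_body() {
    printf '{"jobId": "%s", "uri": "%s"}\n' "$1" "$2"
}

order_body "0e6zJrcIrvpyHPCb9pMfquxZllc" \
    "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
```

The printed string has the same fields as the --post-data value in the command above.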
Use the orderId to poll the GET /dataorder/status/{orderId} endpoint:
$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/dataorder/status/0fW0ZecwsR-kKxRd13vsnjJpXOQ
until the job has finished:
{
"status": "completed",
"message": "Done!",
"downloadUri": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif",
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
}
At this point you can get the link to download your data by calling the GET /dataorder/download/{orderId} endpoint:
$ wget --no-check-certificate --header 'Accept: application/json' --header 'Authorization: <access_token>' -r -L 'https://wekeo-broker.prod.wekeo2.eu/databroker/dataorder/download/0fW0ZecwsR-kKxRd13vsnjJpXOQ' -O VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif
💡 WEkEO Pro Tip: with the command above, the downloaded file will be named VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif.
The name and extension of the file can be found at the end of the url used previously. Using that name is not mandatory, however: the command works just as well with any name you choose.
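The file name can be derived from the url with the standard basename utility, which keeps only the last path segment. A short sketch, where OUTPUT_NAME is a variable name chosen for this example:

```shell
#!/bin/sh
# Derive the output file name from the last path segment of a result url.
url="hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
OUTPUT_NAME=$(basename "$url")
echo "$OUTPUT_NAME"   # prints VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif
```

The value of OUTPUT_NAME can then be passed to the -O option of the download command.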
⚠️ Depending on the dataset you use, you may not be able to download directly to a file. In that case, use the link returned by the command (https://s3.waw3-[...]D&Expires=1681993287) to download your data: copy and paste it into your browser.
[...]
Location: https://s3.waw3-1.cloudferro.com/hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif?AWSAccessKeyId=7f44d907594e4587a3b5647ce4bf7c9f&Signature=MVbXM08rM7JsPWz0JzEW2rUxjwg%3D&Expires=1681993287 [following]
After a short while, your files should have been downloaded to your working directory!
⚠️ Requests and orders are subject to limits. More details in this article.
Swagger UI
The Swagger UI is an interface that explains in detail all the routes you can call in the HDA API.
Selecting one of the categories shows the calls it contains, and clicking on a call shows all the responses you can get.
💡 WEkEO Pro Tip: you can even run a command from the Swagger UI by clicking the Try it out! button at the bottom of a call description.
Note: the Swagger UI uses cURL commands.
What's next?
These articles might be of interest for you:
We are user-driven and we implement users' suggestions, so feel free to contact us:
through a chat session available in the bottom right corner of the page
via our contact webpage
via e-mail to our support team (supportATwekeo.eu)