How to use the WEkEO Harmonized Data Access API REST with Wget?
Let's see how to use the WEkEO HDA API Rest with Wget!

Written by Alexandre
Updated over a week ago

Context


The Harmonized Data Access (HDA) API allows uniform access to the whole WEkEO catalogue, including subsetting and downloading functionalities (to learn more: How to download WEkEO data?).

The HDA API is REST-based and is published at this base URL: https://wekeo-broker.prod.wekeo2.eu/databroker.

📌Note: all endpoints defined in this article are relative to this URL.

We introduce here usage examples of this API in WEkEO's JupyterHub service (accessible from Sign in > Dashboard > JupyterHub).

⚠️ We recommend that you use cURL instead of Wget because Wget is outside the scope of our support.

How to use Wget?


GNU Wget is a free software package for retrieving files using the most widely used internet protocols.

You can access Wget in several ways:

  • In the WEkEO JupyterHub: open a terminal and run the command conda activate miniwekeolab. You should then see (miniwekeolab) appear at the left of the prompt, which means you are now using the miniwekeolab environment, where Wget is installed.

  • In a local terminal: chances are you have Wget installed by default on Linux. If you are on Windows, you can get it here.

⚠️ Commands have only been tested in Linux.
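Before going further, it can help to confirm that Wget is actually available in the current terminal. The snippet below is a minimal sketch; have_cmd is a hypothetical helper name introduced for this example, not part of WEkEO or Wget.

```shell
# Return success if the given command is available on the PATH.
# (have_cmd is a hypothetical helper name used only in this example.)
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd wget; then
  # Print only the first line of the version banner.
  wget --version | head -n 1
else
  echo "wget not found: activate miniwekeolab or install Wget first" >&2
fi
```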

Step-by-step Guide


Step 1. Authentication

All calls to the HDA API require an access token, which can be obtained through a call to GET /gettoken. You'll need to provide your credentials, where <credentials> is the base64-encoded version of the string <username>:<password>.

For instance, if john is your username and 123123 your password, credentials would be am9objoxMjMxMjM= (you can use the online tool base64encode to perform this conversion):

$ wget --header="Authorization: Basic <credentials>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/gettoken
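As an alternative to an online tool, the credentials string can be computed locally. This is a minimal sketch, assuming the base64 command (part of GNU coreutils, also shipped on macOS) is available; HDA_USER and HDA_PASSWORD are hypothetical variable names holding the example values from the text above.

```shell
# Example values from the text above; replace them with your own.
HDA_USER="john"
HDA_PASSWORD="123123"

# printf avoids the trailing newline that echo would add before encoding;
# tr removes the line wrapping that base64 applies to long inputs.
CREDENTIALS=$(printf '%s:%s' "$HDA_USER" "$HDA_PASSWORD" | base64 | tr -d '\n')
echo "$CREDENTIALS"   # prints am9objoxMjMxMjM=
```

The resulting value can then be used in place of <credentials> in the command above.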

The HDA API will respond with a token, valid for 1 hour:

{ "access_token": "xxxxxxxx-yyyy-zzzz-xxxx-yyyyyyyyyyyy" }

All other calls to the HDA API must include the access token. Whenever you see <access_token> in a command, replace it with your token.
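To avoid copying the token by hand, it can be extracted from the response into a shell variable. The sketch below uses sed on the sample response shown above and assumes the response keeps that simple shape; json_string_field is a hypothetical helper name. If jq happens to be installed, jq -r '.access_token' is a more robust alternative.

```shell
# Print the value of a top-level string field from JSON read on stdin.
# This is a simple pattern match, not a full JSON parser.
# (json_string_field is a hypothetical helper name used only in this example.)
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Sample response copied from above; in practice this would be the
# output of the wget call to /gettoken.
RESPONSE='{ "access_token": "xxxxxxxx-yyyy-zzzz-xxxx-yyyyyyyyyyyy" }'

ACCESS_TOKEN=$(printf '%s\n' "$RESPONSE" | json_string_field access_token)
echo "$ACCESS_TOKEN"
```

Later commands can then use --header="Authorization: $ACCESS_TOKEN" instead of a pasted value.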

Step 2. Accepting the terms and conditions

Before data can be accessed, the Copernicus Terms and Conditions must be accepted. This needs to be done only once:

$ wget --method=PUT --header="accept: application/json" --header="authorization: <access_token>" --body-data="accepted=true" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/termsaccepted/Copernicus_General_License

Step 3. Subsetting datasets

Every dataset in the WEkEO catalogue can be subset through multiple attributes. You can obtain the full list of subsetting attributes for a particular dataset through a call to GET /querymetadata/{datasetId}.

For instance, to get the attributes for EEA's HRVPP dataset on Vegetation Phenology and Productivity:

$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/querymetadata/EO:EEA:DAT:CLMS_HRVPP_VPP

Result of the command:

{
  "constraints": [],
  "datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
  "parameters": {
    "boundingBoxes": [
      {
        "comment": "Bounding Box",
        "details": {
          "crs": "EPSG:4326",
          "extent": []
        },
        "isRequired": false,
        "label": "Bounding Box",
        "name": "bbox"
      }
    ],
    "dateRangeSelects": [
      {
        "comment": "Temporal interval to search",
        "details": {
          "defaultEnd": null,
          "defaultStart": null,
          "end": null,
          "start": null
        },
        "isRequired": false,
        "label": "Temporal interval to search",
        "name": "temporal_interval"
      },
      {
        "comment": "The dateTime when the resource described by the entry was created.",
        "details": {
          "defaultEnd": null,
          "defaultStart": null,
          "end": null,
          "start": null
        },
        "isRequired": false,
        "label": "processingDate",
        "name": "processingDate"
      }
    ],
    "multiStringSelects": null,
    "stringChoices": [
      {
        "comment": "String identifying the entry type.",
        "details": {
          "valuesLabels": {
            "AMPL": "AMPL",
            "EOSD": "EOSD",
            "EOSV": "EOSV",
            "LENGTH": "LENGTH",
            "LSLOPE": "LSLOPE",
            "MAXD": "MAXD",
            "MAXV": "MAXV",
            "MINV": "MINV",
            "QFLAG": "QFLAG",
            "RSLOPE": "RSLOPE",
            "SOSD": "SOSD",
            "SOSV": "SOSV",
            "SPROD": "SPROD",
            "TPROD": "TPROD"
          }
        },
        "isRequired": false,
        "label": "productType",
        "name": "productType"
      },
      {
        "comment": "String identifying the particular group to which a product belongs.",
        "details": {
          "valuesLabels": {
            "s1": "s1",
            "s2": "s2"
          }
        },
        "isRequired": false,
        "label": "productGroupId",
        "name": "productGroupId"
      }
    ],
    "stringInputs": [
      {
        "comment": "Local identifier of the record in the repository context.",
        "details": {
          "pattern": "[\\w-]+"
        },
        "isRequired": false,
        "label": "Product identifier",
        "name": "uid"
      },
      {
        "comment": "Identification of the second part of an MGRS coordinate (square identification).",
        "details": {
          "pattern": "[\\w-]+"
        },
        "isRequired": false,
        "label": "tileId",
        "name": "tileId"
      },
      {
        "comment": "String identifying the version of the Product.",
        "details": {
          "pattern": "[\\w-]+"
        },
        "isRequired": false,
        "label": "productVersion",
        "name": "productVersion"
      },
      {
        "comment": "A location criteria (Googleplace name) to perform the search. Example : Paris, Belgium",
        "details": {
          "pattern": "[\\pL\\pN\\pZs\\pS\\pP]+"
        },
        "isRequired": false,
        "label": "Place name",
        "name": "name"
      }
    ]
  },
  "rendering": null,
  "userTerms": {
    "accepted": true,
    "termsId": "Copernicus_General_License"
  }
}

The HDA API responds with definitions for the variables you can use to subset the dataset (e.g. productType). From these definitions, you can build a subsetting query such as:

{
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"boundingBoxValues": [
{ "name": "bbox",
"bbox": [ -0.5871259395913329, 45.91075967748305, 3.255819949392474, 48.65066374786067 ] }
],
"dateRangeSelectValues": [
{ "name": "temporal_interval",
"start": "2018-01-01T00:00:00.000Z",
"end": "2019-01-01T00:00:00.000Z" }
],
"stringChoiceValues": [
{ "name": "productType", "value": "TPROD" },
{ "name": "productGroupId", "value": "s1" }
]
}

📌Note: another way to create an API request is via the WEkEO Data Viewer. You can follow the article How to download WEkEO data? to learn how.

Step 4. Requesting data

Based on this response, you can create a subsetting job by sending a request to POST /datarequest:

$ wget \
--header="Content-Type: application/json" \
--header="Accept: application/json" \
--header="Authorization: <access_token>" \
--post-data='{
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"boundingBoxValues": [
{ "name": "bbox",
"bbox": [ -0.5871259395913329, 45.91075967748305, 3.255819949392474, 48.65066374786067 ] }
],
"dateRangeSelectValues": [
{
"name": "temporal_interval",
"start": "2018-01-01T00:00:00.000Z",
"end": "2019-01-01T00:00:00.000Z"
}
],
"stringChoiceValues": [
{ "name": "productType", "value": "TPROD"},
{"name": "productGroupId", "value": "s1"}
]
}' -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest

This returns an initial response, indicating that the new job has started:

{ 
"jobId": "0e6zJrcIrvpyHPCb9pMfquxZllc",
"status": "started",
"results": [],
"message": null
}
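The request body of the POST /datarequest call can also be kept in a file and sent with Wget's --post-file option, which avoids quoting a long JSON string on the command line. The sketch below writes the same query to a file; the file name query.json and the helper submit_query are assumptions made for this example, and the helper is only defined here, not called.

```shell
QUERY_FILE="query.json"   # hypothetical file name

# Write the subsetting query shown above to a file.
cat > "$QUERY_FILE" <<'EOF'
{
  "datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
  "boundingBoxValues": [
    { "name": "bbox",
      "bbox": [ -0.5871259395913329, 45.91075967748305, 3.255819949392474, 48.65066374786067 ] }
  ],
  "dateRangeSelectValues": [
    { "name": "temporal_interval",
      "start": "2018-01-01T00:00:00.000Z",
      "end": "2019-01-01T00:00:00.000Z" }
  ],
  "stringChoiceValues": [
    { "name": "productType", "value": "TPROD" },
    { "name": "productGroupId", "value": "s1" }
  ]
}
EOF

# Hypothetical helper: submit the query file, with the access token
# passed as the first argument.
submit_query() {
  wget \
    --header="Content-Type: application/json" \
    --header="Accept: application/json" \
    --header="Authorization: $1" \
    --post-file="$QUERY_FILE" \
    -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest
}
```

Calling submit_query "<access_token>" should then behave like the inline --post-data command.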

Use the jobId to poll the GET /datarequest/status/{jobId} endpoint:

$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/status/0e6zJrcIrvpyHPCb9pMfquxZllc

until the job has finished:

{ 
"status": "completed",
"message": "Done!"
}
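Polling by hand quickly becomes tedious, so the status check can be wrapped in a small loop. This is a sketch under stated assumptions: the function names, the default of 60 attempts and the 5-second delay are illustrative choices, ACCESS_TOKEN is expected to hold your token, and the status field is read with a simple sed pattern rather than a JSON parser.

```shell
BASE_URL="https://wekeo-broker.prod.wekeo2.eu/databroker"

# Fetch the raw status JSON for a job; expects ACCESS_TOKEN to be set.
fetch_status() {
  wget --header="Authorization: $ACCESS_TOKEN" -qO - "$BASE_URL/datarequest/status/$1"
}

# Print the value of the "status" field from JSON read on stdin.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll until the job reports "completed"; give up after max_tries attempts.
# Arguments: job id, max tries (default 60), delay in seconds (default 5).
# Returns 0 when the job completed, 1 when the attempts ran out.
wait_for_job() {
  job_id=$1
  max_tries=${2:-60}
  delay=${3:-5}
  try=0
  while [ "$try" -lt "$max_tries" ]; do
    status=$(fetch_status "$job_id" | extract_status)
    if [ "$status" = "completed" ]; then
      return 0
    fi
    try=$((try + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_job 0e6zJrcIrvpyHPCb9pMfquxZllc && echo "job finished". The same pattern should apply to orders by pointing fetch_status at /dataorder/status/{orderId}.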

At this point, you can retrieve the list of results by calling the GET /datarequest/jobs/{jobId}/result endpoint:

$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/jobs/0e6zJrcIrvpyHPCb9pMfquxZllc/result

Sample response:

{  "content": [
{
"downloadUri": null,
"extraInformation": null,
"filename": "VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif",
"order": null,
"productInfo": {
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"product": "VPP_2018_S2_T30TXR-010m_V101_s1_TPROD",
"productEndDate": "2018-12-31T23:59:59.999000Z",
"productStartDate": "2018-01-01T00:00:00Z"
},
"size": 119048027,
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
},
[...],
{
"downloadUri": null,
"extraInformation": null,
"filename": "VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif",
"order": null,
"productInfo": {
"datasetId": "EO:EEA:DAT:CLMS_HRVPP_VPP",
"product": "VPP_2018_S2_T30UYV-010m_V101_s1_TPROD",
"productEndDate": "2018-12-31T23:59:59.999000Z",
"productStartDate": "2018-01-01T00:00:00Z"
},
"size": 195119088,
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif"
}
],
"itemsInPage": 10,
"nextPage": "https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/jobs/FQ6_njWjE9RD7GCeYXyy6ejJ-Kc/result?page=1&size=10",
"page": 0,
"pages": 5,
"previousPage": null,
"totItems": 46
}

In this response, note the url of each item: you will need it, together with the jobId, to order data in the next step.
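The url values can be pulled out of a result page with standard text tools. The sketch below runs on a shortened copy of the sample response above; extract_urls is a hypothetical helper name, and the pattern assumes url values contain no escaped quotes. If jq is installed, jq -r '.content[].url' is a more robust alternative.

```shell
# Print every "url" value found in JSON read on stdin, one per line.
# (extract_urls is a hypothetical helper name used only in this example.)
extract_urls() {
  grep -o '"url"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"url"[[:space:]]*:[[:space:]]*"\(.*\)"$/\1/'
}

# Shortened copy of the sample response shown above.
SAMPLE='{ "content": [
{ "downloadUri": null,
"filename": "VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif",
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif" },
{ "downloadUri": null,
"filename": "VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif",
"url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30UYV-010m_V101_s1_TPROD.tif" }
] }'

printf '%s\n' "$SAMPLE" | extract_urls
```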

As you can see, WEkEO has found 46 items and is returning only the first page with 10 results. You can add size and page query parameters to your call to retrieve other results:

$ wget --header="Authorization: <access_token>" -qO - "https://wekeo-broker.prod.wekeo2.eu/databroker/datarequest/jobs/0e6zJrcIrvpyHPCb9pMfquxZllc/result?size=15&page=3"
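To walk through every page, the number of pages can be derived from totItems and the page size, with page numbers starting at 0 as in the sample response. The following sketch only prints the URLs to request; list_page_urls is a hypothetical helper name.

```shell
BASE_URL="https://wekeo-broker.prod.wekeo2.eu/databroker"

# Print one result-page URL per line for a job.
# Arguments: job id, total number of items (totItems), page size.
# (list_page_urls is a hypothetical helper name used only in this example.)
list_page_urls() {
  job_id=$1
  tot_items=$2
  size=$3
  # Ceiling division: number of pages needed to cover all items.
  pages=$(( (tot_items + size - 1) / size ))
  page=0   # page numbering starts at 0, as in the sample response
  while [ "$page" -lt "$pages" ]; do
    echo "$BASE_URL/datarequest/jobs/$job_id/result?size=$size&page=$page"
    page=$((page + 1))
  done
}

# 46 items with 10 per page gives 5 pages (page=0 to page=4).
list_page_urls "0e6zJrcIrvpyHPCb9pMfquxZllc" 46 10
```

Each printed URL can be passed to the wget command above; remember to quote it, since it contains an & character.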

Step 5. Ordering data

Once a subsetting job has been completed, you can issue data orders for any of the job results. To do this, create an order job by sending a request to POST /dataorder, including the jobId and the url of the result to be downloaded (passed in the uri field):

$ wget \
--header="Content-Type: application/json" \
--header="Accept: application/json" \
--header="Authorization: <access_token>" \
--post-data='{
"jobId": "0e6zJrcIrvpyHPCb9pMfquxZllc",
"uri": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"}' \
-qO - \
https://wekeo-broker.prod.wekeo2.eu/databroker/dataorder

This returns an initial response, indicating that the new job has started:

{ 
"orderId": "0fW0ZecwsR-kKxRd13vsnjJpXOQ",
"status": "running",
"message": null
}
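When ordering several results, the body of the POST /dataorder request can be generated from the jobId and a result url instead of being typed each time. This is a minimal sketch; order_body is a hypothetical helper name, and it assumes neither value contains characters that need JSON escaping (such as double quotes or backslashes).

```shell
# Print the JSON body for POST /dataorder.
# Arguments: job id, url of the result to order.
# (order_body is a hypothetical helper name used only in this example.)
order_body() {
  printf '{"jobId": "%s", "uri": "%s"}\n' "$1" "$2"
}

order_body "0e6zJrcIrvpyHPCb9pMfquxZllc" \
  "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
```

The output can be passed to the order command with --post-data="$(order_body "$JOB_ID" "$RESULT_URL")", where JOB_ID and RESULT_URL are variables you set yourself.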

Use the orderId to poll the GET /dataorder/status/{orderId} endpoint:

$ wget --header="Authorization: <access_token>" -qO - https://wekeo-broker.prod.wekeo2.eu/databroker/dataorder/status/0fW0ZecwsR-kKxRd13vsnjJpXOQ

until the job has finished:

{
"status": "completed", "message": "Done!", "downloadUri": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif", "url": "hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif"
}

At this point you can get the link to download your data by calling the GET /dataorder/download/{orderId} endpoint:

$ wget --no-check-certificate --header 'Accept: application/json' --header 'Authorization: <access_token>' -r -L 'https://wekeo-broker.prod.wekeo2.eu/databroker/dataorder/download/0fW0ZecwsR-kKxRd13vsnjJpXOQ' -O VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif

💡WEkEO Pro Tip: with the command above, the downloaded file will be named VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif. The name and extension of the file can be found at the end of the url used previously. Using them is not mandatory, however: the command also works with any name you choose.

⚠️Depending on the dataset you use, you may not be able to download directly to a file. In that case, use the link returned by the command (https://s3.waw3-[...]D&Expires=1681993287) to download your data: you can copy and paste it into your browser. The link appears in the command output as follows:

[...]
Location: https://s3.waw3-1.cloudferro.com/hr-vpp-products-vpp-v01-2018/CLMS/Pan-European/Biophysical/VPP/v01/2018/s1/VPP_2018_S2_T30TXR-010m_V101_s1_TPROD.tif?AWSAccessKeyId=7f44d907594e4587a3b5647ce4bf7c9f&Signature=MVbXM08rM7JsPWz0JzEW2rUxjwg%3D&Expires=1681993287 [following]
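The redirect link can be picked out of Wget's log with sed rather than copied by hand. The sketch below runs on a made-up log line with a hypothetical example.org URL; extract_location is a hypothetical helper name, and the pattern assumes the log line has exactly the "Location: ... [following]" shape shown above.

```shell
# Print the target of each "Location: <url> [following]" line read on stdin.
# (extract_location is a hypothetical helper name used only in this example.)
extract_location() {
  sed -n 's/^Location: \(.*\) \[following\]$/\1/p'
}

# Hypothetical log line in the same shape as the output above.
LOG_LINE='Location: https://example.org/bucket/file.tif?Signature=abc&Expires=123 [following]'

printf '%s\n' "$LOG_LINE" | extract_location
```

Wget writes its log to standard error, so in practice the download command would need 2>&1 before being piped into extract_location.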

After a short while, your files should have been downloaded to your working directory! 😃

⚠️ Requests and orders are subject to limitations. More details in this article.

Swagger UI


The Swagger UI is an interface that documents in detail all the routes you can call in the HDA API.

Selecting one of the categories lists the calls available in it, and clicking on one of them shows all the responses you can get.

💡WEkEO Pro Tip: you can even run a command from the Swagger UI by clicking the Try it out! button at the bottom of a call description.

📌 Note: for information, the Swagger UI uses cURL commands.

What's next?


We are user-driven and we implement users' suggestions, so feel free to contact us:

  • through a chat session available in the bottom right corner of the page

  • via e-mail to our support team (supportATwekeo.eu)
