Context
WEkEO provides access to a wide range of data, which can be downloaded using the Harmonized Data Access (HDA) service. For more detailed information, please refer to the article What is the Harmonized Data Access (HDA) API?
The HDA API has a Python client that helps you quickly download and process the data you need. Let's see how to use the HDA API with Python!
How to use the HDA API in Python?
You can follow the steps of this article to download data in this notebook:
Official documentation
For further information, please visit the WEkEO HDA API documentation.
Step 1. Install the latest version of hda
Run one of the following commands in a terminal to install the latest version of hda:
using pip:
pip install hda -U
using Mamba (you can replace mamba with conda):
mamba install conda-forge::hda
💡 WEkEO Pro Tip: all required packages, including the hda library, are already installed in the default environment wekeolab of the WEkEO JupyterHub.
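To confirm that the installation worked, a minimal sketch using only the Python standard library is shown below. The helper name installed_version is hypothetical and not part of hda; it simply asks the environment which version of a package is installed.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report whether the hda package is available in the current environment
version = installed_version("hda")
if version is None:
    print("hda is not installed in this environment")
else:
    print(f"hda {version} is installed")
```

If the first message is printed, rerun one of the installation commands above in the same environment as your Python interpreter.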
Step 2. Import the hda module
In a Python script, import the needed classes:
from hda import Client, Configuration
Step 3. Configure credentials and load the hda Client
Next, we configure the user's credentials and load the hda Client.
Let's look at the most commonly used methods:
Method 1 (occasional users)
Configuration of WEkEO credentials directly in the Python script:
# Configure user's credentials without a .hdarc
conf = Configuration(user = "username", password = "password")
hda_client = Client(config = conf)
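If you prefer not to type credentials into the script itself, one possible variation of Method 1 reads them from environment variables. This is only a sketch: the variable names WEKEO_USER and WEKEO_PASSWORD and the two helper functions are assumptions made for illustration, not conventions of the hda package.

```python
import os


def credentials_from_env(env=os.environ):
    """Build the keyword arguments for Configuration from environment variables.

    WEKEO_USER and WEKEO_PASSWORD are hypothetical names chosen for this sketch.
    """
    try:
        return {"user": env["WEKEO_USER"], "password": env["WEKEO_PASSWORD"]}
    except KeyError as missing:
        raise RuntimeError(
            f"Environment variable {missing.args[0]} is not set"
        ) from None


def make_client(env=os.environ):
    """Create the hda Client from credentials found in the environment."""
    # Imported here so the helper above can be used without hda installed
    from hda import Client, Configuration

    return Client(config=Configuration(**credentials_from_env(env)))
```

After exporting both variables in your shell, hda_client = make_client() would play the same role as the two lines of Method 1.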
Method 2 (regular users)
Create the .hdarc configuration file from the Python script as follows:
from pathlib import Path
# Default location expected by the hda package
hdarc = Path.home() / '.hdarc'
# Create it only if it does not already exist
if not hdarc.is_file():
    import getpass
    USERNAME = input('Enter your username: ')
    PASSWORD = getpass.getpass('Enter your password: ')
    with open(hdarc, 'w') as f:
        f.write(f'user:{USERNAME}\n')
        f.write(f'password:{PASSWORD}\n')
hda_client = Client()
⚠️ This step needs to be done only once. Later calls to hda_client = Client() will retrieve the credentials from the created file.
Note: if you created a .hdarc file before March 2024, you will need to remove the url indicated in it.
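For an older .hdarc file, removing the url entry can be done by hand in a text editor. As an alternative, the sketch below does it in Python. It assumes the file uses the same key:value layout as the file written in Method 2, with the url stored on a line whose key is url. Both function names are hypothetical helpers, not part of hda.

```python
from pathlib import Path


def strip_url_entry(hdarc_text: str) -> str:
    """Return the .hdarc content without any line whose key is 'url'."""
    kept = [
        line
        for line in hdarc_text.splitlines()
        if line.partition(":")[0].strip().lower() != "url"
    ]
    return "\n".join(kept) + "\n" if kept else ""


def clean_hdarc(path=None) -> bool:
    """Rewrite the file without its url entry; return True if a line was removed."""
    path = Path(path) if path is not None else Path.home() / ".hdarc"
    if not path.is_file():
        return False
    original = path.read_text()
    cleaned = strip_url_entry(original)
    if cleaned.splitlines() == original.splitlines():
        return False  # nothing to remove
    path.write_text(cleaned)
    return True
```

Calling clean_hdarc() with no argument targets the default ~/.hdarc location used in Method 2. Consider keeping a backup copy of the file before running it.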
Step 4. Create the request and download data
We can now define the JSON request (see how to get the query) and run the search:
# The JSON query loaded in the "query" variable
query = {
    "dataset_id": "EO:EEA:DAT:CLMS_HRVPP_VPP",
    "productType": "TPROD",
    "productGroupId": "s1",
    "start": "2020-01-01T00:00:00.000Z",
    "end": "2021-01-01T00:00:00.000Z",
    "bbox": [
        -9.53592042,
        42.46825465,
        -7.0363102799999995,
        43.99700636
    ]
}
# Run the search with the query passed as a parameter
matches = hda_client.search(query)
# List the results
print(matches)
The JSON request loaded in the query variable specifies the parameters for searching the dataset. The hda_client.search(query) function performs the search based on these parameters, and the results are stored in the matches variable.
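When the dates or the area change often, it can help to build the query dictionary from Python values instead of editing strings by hand. The sketch below is one possible way to do so. The function names are hypothetical, the bbox order (west, south, east, north) is an assumption inferred from the example values above, and the simple west < east check does not handle areas crossing the antimeridian.

```python
from datetime import datetime, timezone


def to_hda_timestamp(moment: datetime) -> str:
    """Format an aware datetime like the example query: 2020-01-01T00:00:00.000Z."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_query(dataset_id, start, end, bbox, **extra):
    """Assemble a query dictionary with basic sanity checks on dates and bbox."""
    west, south, east, north = bbox  # assumed order, see the note above
    if not (west < east and south < north):
        raise ValueError("bbox must be [west, south, east, north]")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {
        "dataset_id": dataset_id,
        **extra,  # dataset-specific keys such as productType
        "start": to_hda_timestamp(start),
        "end": to_hda_timestamp(end),
        "bbox": [west, south, east, north],
    }


# Rebuild the query used in this article
query = build_query(
    "EO:EEA:DAT:CLMS_HRVPP_VPP",
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 1, tzinfo=timezone.utc),
    [-9.53592042, 42.46825465, -7.0363102799999995, 43.99700636],
    productType="TPROD",
    productGroupId="s1",
)
```

The resulting dictionary can be passed to hda_client.search(query) exactly like the hand-written one.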
To download the results, the following command is used:
# Download results in a directory (e.g. '/tmp')
matches[-1].download(download_dir="/tmp")
For this example, we fetch only the last result.
The download() call handles every result in the selection it is called on and saves the files in the specified directory (/tmp). A selection with several results is therefore downloaded in one operation, without needing to initiate each download individually.
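If the target directory may not exist yet, a small wrapper can create it before starting the download. The sketch below assumes only what the article shows, namely that the results object has a download(download_dir=...) method. The name download_to is a hypothetical helper.

```python
from pathlib import Path


def download_to(results, directory):
    """Create `directory` if needed, then download `results` into it."""
    target = Path(directory).expanduser()
    target.mkdir(parents=True, exist_ok=True)
    # Same call as in the article, with the directory guaranteed to exist
    results.download(download_dir=str(target))
    return target
```

For example, download_to(matches[-1], "~/wekeo_data") would fetch the last result into a wekeo_data folder in your home directory (the folder name is only an illustration).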
💡 WEkEO Pro Tip: the code above will download the last result in matches, but it is also possible to easily customize the files to download by slicing the matches object:
matches.download() # Will download all results
matches[0].download() # Will only download the first result
matches[-1].download() # Will only download the last result
matches[:10].download() # Will only download the first 10 results
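Slicing can also be used to split a large selection into smaller batches. The sketch below relies only on the behaviour shown in this article: slicing matches returns a downloadable selection, and matches.results is the list of individual results. The function name and the batch size are assumptions for illustration.

```python
def download_in_batches(matches, batch_size=10, download_dir="."):
    """Download `matches` slice by slice and return the number of batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    total = len(matches.results)
    batches = 0
    for start in range(0, total, batch_size):
        # Each slice is downloaded with the same call as in the tip above
        matches[start:start + batch_size].download(download_dir=download_dir)
        batches += 1
    return batches
```

Working in batches makes it easier to resume after an interruption, since you know which slice was being processed.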
Note: you can also download the results to a bucket, but you first need to upgrade your plan to get a tenant.
⚠️ Requests and orders are subject to limits. More details in this article.
hda functions
Display names of files to be downloaded
You can browse the matches object and display the names of the files via their id:
for item in matches.results:
    print(item['id'])
Display information of a dataset
Use the dataset() function with the dataset_id of your choice to display its information:
hda_client.dataset('EO:EEA:DAT:CLMS_HRVPP_VPP')
It returns a JSON object that includes the abstract, the dataset_id and other properties of the given dataset.
Display list of available datasets
Use the datasets(limit=None) function to display the list of available datasets. Specify a limit to control the number of datasets displayed, or leave it empty to show all:
hda_client.datasets(3)
The line above returns a list of 3 available datasets, each represented as a JSON object containing the abstract, dataset_id and other properties.
Limit number of results returned by the search function
You can limit the number of results returned by the search() function as follows:
matches = hda_client.search(query,3)
From the line above, the number of items returned in matches is limited to 3.
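The functions above can be combined with ordinary Python to keep only the information you need. The sketch below is a possible illustration: it assumes, as described in this section, that each item of matches.results has an id key and that each dataset entry is a dictionary-like object with a dataset_id key. Both helper names are hypothetical.

```python
def result_ids(matches, containing=""):
    """Return the ids of the results, optionally keeping those containing a text."""
    return [item["id"] for item in matches.results if containing in item["id"]]


def dataset_ids(datasets):
    """Return the dataset_id of each entry in a list of dataset descriptions."""
    # .get() yields None instead of failing if an entry lacks the key
    return [entry.get("dataset_id") for entry in datasets]
```

For instance, result_ids(matches, "2020") would list only the file names containing 2020, and dataset_ids(hda_client.datasets(3)) would return three dataset identifiers.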
What's next?
Feel free to check these articles that might be of interest for you:
If you encounter any issue while following this article (including the notebook) or have any questions, feel free to contact us:
through a chat session available in the bottom right corner of the page
via our contact webpage
via e-mail to our support team (supportATwekeo.eu)
Regardless of how you choose to contact us, you will first be put in touch with our AI Agent Neo. At any time, you can reach a member of the WEkEO User Support team by clicking on "talk to a person" via chat, or by naturally requesting it in reply to Neo's email.