In which format are the WEkEO data delivered?

Let's see the different data formats available in the WEkEO DIAS and some simple steps to open them!

Written by David Bina
Updated over 5 months ago

Context


The WEkEO DIAS provides more than 400 Earth observation datasets from several originating centers (more info on data available on WEkEO).

In this article, we will discuss the various formats of the data provided and show easy instructions on how to open the data using Python.

Table of data formats


Observation Domain    Data Format
Atmosphere            GRIB and NetCDF* in .zip
Climate Change        GRIB and NetCDF*
Emergency             GRIB and NetCDF in .zip
Land                  GRIB, GeoTIFF and NetCDF* in .zip
Marine                NetCDF
Sentinel              NetCDF, .SEN3, .SEN6, .SAFE in .zip**

*the NetCDF format is experimental for these Services

**all Sentinel data are stored in a .zip file (learn how to unzip files)
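If the extension of a downloaded file is missing or misleading, its first bytes usually reveal the container format. The sketch below is only an illustration: the helper name sniff_format and the sample file names are hypothetical, and it assumes the file starts directly with the standard signature of its format (GRIB, NetCDF classic, HDF5-based NetCDF-4, TIFF or ZIP).

```python
import tempfile
from pathlib import Path

# Leading bytes ("signatures") of the container formats listed in the table.
SIGNATURES = [
    (b"GRIB", "GRIB"),
    (b"CDF", "NetCDF classic"),
    (b"\x89HDF\r\n\x1a\n", "NetCDF-4 (HDF5)"),
    (b"II*\x00", "TIFF/GeoTIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF/GeoTIFF"),   # big-endian TIFF
    (b"PK\x03\x04", "ZIP archive"),
]


def sniff_format(path):
    """Return a format label from the first bytes of a file (hypothetical helper)."""
    with open(path, "rb") as f:
        header = f.read(8)
    for signature, label in SIGNATURES:
        if header.startswith(signature):
            return label
    return "unknown"


# Demo on tiny placeholder files: only their leading bytes are realistic.
samples = {"a.grib": b"GRIB", "b.nc": b"CDF\x01", "c.tif": b"II*\x00", "d.zip": b"PK\x03\x04"}
detected = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, header in samples.items():
        file_path = Path(tmp) / name
        file_path.write_bytes(header + bytes(8))
        detected[name] = sniff_format(file_path)
print(detected)
```

A ZIP result means the archive has to be extracted first, and the same check can then be run on the files inside it.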

In the following examples, we will use different datasets over Italy.

NetCDF data format


Let's see here how to open a NetCDF data file.

We will focus on the atmospheric temperature of January 1978, provided in the product ERA5 hourly data on pressure levels from 1950 to 1978 (preliminary version) (datasetID = EO:ECMWF:DAT:REANALYSIS_ERA5_PRESSURE_LEVELS_PRELIMINARY_BACK_EXTENSION).

The main packages we use are xarray (to open the dataset) and matplotlib (to customize our map):

import xarray as xr
dataset = xr.open_dataset("ERA5_CAMS_1978.nc")
dataset

This single call gives access to all the information in the downloaded file (dimensions, coordinates, variables, attributes).

Now, to generate a quick map, just call the .plot() method on the selected variable as follows:

time = "1978-01-01"
dataset.t.sel(time=time).plot()
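Hourly data on pressure levels carries more than two dimensions, so it can help to check what a selection returns before plotting it. The sketch below uses a small synthetic dataset rather than the downloaded file; the dimension names (time, level, latitude, longitude), the pressure levels and the values are all assumptions made for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the downloaded file: 48 hourly steps, 3 pressure levels.
times = np.arange("1978-01-01", "1978-01-03", dtype="datetime64[h]").astype("datetime64[ns]")
levels = [500, 850, 1000]
lats = np.linspace(47.0, 36.0, 5)
lons = np.linspace(6.0, 19.0, 6)
values = 250.0 + 30.0 * np.random.default_rng(0).random((48, 3, 5, 6))

dataset = xr.Dataset(
    {"t": (("time", "level", "latitude", "longitude"), values)},
    coords={"time": times, "level": levels, "latitude": lats, "longitude": lons},
)

# A date string keeps every hourly step of that day (24 here).
one_day = dataset.t.sel(time="1978-01-01")

# Fixing one timestamp and one level leaves a 2D latitude/longitude field,
# which .plot() can draw as a map.
field = dataset.t.sel(time=np.datetime64("1978-01-01T12:00"), level=850)
print(one_day.sizes["time"], field.dims)
```

If a selection still has more than two dimensions, reducing it further (for example by choosing one level) is what makes .plot() produce a map.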

However, this quick plot shows only the temperature field. To add the coastlines and georeference the map properly, we also need matplotlib and cartopy:

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature

f = plt.figure(figsize=(15, 10))
ax = plt.axes(projection=ccrs.PlateCarree())  # map axes in longitude/latitude
ax.coastlines()
ax.add_feature(cfeature.LAND, zorder=1, edgecolor='k')

# draw the temperature field on the map axes created above
dataset.t.sel(time=time).plot(ax=ax, transform=ccrs.PlateCarree())
plt.title(f"Atmospheric Temperature (K) on {time}", size=15)

GRIB data format


For the GRIB data format, we downloaded the ERA5-Land hourly data from 1950 to present product (datasetID = "EO:ECMWF:DAT:REANALYSIS_ERA5_LAND"), and its leaf area index (lai_hv) for the first week of January 2022.

We recommend installing cfgrib via conda:

conda install -c conda-forge cfgrib
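Before opening a GRIB file, it can be useful to confirm that Python can actually find the cfgrib package in the current environment. This is a minimal sketch using only the standard library; the helper name is_installed is hypothetical, and the check only confirms that the package can be located, not that its underlying GRIB decoding library works.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("cfgrib"):
    print("cfgrib found: engine='cfgrib' can be passed to xarray")
else:
    print("cfgrib missing: run 'conda install -c conda-forge cfgrib' first")
```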

As with the NetCDF data, we'll use xarray to open the file:

import xarray as xr
grib = xr.load_dataset("mydirectory/ERA_CLMS_2022.grib", engine = "cfgrib")
grib

📌 Note: it is important here to specify the engine type engine = "cfgrib".

This command opens the .grib file and displays a summary of the downloaded data.

💡WEkEO Pro Tip: make sure you have the cfgrib package installed, otherwise the error message "ValueError: unrecognized engine cfgrib must be one of: ['netcdf4', 'scipy', 'store']" will be displayed.

Finally, a simple line of code to plot and view the data:

grib.lai_hv.sel(time="2022-01-01").plot()

GeoTIFF data format


For GeoTIFF data format, let's focus on the Copernicus Land's Total productivity (PPI) data from the Vegetation Phenology and Productivity, yearly, product (datasetID = "EO:EEA:DAT:CLMS_HRVPP_VPP"), in southern Italy.

First, open the .tif file. We will use the rasterio package (install it first if it is not already available):

import os

import rasterio as rs
import rasterio.plot

# set the directory where .tif files are stored
data_dir = './tiff'
all_tiff = []

# loop to open and plot all .tif files
for path in os.listdir(data_dir):
    if os.path.isfile(os.path.join(data_dir, path)):
        all_tiff.append(path)
        print('file_name = ', path)  # file's name
        with rs.open(os.path.join(data_dir, path)) as file:
            print("data info : ", file.profile)  # file's information
            rasterio.plot.show(file)  # plot
print(all_tiff)

Thus, for each GeoTIFF in the data_dir directory, we will obtain:

  • the file name

file_name = VPP_2020_S2_T33TXE-010m_V101_s1_TPROD.tif
  • general file information, such as the data format, data type, crs, and more:

data info : {'driver': 'GTiff', 'dtype': 'uint16', 'nodata': 65535.0, 'width': 10980, 'height': 10980, 'count': 1, 'crs': CRS.from_epsg(32633), 'transform': Affine(10.0, 0.0, 600000.0, 0.0, -10.0, 4500000.0), 'blockxsize': 512, 'blockysize': 512, 'tiled': True, 'compress': 'deflate', 'interleave': 'band'}
  • and the basic generated image:

📌 Note: rasterio also exposes some information about the .tif file as attributes (here, tiff stands for the dataset object returned by rasterio.open()):

  • tiff.bounds: indicates the spatial bounding box

  • tiff.count: number of bands

  • tiff.width: number of columns of the raster dataset

  • tiff.height: number of rows of the raster dataset

  • tiff.crs: coordinate reference system

For more information about this Python package, please consult the rasterio documentation page.

Sentinel data


All Sentinel data are packaged in a .zip file when downloaded, in which you'll find files with specific extensions:

  • Sentinel-1 and Sentinel-2 products (supplied by ESA) are provided in .SAFE format (cf. sentiwiki)

  • Sentinel-3 products (supplied by ESA or Eumetsat) are provided in .SEN3 format (a tailored version of SAFE format)

  • Sentinel-5P products (supplied by ESA) are provided in NetCDF format

  • Sentinel-6 products (supplied by Eumetsat) are provided in .SEN6 format (a tailored version of SAFE format)

To open downloaded Sentinel data, you simply need to unzip all the files:

import os
import zipfile

extension = ".zip"
path = "Directory"

for item in os.listdir(path):  # loop through items in path
    if item.endswith(extension):  # check for ".zip" extension
        file_name = os.path.join(path, item)  # get full path of the file
        with zipfile.ZipFile(file_name) as zip_ref:  # open the archive for reading
            zip_ref.extractall(path)  # extract its contents to path
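Once the archives are extracted, a quick listing shows which product folders and NetCDF files are now available. The sketch below builds a placeholder archive in a temporary directory so that it runs on its own; the helper list_products and the names EXAMPLE_PRODUCT.SEN3 and measurement.nc are made up for illustration and do not reflect the real content of a Sentinel product.

```python
import os
import tempfile
import zipfile

PRODUCT_SUFFIXES = (".SAFE", ".SEN3", ".SEN6", ".nc")


def list_products(root):
    """Return sorted relative paths under root that end with a known suffix."""
    found = []
    for current, dirs, files in os.walk(root):
        for name in dirs + files:
            if name.endswith(PRODUCT_SUFFIXES):
                found.append(os.path.relpath(os.path.join(current, name), root))
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    # Build a placeholder archive, then extract it as in the loop above.
    archive = os.path.join(tmp, "example.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("EXAMPLE_PRODUCT.SEN3/measurement.nc", b"placeholder")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(tmp)
    products = list_products(tmp)
print(products)
```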

💡WEkEO Pro Tip: for data containing NetCDF files, you can check our previous section to learn how to open a .nc file via xarray! 😃

And that's it!

Now you know all about the WEkEO data formats and how to read them! 😎

What's next?


More examples are available in our previous WEkEO trainings, available online, from our JupyterHub!

We are user-driven and we implement users' suggestions, so feel free to contact us:

  • through a chat session available in the bottom right corner of the page

  • via e-mail to our support team (supportATwekeo.eu)

Regardless of how you choose to contact us, you will first be put in touch with our AI Agent Neo. At any time, you can reach a member of the WEkEO User Support team by clicking on "talk to a person" via chat, or by naturally requesting it in reply to Neo's email.
