Context
The WEkEO DIAS provides more than 400 Earth observation datasets from several originating centers (more info on data available on WEkEO).
In this article, we will discuss the various formats of the data provided and show easy instructions on how to open the data using Python.
Table of data formats
Observation Domain | Data Format |
Atmosphere | GRIB and NetCDF* in |
Climate Change | GRIB and NetCDF* |
Emergency | GRIB and NetCDF in |
Land | GRIB, GeoTIFF and NetCDF* in |
Marine | NetCDF |
Sentinel | NetCDF, |
*the NetCDF format is experimental for these Services
**all Sentinel data are stored in a .zip file (learn how to unzip files)
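Because the file format determines which Python package to use, that choice can be made explicit with a small lookup. The sketch below is only an illustration: the helper name `reader_for` and the extension-to-package mapping are assumptions based on the packages used in this article, not part of any WEkEO API.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the package used in this article.
READERS = {
    ".nc": "xarray",
    ".grib": "xarray (engine='cfgrib')",
    ".tif": "rasterio",
    ".tiff": "rasterio",
    ".zip": "zipfile",
}

def reader_for(filename):
    """Return the suggested package for a file, or None if the extension is unknown."""
    return READERS.get(Path(filename).suffix.lower())

print(reader_for("ERA5_CAMS_1978.nc"))
```

An unknown extension returns `None`, so the caller decides how to handle formats not covered here.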
In the following examples, we will use different datasets over Italy.
NetCDF data format
Let's see here how to open a NetCDF data file.
We will focus on the atmospheric temperature of January 1978, provided in the product ERA5 hourly data on pressure levels from 1950 to 1978 (preliminary version) (datasetID = EO:ECMWF:DAT:REANALYSIS_ERA5_PRESSURE_LEVELS_PRELIMINARY_BACK_EXTENSION).
The main packages we use are xarray (to open the dataset) and matplotlib (to customize our map):
import xarray as xr
dataset = xr.open_dataset("ERA5_CAMS_1978.nc")
dataset
This single line gives access to all the information (dimensions, coordinates, variables, attributes) in the downloaded file:
Now, to generate a quick map, just call xarray's plot() method on the variable as follows:
time = "1978-01-01"
dataset.t.sel(time=time).plot()
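If the file holds several timestamps or pressure levels, the selection has to be narrowed to a single time and a single level before the result is a 2D field that can be drawn as a map. The sketch below illustrates this on a small synthetic dataset, so it runs without the downloaded file. The dimension names (`time`, `level`, `latitude`, `longitude`) and all values are assumptions and may differ from your file.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the downloaded file; names and values are assumptions.
times = np.array(["1978-01-01T00:00", "1978-01-01T12:00"], dtype="datetime64[ns]")
levels = [500, 850]
lats = [47.0, 42.0, 37.0]
lons = [7.0, 12.0, 17.0]
values = 250.0 + np.arange(2 * 2 * 3 * 3, dtype=float).reshape(2, 2, 3, 3)
demo = xr.Dataset(
    {"t": (("time", "level", "latitude", "longitude"), values)},
    coords={"time": times, "level": levels, "latitude": lats, "longitude": lons},
)

# Select exactly one timestamp and one pressure level to obtain a 2D field.
field = demo.t.sel(time=np.datetime64("1978-01-01T12:00"), level=850)
print(field.dims)
```

Calling `field.plot()` on such a 2D selection produces a map-like image, whereas a selection with more than two remaining dimensions falls back to a histogram.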
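However, this only shows the temperature field itself. To add the coastlines and better georeference the map, we need to import matplotlib together with cartopy:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs          # map projections
import cartopy.feature as cfeature  # land, coastlines and other map features
f = plt.figure(figsize=(15, 10))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.coastlines()
ax.add_feature(cfeature.LAND, zorder=1, edgecolor='k')
# draw on the map axes and declare that the data are in lat/lon coordinates
dataset.t.sel(time=time).plot(ax=ax, transform=ccrs.PlateCarree())
plt.title(f"Atmospheric Temperature (K) on {time}", size=15)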
GRIB data format
For the GRIB data format, we downloaded the ERA5-Land hourly data from 1950 to present product (datasetID = "EO:ECMWF:DAT:REANALYSIS_ERA5_LAND"), and its leaf area index (lai_hv) for the first week of January 2022.
We recommend installing cfgrib via conda:
conda install -c conda-forge cfgrib
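After installation, it can help to confirm that cfgrib is importable in the active environment before opening a GRIB file. The sketch below uses only the standard library. The helper name `engine_available` is hypothetical, and a successful check only shows that the module can be found, not that its underlying GRIB libraries work.

```python
import importlib.util

def engine_available(module_name):
    """Return True if the given top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None

if engine_available("cfgrib"):
    print("cfgrib found: engine='cfgrib' can be used")
else:
    print("cfgrib missing: run 'conda install -c conda-forge cfgrib' first")
```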
As for the data in NetCDF format, we'll use xarray to open the data:
import xarray as xr
grib = xr.load_dataset("mydirectory/ERA_CLMS_2022.grib", engine = "cfgrib")
grib
Note: it is important here to specify the engine type engine = "cfgrib".
This command opens the .grib file and displays the downloaded data:
WEkEO Pro Tip: make sure you have the cfgrib package installed, otherwise the error message "ValueError: unrecognized engine cfgrib must be one of: ['netcdf4', 'scipy', 'store']" will be displayed.
Finally, a single line of code plots the data:
grib.lai_hv.sel(time="2022-01-01").plot()
GeoTIFF data format
For the GeoTIFF data format, let's focus on the Copernicus Land's Total productivity (PPI) data from the Vegetation Phenology and Productivity, yearly, product (datasetID = "EO:EEA:DAT:CLMS_HRVPP_VPP"), in southern Italy.
First, open the .tif file. We will use the rasterio package (install it first if needed):
import os
import rasterio as rs
import rasterio.plot
# set the directory where .tif files are stored
data_dir = './tiff'
all_tiff = []
# loop to open and plot all .tif files
for path in os.listdir(data_dir):
    if os.path.isfile(os.path.join(data_dir, path)):
        all_tiff.append(path)
        print('file_name = ', path)  # file's name
        with rs.open(os.path.join(data_dir, path)) as file:
            print("data info : ", file.profile)  # file's information
            rasterio.plot.show(file)  # plot
print(all_tiff)
Thus, for each GeoTIFF in the data_dir directory, we will obtain:
the file name
file_name = VPP_2020_S2_T33TXE-010m_V101_s1_TPROD.tif
general file information, such as the data format, data type, crs, and more:
data info : {'driver': 'GTiff', 'dtype': 'uint16', 'nodata': 65535.0, 'width': 10980, 'height': 10980, 'count': 1, 'crs': CRS.from_epsg(32633), 'transform': Affine(10.0, 0.0, 600000.0, 0.0, -10.0, 4500000.0), 'blockxsize': 512, 'blockysize': 512, 'tiled': True, 'compress': 'deflate', 'interleave': 'band'}
and the basic generated image:
Note: rasterio also exposes some information about the opened .tif file (here called tiff):
tiff.bounds: spatial bounding box
tiff.count: number of bands
tiff.width: number of columns of the raster dataset
tiff.height: number of rows of the raster dataset
tiff.crs: coordinate reference system
For more information about this Python package, please consult the rasterio documentation page.
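The bounding box is tied to the transform, width and height shown in the file profile above. As a worked example, the arithmetic below recomputes the bounds of that tile by hand. It assumes a north-up raster with no rotation (the two zero terms in the Affine transform), and the variable names are chosen for illustration only.

```python
# Values copied from the profile printed above.
pixel_width, x_origin = 10.0, 600000.0     # Affine terms a and c
pixel_height, y_origin = -10.0, 4500000.0  # Affine terms e and f
width, height = 10980, 10980               # raster size in pixels

left = x_origin
top = y_origin
right = x_origin + pixel_width * width     # move east by 10980 pixels of 10 m
bottom = y_origin + pixel_height * height  # move south (negative pixel height)

print(left, bottom, right, top)
# prints: 600000.0 4390200.0 709800.0 4500000.0
```

The tile therefore spans 109.8 km in each direction, in the coordinates of EPSG:32633.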
Sentinel data
All Sentinel data are packaged in a .zip file when downloaded, in which you'll find files in specific formats:
Sentinel-1 and Sentinel-2 products (supplied by ESA) are provided in .SAFE format (cf. sentiwiki)
Sentinel-3 products (supplied by ESA or Eumetsat) are provided in .SEN3 format (a tailored version of SAFE format)
Sentinel-5P products (supplied by ESA) are provided in NetCDF format
Sentinel-6 products (supplied by Eumetsat) are provided in .SEN6 format (a tailored version of SAFE format)
To open downloaded Sentinel data, you simply need to unzip all the files:
import os
import zipfile
extension = ".zip"
path = "Directory"
for item in os.listdir(path):  # loop through items in path
    if item.endswith(extension):  # check for ".zip" extension
        file_name = os.path.join(path, item)  # get full path of files
        zip_ref = zipfile.ZipFile(file_name)  # create zipfile to read it
        zip_ref.extractall(path)  # extract file to dir
        zip_ref.close()  # close file
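A variant of the same idea, shown here only as a sketch, uses a context manager so each archive is closed even if extraction fails, and extracts every archive into its own subfolder. The function name `unzip_all` and the one-folder-per-archive layout are assumptions made for this example, not a WEkEO convention.

```python
import zipfile
from pathlib import Path

def unzip_all(directory):
    """Extract every .zip in `directory` into a subfolder named after the archive."""
    extracted = []
    for archive in sorted(Path(directory).glob("*.zip")):
        target = archive.with_suffix("")      # e.g. data/product.zip -> data/product
        with zipfile.ZipFile(archive) as zf:  # closed automatically, even on error
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `unzip_all("Directory")` returns the list of folders created, which can then be passed to the reader that matches each product's format.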
WEkEO Pro Tip: for data containing NetCDF files, see the NetCDF section above to learn how to open a .nc file via xarray!
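Once the archives are extracted, the products can be located by the extensions listed above. The sketch below walks a directory tree and collects every path ending in one of them. The helper name `find_products` is hypothetical, and the suffix set simply mirrors the formats named in this section.

```python
from pathlib import Path

# Extensions named in the list above; only the last suffix of each path is checked.
PRODUCT_SUFFIXES = {".SAFE", ".SEN3", ".SEN6", ".nc"}

def find_products(directory):
    """Return paths under `directory` whose name ends with a known Sentinel suffix."""
    return sorted(p for p in Path(directory).rglob("*") if p.suffix in PRODUCT_SUFFIXES)
```

NetCDF files nested inside a product folder are returned as well, so each `.nc` path found this way can be opened with xarray.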
And that's it!
Now you know all about WEkEO data formats and how to read them!
What's next?
More examples are available in our previous WEkEO trainings, available online, from our JupyterHub!
We are user-driven and we implement users' suggestions, so feel free to contact us:
through a chat session available in the bottom right corner of the page
via our contact webpage
via e-mail to our support team (supportATwekeo.eu)
Regardless of how you choose to contact us, you will first be put in touch with our AI Agent Neo. At any time, you can reach a member of the WEkEO User Support team by clicking on "talk to a person" via chat, or by naturally requesting it in reply to Neo's email.