ERA5 Marshalling data

Process Activity

Description: This activity reads NetCDF files into pandas DataFrames, obtains estimated population per grid cell from the Global Human Settlement Layer (GHSL) data, which is based on Eurostat, merges the population data with the ERA5 data, and writes the merged data to disk in Parquet format. External experts then perform quality checks on the merged data, either as a one-off review or as a regular quality assurance check. The input consists of more than 12,000 NetCDF4 files together with the GHSL data. The output is a single Parquet file, "Interim data for review", identified by its URI.
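The read, merge, and write steps described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `latitude`/`longitude` join keys, the `*.nc` file pattern, and the use of xarray to open the NetCDF files are all assumptions made for the example.

```python
from pathlib import Path

import pandas as pd

# Assumption: ERA5 rows and population rows share these grid-cell columns.
GRID_KEYS = ["latitude", "longitude"]


def read_netcdf_as_frame(path):
    """Read one NetCDF file into a flat DataFrame.

    Assumes xarray and a NetCDF backend are installed; imported lazily so
    the merge step can be used without them.
    """
    import xarray as xr

    with xr.open_dataset(path) as ds:
        return ds.to_dataframe().reset_index()


def merge_population(era5, population):
    """Attach the population estimate to every ERA5 row of the same grid cell.

    A left join keeps ERA5 rows that have no population match (NaN), and
    validate= raises if a grid cell appears twice in the population table.
    """
    return era5.merge(population, on=GRID_KEYS, how="left", validate="many_to_one")


def marshal(netcdf_dir, population, out_path):
    """Hypothetical end-to-end run: read all files, merge, write one Parquet file."""
    paths = sorted(Path(netcdf_dir).glob("*.nc"))
    era5 = pd.concat((read_netcdf_as_frame(p) for p in paths), ignore_index=True)
    merged = merge_population(era5, population)
    merged.to_parquet(out_path, index=False)  # needs pyarrow or fastparquet
    return merged
```

Joining on floating-point coordinates relies on both sources using exactly the same grid values; if they do not, rounding the key columns to the grid resolution before merging is one way to make the join robust.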


Diagram of the Process Activity

digraph Diagram {
  graph [
    stylesheet="../_static/custom.css"
    # "... rock solid standards compliant SVG", see: https://graphviz.org/faq/font/#what-about-svg-fonts
    fontnames = "svg"
    rankdir="LR"
    nodesep="0.15"
    tooltip=" "
  ];
  node [ fontname="sans-serif" ];

  activity_8a364e23_1e2f_4f75_9b61_2ad1ad39fe68 [
    shape="rect" style="filled, rounded" width="3" height="0.8"
    fontcolor="white" fillcolor="#4363d8" fontsize="13pt"
    label="ERA5 Marshalling data\n(current activity)"
    tooltip=" "
  ];

  EntityUsed_1 [
    shape="rect" style="filled" width="2" height="0.1"
    fontcolor="blue" fontsize="10pt"
    URL="" target="_blank"
    label="ca. 12.000 NetCDF4 files"
    tooltip="Entity used for current activity"
    fillcolor="#ffe119"
  ];
  EntityUsed_1:e -> activity_8a364e23_1e2f_4f75_9b61_2ad1ad39fe68:w [ minlen="2" ];

  EntityUsed_2 [
    shape="rect" style="filled" width="2" height="0.1"
    fontcolor="blue" fontsize="10pt"
    URL="https://ghsl.jrc.ec.europa.eu/datasets.php" target="_blank"
    label="GHSL - Global Human\nSettlement Layer"
    tooltip="Entity used for current activity"
    fillcolor="#ffe119"
  ];
  EntityUsed_2:e -> activity_8a364e23_1e2f_4f75_9b61_2ad1ad39fe68:w [ minlen="2" ];

  EntityProduced_1 [
    shape="rect" style="filled" width="2" height="0.1"
    fontcolor="blue" fontsize="10pt"
    URL="https://doi.org/10.21338/era5-grids" target="_blank"
    label="ERA5 Interim data for review\n(era5-grids)"
    tooltip="Entity produced by current activity"
    fillcolor="#dcbeff"
  ];
  activity_8a364e23_1e2f_4f75_9b61_2ad1ad39fe68:e -> EntityProduced_1:w [ minlen="2" ];
}