support@opendatascience.eu (+31) (0)317 427537

Training sessions 6-7 September 2021 two parallel sessions @ WICC

Training sessions

Two parallel training sessions on days 1 and 2
  • Day 1

    Sept 6, 2021

  • Day 2

    Sept 7, 2021

  • Day 3

    Sept 8, 2021

  • Day 4

    Sept 9, 2021

  • Day 5

    Sept 10, 2021

  • Introduction to spatial and spatiotemporal data in R (90') Spatial vs spatiotemporal data. Time-series analysis. Visualizing spatiotemporal data (some examples).
    Training sessions
    Where
    Progress Avenue

  • Software/library preparation and user support (30'). Introduction to spatial and spatiotemporal data in Python (90').
    Training sessions
    Where
    Succes Avenue, WICC

  • Machine Learning in R
    Training sessions
    Where
    Progress Avenue

  • Instructors: Martijn Witjes. Software requirements: opengeohub/py-geo docker image (gdal, rasterio, eumap, scikit-learn). Content: theoretical background for machine learning and Python implementations; integrating raster data with scikit-learn models; why to use pyeumap.LandMapper; spatial overlay to prepare the training samples; spatial cross-validation to evaluate the ML model performance; hyperparameter optimization to tune the ML model; fitting the final ML model; generating spatial predictions using the fitted model.
    Machine Learning
    Training sessions
    Where
    Succes Avenue, WICC
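
As a rough illustration of the spatial cross-validation step listed in the session content, the sketch below groups samples into square blocks so that nearby points never end up on both sides of a train/test split. It uses only the Python standard library; the function names, the block size and the sample coordinates are hypothetical and are not taken from eumap or scikit-learn. In practice the resulting fold labels could be handed to a grouped splitter of an ML library.

```python
def spatial_block_folds(coords, block_size, n_folds):
    """Assign each sample to a fold based on the square block it falls in.

    coords: list of (x, y) tuples in a projected CRS (same unit as block_size).
    Returns one fold index per sample.
    """
    # Identify the grid block that contains each point.
    block_ids = [(int(x // block_size), int(y // block_size)) for x, y in coords]
    # Give every distinct block a fold, cycling through the folds in sorted order.
    fold_of_block = {
        block: i % n_folds for i, block in enumerate(sorted(set(block_ids)))
    }
    return [fold_of_block[b] for b in block_ids]


def train_test_splits(folds, n_folds):
    """Yield (train_indices, test_indices) for each fold."""
    for k in range(n_folds):
        test = [i for i, f in enumerate(folds) if f == k]
        train = [i for i, f in enumerate(folds) if f != k]
        yield train, test


# Hypothetical sample locations in metres; points 0-1 and 2-3 share a block.
coords = [(10, 10), (20, 30), (150, 10), (160, 40), (10, 250), (310, 320)]
folds = spatial_block_folds(coords, block_size=100, n_folds=3)
for train, test in train_test_splits(folds, 3):
    print("train:", train, "test:", test)
```

Because whole blocks are held out together, the test score is less inflated by spatial autocorrelation than with a purely random split.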

  • GRASS GIS is an all-in-one tool for geospatial data analysis and remote sensing, made by users for users. In this workshop we will introduce the new version 8 of GRASS GIS. It features a heavily redesigned graphical user interface and optimizes the way users interact with their data. The new version includes all the features of previous versions, including Python 3 scripting, spatio-temporal data analysis, and more. The workshop will analyze data from different domains and show how automated processing can be performed.
    Training sessions
    Where
    Progress Avenue

  • GRASS GIS supports time series processing for vector, raster, and volume data that can be created, analyzed, and also visualized through the graphical user interface. Due to the sheer volume of data available from the Landsat and Sentinel satellites, there is a strong need for large-scale automated processing. The i.sentinel toolset allows querying Sentinel data coverage for a region of interest, downloading from multiple data sources, performing atmospheric and topographic corrections, and cloud/shadow masking. Preparation of data for multitemporal analyses is enabled in the t.sentinel and t.rast.mosaic extensions through automatic creation of space-time raster datasets (strds) and temporal aggregation to obtain cloud-free temporal mosaics of arbitrary granularity.
    Software
    Training sessions
    Where
    Progress Avenue
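
The idea behind temporal aggregation into cloud-free mosaics can be sketched in a few lines of plain Python. This is not GRASS GIS code and does not call t.sentinel or t.rast.mosaic; it is a minimal illustration, with made-up pixel values and a hypothetical function name, of taking the per-pixel median over all observations that are not flagged as cloud or shadow.

```python
from statistics import median


def temporal_mosaic(scenes, masks):
    """Per-pixel median of the cloud-free observations.

    scenes: list of 2D lists (same shape), one per acquisition date.
    masks:  list of 2D lists of bools, True where the pixel is cloud/shadow.
    Returns a 2D list; None where no clear observation exists.
    """
    rows, cols = len(scenes[0]), len(scenes[0][0])
    mosaic = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Keep only the values that are not masked at this pixel.
            clear = [s[r][c] for s, m in zip(scenes, masks) if not m[r][c]]
            row.append(median(clear) if clear else None)
        mosaic.append(row)
    return mosaic


# Three hypothetical 2x2 scenes (scaled reflectance) with their cloud masks.
scenes = [
    [[20, 90], [40, 80]],
    [[30, 50], [90, 90]],
    [[40, 60], [50, 70]],
]
masks = [
    [[False, True], [False, True]],
    [[False, False], [True, True]],
    [[False, False], [False, True]],
]
mosaic = temporal_mosaic(scenes, masks)
print(mosaic)
```

The granularity of the mosaic (monthly, seasonal, yearly) is determined simply by which scenes are passed in together.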

  • Instructors: Milan Kilibarda. Software requirements: web browser and QGIS. Content: typical architecture of web map applications; introduction to OGC services (WMS, WFS, WCS); web mapping servers and clients; accessing the ODSE services; web mapping application demonstration.
    AI
    Big Data
    Data Cubes
    Where
    Succes Avenue, WICC
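
To make the OGC services part more concrete, the sketch below assembles a WMS 1.3.0 GetMap request URL using only the Python standard library. The endpoint and layer name are placeholders, not the actual ODSE service addresses, and the helper function is hypothetical; only the query parameter names follow the WMS standard.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:3857", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in the axis order of the given CRS.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
url = wms_getmap_url(
    "https://example.org/geoserver/ows",
    layer="hypothetical_layer",
    bbox=(0, 0, 1000, 1000),
    width=256,
    height=256,
)
print(url)
```

A client such as QGIS builds requests of this kind behind the scenes; a GetCapabilities request against the same endpoint would list the layers that are actually available.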

  • Opening and welcome session
    Where
    HugoTECH

  • The Free and Open Source paradigm is no longer a novelty in the geospatial world. There are numerous stable and powerful solutions, services, and libraries built and maintained in the spirit of FOSS. The movement has long overcome the status of a garage hobby, infiltrating all sectors. There are hundreds of relevant events around the world each year, from conferences to hackathons, workshops and seminars, as well as studies and reports analysing open source business models, funding schemes and explicit requests for projects that build software to make it open source. In this agile context, it is difficult, if not impossible, to keep up with the speedy developments, to make use of what is already out there, and to invest resources in improving rather than building from scratch.
    Big Data
    Conference
    Software
    Where
    HugoTECH

  • Platforms for the Exploitation of Earth Observation (EO) data have been developed by public and private companies in order to foster the usage of EO data and expand the market of Earth Observation-derived information. A fundamental principle of the platform operations concept is to move the EO data processing service’s user to the data and tools, as opposed to downloading, replicating, and exploiting data ‘at home’. In this scope, previous OGC Innovation Program research and development activities initiated the development of an architecture to allow the ad-hoc deployment and execution of applications close to the physical location of the source data with the goal to minimize data transfer between data repositories and application processes. We now see Best Practice emerging to package and deploy Earth Observation Applications in an Exploitation Platform. Even further, we see how the same principles actually work smoothly in microservice-based architectures if integrated with standardized RESTful Web APIs. This session describes the general architecture, demonstrates the Best Practices, and includes recommendations for the application design patterns, package encoding, container and data interfaces for data stage-in and stage-out strategies. The session further outlines how to interact with a respective system using Jupyter Notebooks and OGC Web APIs.
    Big Data
    Earth Observation
    Where
    HugoTECH

  • GISCO, the ‘Geographical Information System of the COmmission’, is a permanent service of Eurostat that fulfils the requirements of both Eurostat and the European Commission for geographic information and related services at European Union (EU), Member State and regional levels. These services are also provided to European citizens at large. GISCO’s goal is to promote and stimulate the use of geographic information within the European Statistical System and the European Commission.
    Earth Observation
    Land cover
    Where
    HugoTECH

  • Environmental Impact Assessments (EIA) aim at assessing a project’s impacts on the environment, including biodiversity, and proposing mitigation measures to tackle negative impacts. However, these assessments often underestimate the impacts on biodiversity due to a lack of continuous biological information covering the whole targeted area. Knowing where a species occurs and how biologically diverse and unique areas are distributed is key to successfully assessing the impacts of infrastructure projects on biodiversity and proposing adequate biodiversity management plans; however, collecting biological information is time-consuming and costly. A powerful way to overcome the lack of continuous biodiversity information over larger areas is to combine existing biological field-data with satellite images, such as Landsat. Satellite images provide continuous spectral information, which has already been used to model the distribution of species and predict biodiversity patterns over large areas. Therefore, the combination of available field-data, freely available satellite imagery and machine learning offers a cost-effective way to map continuous biodiversity patterns for EIAs. By combining biological data with remote sensing layers, it is possible to predict biodiversity patterns in areas far beyond the location of the field-data; nevertheless, such predictions are only reliable within the area of applicability. Even though this approach has been used in different regions, it has not been used for EIAs at local scales. Having maps of biodiversity patterns, such as beta-diversity, throughout a project’s study area allows: (i) properly quantifying a project footprint’s impact on biodiversity and (ii) identifying suitable areas for “like-for-like” compensation schemes.
    Earth Observation
    Land cover
    Where
    HugoTECH

  • The UN Open GIS Initiative is an ongoing Partnership Initiative led by the United Nations Geospatial Operations. The Initiative, established in March 2016, is supported by several UN Agencies and mission partners (Member States, technology contributing countries, international organizations, academia, NGOs, and the private sector) and takes full advantage of their expertise. The target is the creation of an extended open-source spatial data infrastructure that meets the requirements of the UN Secretariat (including UN field missions and regional commissions), and then expands to UN agencies, UN operating partners and developing countries. The keynote will explain the reasons for choosing openness, describe the various components of the initiative (Working Groups and Pilot Projects) and present the level of maturity reached. It will conclude with lessons learned and future developments.
    Big Data
    Earth Observation
    Where
    HUGOtech

  • The Open Source Geospatial Foundation (OSGeo) is a not-for-profit organization whose mission is to foster global adoption of open geospatial technology by being an inclusive software foundation devoted to an open philosophy and participatory, community-driven development.
    Big Data
    Software
    Where
    HUGOtech

  • The study of inhalable particulate matter with a diameter equal to or less than 2.5 micrometers (PM2.5) is gaining interest in the air quality community, since these small particles are considered one of the most harmful air pollutants to all living things. PM2.5 concentrations are usually measured by ground stations; however, full coverage cannot be achieved from ground stations alone because of their high cost. With the application of machine learning and deep learning algorithms, it has become common to estimate PM2.5 using multiple sources such as satellite retrievals of Aerosol Optical Depth (AOD) and other auxiliary data, such as meteorological data, land cover, land use, etc. Previous studies were limited by spatial resolution, small coverage, or gaps in the estimated PM2.5 maps related to missing satellite retrievals. To our knowledge, we are the first to produce high-resolution (1 km), full-coverage PM2.5 maps of the whole of Europe for the years 2018–2020 using open-source data. Results will be discussed during the presentation.
    Land cover
    Machine Learning
    Where
    HugoTECH

  • Exposure to fine particulate matter (PM2.5) is linked to adverse health outcomes. Usually, epidemiological studies rely on PM2.5 measurements collected from ground monitors. However, in many places such as Great Britain the existing monitoring network provides limited spatio-temporal coverage of PM2.5. Data from satellites, climate/atmospheric reanalysis models, and chemical transport models offer additional information that can be used to reconstruct PM2.5 concentrations, filling the gaps in the ground monitoring network. This study developed a multi-stage satellite-based machine learning (ML) model to estimate daily PM2.5 levels across Great Britain during 2008-2018. Stage-1 estimated PM2.5 concentrations at monitors with only PM10 records. Stage-2 imputed satellite aerosol-optical-depth values missing due to cloudiness and bad retrievals. Stage-3 applied the Random Forest algorithm to estimate PM2.5 concentrations using a combined dataset from Stage-1, Stage-2, and a list of spatiotemporally synchronised predictors. Stage-4 estimated daily PM2.5 using the Stage-3 model. The model performed well, with an overall mean R² of 0.77. The high spatio-temporal resolution and the relatively high precision allowed these estimates (approximately 950 million points) to be used in epidemiological analyses to assess health risks associated with both short- and long-term exposure to PM2.5.
    Earth Observation
    Machine Learning
    Where
    HugoTECH

  • Peatlands have been under severe pressure globally due to anthropogenic activities. In Ireland, approximately 90 % of the peatlands have been degraded to some extent due to these activities. Raised bogs account for approximately 35 % of the total peatland area and are present in high density in the midland lowland regions of the country, while the remaining ~65 % of the peatland area is occupied by blanket bog. To achieve the temperature goals agreed in the Paris Agreement and fulfil the EU’s commitment to quantifying the Carbon/Greenhouse Gas (C/GHG) emissions from land use, land-use change and forestry, accurate mapping and identification of management-related activities (land use) on peatlands are important.
    Climate data
    Land cover
    Where
    HugoTECH

  • This talk describes sits, an open-source R package for satellite image time series analysis using machine learning. It supports the complete cycle of data analysis for land classification. Its API provides a simple but powerful set of functions. The software works in different cloud computing environments, including AWS, MS-Azure, and Digital Earth Africa. In sits, satellite image time series are input to machine learning classifiers, and the results are post-processed using spatial smoothing. Since machine learning methods need accurate training data, sits includes methods for quality assessment of training samples. The software also provides methods for validation and accuracy measurement. The package thus comprises a production environment for big EO data analysis. The package is available at https://github.com/e-sensing/sits and the documentation at https://e-sensing.github.io/sitsbook/.
    Data Cubes
    Earth Observation
    Where
    HUGOtech

  • 3 course dinner at the WICC restaurant.
    Where
    Restaurant, WICC

  • Geoinformation science and remote sensing at Wageningen University.
    Big Data
    Land cover
    Software
    Where
    HUGOtech

  • We classified 33 land use / land cover (LULC) classes between 2000 and 2019 using a single spatiotemporal ensemble machine learning model in a fully automated, free and open source workflow. This workflow includes harmonization and preprocessing of several high-resolution publicly available covariate datasets and over five million training samples, spatial K-fold cross-validation, hyperparameter optimization, and multiple methods for LULC change analysis. We show how the per-class probability predictions (1) facilitate useful prediction uncertainty metrics, (2) inform use case-tailored post-processing strategies, and (3) enable a novel way to quantify LULC change dynamics without relying on hard-class predictions. We show that for this purpose, spatial models that are trained on data from a single year are consistently outperformed by a single spatiotemporal model that is trained on all data from all years, especially when generalizing to input data from years that are not included in the training dataset. We present a final land cover dataset with per-class probability and uncertainty metrics, as well as hard-class classifications with 62% cross-validation (CV) accuracy for 33 Corine Land Cover (CLC) level 3 classes, 70% accuracy for 14 level 2 CLC classes, and 87% accuracy for the 5 level 1 classes. Our results suggest that our method enables land cover classification for subsequent years without waiting for new training data, while facilitating improved training data collection through analysing variable importance, per-class performance, and uncertainty metrics.
    Earth Observation
    Land cover
    Where
    HugoTECH
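
How per-class probabilities can yield both a hard class and an uncertainty measure is sketched below. The two metrics shown (margin between the two most probable classes, and normalised Shannon entropy) are generic choices made for illustration; the abstract does not state which metrics the authors use, and the class names and probabilities are invented.

```python
import math


def hard_class_and_uncertainty(probs):
    """Derive a hard class and two uncertainty measures for one pixel.

    probs: dict mapping class name -> predicted probability (at least two
    classes, summing to about 1).
    Returns (best_class, margin, normalised_entropy).
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    (best, p1), (_, p2) = ranked[0], ranked[1]
    # Margin: gap between the two most probable classes (small = uncertain).
    margin = p1 - p2
    # Shannon entropy scaled to [0, 1] by its maximum, log(number of classes).
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return best, margin, entropy / math.log(len(probs))


# Invented probabilities for a single pixel.
pixel = {"forest": 0.5, "cropland": 0.25, "urban": 0.25}
best, margin, norm_entropy = hard_class_and_uncertainty(pixel)
print(best, margin, round(norm_entropy, 3))
```

Keeping the full probability vector rather than only the hard class is what makes change analysis on probabilities possible: a pixel can drift gradually from one class towards another before its hard label ever flips.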

  • The presentation will show results of modeling species distribution maps for both potential and actual natural vegetation through spatiotemporal machine learning using a data-driven, robust, objective and fully reproducible workflow. The presentation will focus on the benefits of using ensemble machine learning for species distribution modeling to capture patterns of niche changes in both space and time: yearly (from 2000 to 2020) probability distribution maps for both potential and actual natural vegetation will be presented for forest tree species that live in different climatic conditions across Europe. The high spatial (30 m) and temporal (1 year) resolution of the outputs should allow us to enhance and better understand the patterns of niche change.
    Land cover
    Machine Learning
    Vegetation mapping
    Where
    HUGOtech

  • Windstorms remain one of the major disturbance factors in European forest ecosystems. A deep understanding of all the major drivers behind observed changes in forests is essential to improve prediction models. This is important for various scientific disciplines, decision makers on different levels and forest management planners. New statistical learning techniques help with analyses of massive data objects and offer sophisticated explanatory tools that help to understand complex models. In the present study, we combined several data sets on tree features, bioclimatic and geomorphic conditions, and the level of forest damage in the Sudety Mountains over the period 2004-2010. We tested four scenarios under five classification model frameworks: logistic regression (binomial GLM), random forest (RF), support vector machines (SVM), neural networks (NN), and gradient boosted modelling (GBM). All models except GLM offer a similar level of predictive power (AUC ~ 0.7). GBM and RF feature the best predictive power: GBM reached AUC = 0.717, while the RF model reached AUC = 0.715. Tree volume and age are the most important predictors. Less important are climate and geomorphic variables. The same models performed less accurately for test data from the period 2004-2006. Forest damage probability maps based on forest data from 2020 show an overall lower level of damage probability compared to the end of the 20th and the beginning of the 21st century. To sum up, using only 11 variables based on open-source datasets, we were able to obtain predictive models of good accuracy.
    Climate data
    Earth Observation
    Vegetation mapping
    Where
    HUGOtech

  • The European Soil Data Centre (ESDAC) is producing several new data sets based on predictive soil mapping.
    Land cover
    Soil data
    Where
    HUGOtech

  • Europe is a dynamic continent and its landscape has changed over the last 20 years. Current developments in computing power and an increase in the efficiency of geospatial computing have made it possible to analyse the archive of Landsat images that date back to the year 2000 at 30 meter resolution for the entire continent of Europe. We developed methods to harmonize, analyse and visualize changes between land cover types, and to create overviews of NDVI trends and of trends in predicted probabilities over the years using machine learning approaches. In this talk I will discuss some specific examples of interpretations that these new approaches allow. We will be travelling from the forest harvesting in Sweden to the hydroelectric dams in Portugal, over mysterious changes in the Alps to the apparent reforestation of the Romanian forests. I invite everyone to join the trip through Europe and help us understand our changing surroundings at scale!
    Earth Observation
    Land cover
    Where
    HugoTECH
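
A per-pixel NDVI trend of the kind mentioned in the abstract can be illustrated with the standard NDVI formula and an ordinary least-squares slope over the years. The sketch below is a generic illustration with invented values; it is not the speakers' processing chain, and the function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Invented yearly NDVI values for one pixel that greens up steadily.
years = [2000, 2001, 2002, 2003, 2004]
values = [0.40, 0.42, 0.44, 0.46, 0.48]
slope = linear_trend(years, values)
print(f"NDVI of one observation: {ndvi(0.5, 0.1):.3f}")
print(f"NDVI trend: {slope:+.3f} per year")
```

Applied to every pixel of a yearly NDVI stack, the sign and size of the slope give a simple map of greening and browning.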

  • The rapidly increasing amount of publicly available remotely sensed data in recent decades has revolutionized large-scale research and context-informed decision making. However, these data are generally not freely available as homogenized products ready for analysis at continental (or larger) scales. This is widely observed with datasets generated by EO satellites, particularly those with optical sensors and those capable of high-resolution imaging, where the process of mosaicking imagery to produce a homogeneous, cloudless dataset across a particular area of interest often grows increasingly cumbersome at larger scales.
    Big Data
    Earth Observation
    Where
    HugoTECH

  • European State of the Climate report: this annual report on behalf of the European Commission provides an analysis of the monitoring for Europe for the past calendar year, with descriptions of climate conditions and events.
    Climate data
    Earth Observation
    Where
    Haakzaal, WICC

  • AgriCaptureCO2 aims at developing a systematic, robust and flexible platform for quantifying soil organic carbon (SOC) capture as well as verifying and promoting farmers’ application of regenerative agriculture (Reg Agri). The digital, web-based platform will provide four services, namely Quantification, Exploration, Support and Verification, that will run on EO data. The idea is that the services provide information that will support the certification of “carbon credits” generated as a result of Reg Agri and the verification that Reg Agri practices have been applied, as a prerequisite for any incentives for farmers.
    Earth Observation
    Land cover
    Soil data
    Where
    HugoTECH

  • The use of AI in agriculture is a trend all over the world; in Serbia it is especially important because agriculture is one of the crucial sectors of the Serbian economy. This project will be an important step forward in the application of a wide range of relevant data that are generated on a daily basis and offer a huge potential for improving agricultural production and developing the concept of smart and regenerative agriculture.
    Earth Observation
    Soil data
    Where
    HugoTECH

  • Gridded maps with information on the location of crops are essential to inform national food and agricultural policies and are an important input for land use change models. Despite rapid advancement in machine learning approaches to identify the location of crops, national and global level crop distribution maps that cover a large number of crops are not readily available yet, in particular for African countries. One of the most used sources of crop distribution information is the IFPRI Spatial Production Allocation Model (SPAM, www.mapspam.info), which presents global-level plausible spatial estimates of the location of 40 crops (groups) that represent total agricultural production. SPAM uses a cross-entropy optimization approach to allocate national and subnational crop statistics of four production systems (subsistence, low-input, high-input, irrigated), informed by spatial information on both biophysical (e.g. crop suitability) and socio-economic (e.g. accessibility) drivers of crop location. This paper presents MAPSPAM-C, an R package that implements the SPAM procedure to create crop distribution maps and facilitates the pre-processing steps to harmonize spatial input layers and post-processing steps to create harvested area, physical area, yield and production maps. The package can be used to reproduce and validate the new generation of SPAM products and will be useful for researchers who want to create their own maps using country-specific input data.
    Land cover
    Soil data
    Where
    HugoTECH

  • Cropping intensification is defined as cultivating a plot of land several times a year to obtain maximum profit. It is a strategy aimed at making equipment and hydro-agricultural facilities profitable, and it contributes to the country's food security. The Sidi Bennour district, the area of our study, is part of the hydro-agricultural investment programs. It is located in the west of Morocco and is characterized by a semi-arid climate and a very fragmented landscape with small plots. It covers a large part of the irrigated Doukkala scheme, where irrigation has been implemented progressively in time and space. Earth observation data have proven very useful for studying agricultural systems, especially given the technological progress of recent years. Thus, this study aims to measure the spatio-temporal evolution of cropping intensification through an analysis of vegetation cover dynamics using satellite images and the Google Earth Engine platform. We used the NDVI vegetation index time series from 1985 to 2020 for Landsat and from 2015 to 2020 for Sentinel-2. The study area is located in an overlap between two Landsat paths, allowing an image every seven days. NDVI time series profiles were analyzed for each cropping season (from September of year Y to August of the following year Y+1). As a first step, we assessed the horizontal expansion that has occurred in agricultural areas. Subsequently, some metrics were established using the NDVI time series to identify cropping systems (single, double, or multiple).
Preliminary results show that (1) satellite image time series can be very effective in measuring crop intensification; (2) the introduction of irrigation has been accompanied by a substantial intensification of agricultural activities, which is reflected in the presence of a dense vegetation cover during most of the year; (3) in the last decade, which is characterized by a shortage of irrigation water due to frequent drought years and by restrictions on allocations for crop irrigation during the year, there is a decrease in the presence of vegetation cover throughout the year and thus a decrease in crop intensification; (4) Sentinel-2 data were used to improve the density of the time series on the one hand, and to identify changes at the scale of small plots on the other.
    Earth Observation
    Land cover
    Where
    HugoTECH
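
One simple way to turn a seasonal NDVI profile into a cropping-system label is to count distinct NDVI peaks within the season. The abstract does not specify which metrics the study established, so the peak rule, the 0.5 threshold and the monthly values below are assumptions made purely for illustration.

```python
def count_crop_cycles(ndvi_series, min_peak=0.5):
    """Count local NDVI maxima at or above min_peak within one cropping season."""
    peaks = 0
    for prev, cur, nxt in zip(ndvi_series, ndvi_series[1:], ndvi_series[2:]):
        # A peak rises above its predecessor and does not fall below its successor;
        # a flat-topped plateau is therefore counted once.
        if cur >= min_peak and cur > prev and cur >= nxt:
            peaks += 1
    return peaks


def cropping_system(n_cycles):
    """Map a cycle count to the single/double/multiple labels."""
    labels = {0: "no crop detected", 1: "single", 2: "double"}
    return labels.get(n_cycles, "multiple")


# Invented monthly NDVI values for one plot, September to August.
single_season = [0.2, 0.25, 0.4, 0.6, 0.75, 0.6, 0.4, 0.3, 0.2, 0.2, 0.2, 0.2]
double_season = [0.2, 0.3, 0.5, 0.7, 0.6, 0.3, 0.2, 0.4, 0.65, 0.5, 0.3, 0.2]

for name, series in [("plot A", single_season), ("plot B", double_season)]:
    print(name, cropping_system(count_crop_cycles(series)))
```

With real data the series would usually be smoothed first, since residual cloud contamination produces spurious dips that can split one growth cycle into two apparent peaks.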

  • Citizen science can be used to supplement and enhance massive field data collections. One example is the LUCAS survey data and the FotoQuest Go Europe 2018 campaign.
    Earth Observation
    Land cover
    Where
    HUGOtech

  • EO Data Cube based in the South Pacific region (Fiji, Solomon Islands and Vanuatu) as part of the IPP CommonSensing project, funded by the UK Space Agency's International Partnership Programme (https://www.commonsensing.org.uk/)
    Data Cubes
    Earth Observation
    Where
    HUGOtech

  • A fleet of more than 50,000 cargo ships worldwide has an enormous demand for energy, resulting in considerable emissions. According to the 4th International Maritime Organization (IMO) global greenhouse gas (GHG) [study](https://www.imo.org/en/OurWork/Environment/Pages/Fourth-IMO-Greenhouse-Gas-Study-2020.aspx), maritime transport emitted around 1,056 million tonnes of CO2 in 2018 and was responsible for about 2.9% of the global anthropogenic CO2 emissions. While the emissions per tonne and nautical mile have been reduced by almost 30% in the last decade, the overall emissions of cargo ships increased by more than 10% (up to 30% in some models) due to the growing demand. The IMO has committed itself to reducing the amount of global pollutant emissions by 50% by 2050 compared to 2008.
    Climate data
    Earth Observation
    Where
    HugoTECH

  • A global compilation of monthly and annual time-series of images for the periods 1982-2018 and 2000-2020 (data cube) is described. The prepared time-series for 1982-2018 (global at 5-km resolution) comprise: TerraClimate (Abatzoglou et al., 2018), vegetation monthly NDVI 90% percentiles for the period 1982-2018 as a merge of the AVHRR daily and MODIS NDVI product, the Vegetation Continuous Fields (VCF5KYR) Version 1 dataset (Song et al., 2018), and the Hyde v3.2 land use annual time-series (Klein Goldewijk et al., 2017). For the period 2000-2020 (global at 1-km resolution), MODIS land products (NDVI, LST, snow cover) in combination with MODIS atmospheric products (water vapour, cloud fraction), global relief (MERIT DEM) and climate layers (CHELSA) are used. All layers have been resampled and gap-filled so they can be imported as an Analysis-Ready spatiotemporal array. For each pixel we also provide geometric temperatures (derived from latitude, day of the year and elevation) and, for many layers, uncertainty measures. These datastacks have been made available for research and development via our OpenLandMap.org data portal and Cloud-Optimized GeoTIFF S3 file service. Overlaying Earth System Science point datasets (https://gitlab.com/openlandmap/compiled-ess-point-data-sets) such as the global compilation of soil organic carbon demonstrates that the global data cubes can be used to build complex spatiotemporal 2D+T models, including 3D+T, and produce predictions of important variables representing our dynamic environment. The two important advantages of running machine learning on spatiotemporal data include: (1) the possibility to explain complex causal relationships between environmental dynamics of plants, ecosystem communities, and soil variables and dynamic climate and human influence, (2) the possibility to predict states beyond the time-span covered by training data, e.g. to predict future (as in scenario testing) and past states for which there are no training points.
    Big Data
    Earth Observation
    Machine Learning
    Where
    HugoTECH

  • Copernicus Land, Marine & Coastal, and Emergency themes, Horizon Europe Space, ESA’s Data Operations Scientific Advisory Group (DOSTAG)
    Earth Observation
    Land cover
    Where
    HUGOtech

  • The mlr3 package is geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core computational operations, add-on packages provide additional functionality.
    Machine Learning
    Software
    Where
    HUGOtech

  • Although forest fires are an integral part of the natural Earth system dynamics, they are becoming more devastating and less predictable as anthropogenic climate change exacerbates their impacts. We observe longer fire seasons, more extreme fire weather and more rain-free days globally, which are inducing significant variations in wildfire danger. Fire observations (either in situ or remotely sensed) are important to understand the physical phenomenon and estimate spatio-temporal trends and patterns, while forecasting fire danger allows first responders to act quickly and efficiently at the onset of a dangerous event. The European Centre for Medium-range Weather Forecasts (ECMWF) and Copernicus have created a wealth of datasets related to the forecasting of wildfire danger as well as the detection of wildfire events and related emissions in the atmosphere. These products contribute to the operational services provided by the Copernicus Emergency Management Service (CEMS) and the Copernicus Atmosphere Monitoring Service (CAMS) and consist of real-time forecasts as well as historical datasets based on the ECMWF reanalysis database ERA5. Data are open and available through the Copernicus Climate Data Store (CDS), the European Commission’s Global Wildfire Information System (GWIS) and the European Forest Fire Information System (EFFIS). In this talk I will present the complete wildfire-related data offering, tools for efficient data processing and visualisation, as well as results from recent research projects.
    Big Data
    Climate data
    Earth Observation
    Vegetation mapping
    Where
    HUGOtech


Our Trainers

Training sessions will be run on sample data. Participants are expected to work on their own laptop computers with the required software pre-installed.