Agri-environmental Semantic Segmentation of LUCAS landscape photos

This dataset is a collection of street-level images extracted from the Land Use/Cover Area Frame Survey (LUCAS) dataset provided by Eurostat. It is designed for semantic segmentation tasks that distinguish between land cover categories, including agricultural and natural landscapes. The dataset covers the 2018 survey year and includes north-looking images with both full masks and partial masks; in partial masks, certain areas are not delineated.

land-use and land-cover change

Product Details

Visibility: Public
Owner: JRC LUCAS: Land Use and Coverage Area frame Survey
Created: 8 Sep 2023
Last Updated: 21 Aug 2025

README

This dataset contains semantic segmentation delineations derived from street-level images, focusing on agricultural and natural landscapes. It has 35 distinct classes, including labels such as "field margin," "crop," "cropfield," and "ditch," and draws from the Land Use/Cover Area Frame Survey (LUCAS) geospatial dataset. LUCAS images are collected using a consistent sampling framework, offering a representative view of different regions and environments of Europe.

Comprising a total of 1784 north-looking images from 2018, this dataset contributes to land cover analysis by providing fine-grained annotations for a variety of landscape elements, and serves as a resource for training and evaluating semantic segmentation models.

The dataset's potential applications span a range of domains, from land use mapping and environmental monitoring to urban planning and agricultural management. By fostering the advancement of machine learning models in accurately segmenting landscapes, this dataset contributes to sustainable land management practices and supports informed decision-making processes.

Dataset Structure

We provide two data products derived from the same raw data, organised across three folders in this repository:

raw_data
ml_data
STAC

Raw data

The raw data is the original data. It is not easily usable in a machine learning context, but is kept as a reference. The original dataset is organised into batches, one per segmentation campaign, with each batch containing three main folders:

images: Contains the LUCAS north-looking images captured for each theoretical point.
full_masks: Contains pixel-level annotated masks corresponding to each image, where each pixel is labelled with a class.
partial_masks (only for the first batch): Contains partial masks where some areas of the images are not delineated.

The root of the raw_data folder contains a CSV file, classes_dataset.csv, with the code and label correspondence.
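As a minimal sketch of how the code and label correspondence might be loaded, the snippet below builds a lookup dictionary from a CSV file. The column names "code" and "label", the integer codes, and the example rows are assumptions for illustration only; the actual header of classes_dataset.csv is not documented here and should be checked first.

```python
import csv
import io


def load_class_lookup(csv_file, code_column, label_column):
    """Return a {code: label} dictionary from an open CSV file object.

    The column names are supplied by the caller because the header of
    classes_dataset.csv is not documented here; inspect the file first.
    Codes are assumed to be integers.
    """
    reader = csv.DictReader(csv_file)
    return {int(row[code_column]): row[label_column].strip() for row in reader}


# Stand-in for the real file, with HYPOTHETICAL column names and codes.
example = io.StringIO("code,label\n1,crop\n2,ditch\n")
lookup = load_class_lookup(example, code_column="code", label_column="label")
print(lookup[2])  # ditch
```

With the real file, the same function could be called on an object opened with open("raw_data/classes_dataset.csv", newline=""), once the actual column names are known.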

ML data

The ML data consolidates the batch data and enriches the original labelled data with geolocation information and ancillary data derived from the Harmonised LUCAS in-situ land cover and land use database. This metadata can provide the necessary context for machine learning exercises or exploratory analysis.

The data is structured in two folders:

images
masks

The root of the folder contains a metadata file called lucas_ml_data.csv with the ancillary data. It also contains the classes_dataset.csv file with the code and label correspondence.
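For training, each image needs to be matched with its mask. The sketch below pairs files from the two folders by file name stem. It assumes that an image and its mask share the same stem (for example A.jpg and A.png), which is not stated in this README and should be verified against the actual files; the file names used here are hypothetical.

```python
from pathlib import PurePosixPath


def pair_by_stem(image_names, mask_names):
    """Pair image and mask paths that share a file name stem.

    Returns (pairs, unmatched), where pairs is a list of (image, mask)
    tuples and unmatched lists the images for which no mask was found.
    ASSUMPTION: an image and its mask differ only in folder and extension.
    """
    masks = {PurePosixPath(name).stem: name for name in mask_names}
    pairs, unmatched = [], []
    for image in sorted(image_names):
        stem = PurePosixPath(image).stem
        if stem in masks:
            pairs.append((image, masks[stem]))
        else:
            unmatched.append(image)
    return pairs, unmatched


# HYPOTHETICAL file names, for illustration only.
pairs, unmatched = pair_by_stem(
    ["images/A.jpg", "images/B.jpg"],
    ["masks/A.png"],
)
print(pairs)      # [('images/A.jpg', 'masks/A.png')]
print(unmatched)  # ['images/B.jpg']
```

Reporting unmatched images separately makes it easier to spot a wrong naming assumption before any training starts.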

STAC

A SpatioTemporal Asset Catalog (STAC) allows dynamic use of the data without downloading all of it, which becomes more useful should the dataset grow. The STAC format allows for easy spatio-temporal subsetting, and the data can be browsed visually using the STAC browser.
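To illustrate what spatio-temporal subsetting means in practice, here is a minimal sketch that filters STAC items by bounding box and year using only the standard library. It relies on the "bbox" and "properties.datetime" fields defined by the STAC Item specification and assumes "datetime" is set on every item (the specification allows it to be null when a start and end time are given instead). The item ids, coordinates, and query box are hypothetical, and a dedicated STAC client would normally do this work.

```python
from datetime import datetime


def bbox_intersects(a, b):
    """True if two [xmin, ymin, xmax, ymax] boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def filter_items(items, bbox, year):
    """Return the ids of STAC items inside bbox and acquired in year."""
    selected = []
    for item in items:
        stamp = item["properties"]["datetime"]
        # Replace a trailing "Z" so older Python versions can parse it.
        when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if when.year == year and bbox_intersects(item["bbox"], bbox):
            selected.append(item["id"])
    return selected


# HYPOTHETICAL items; real ones would be read from the STAC folder.
items = [
    {"id": "point-a", "bbox": [4.3, 50.8, 4.3, 50.8],
     "properties": {"datetime": "2018-06-15T00:00:00Z"}},
    {"id": "point-b", "bbox": [25.0, 60.2, 25.0, 60.2],
     "properties": {"datetime": "2018-07-01T00:00:00Z"}},
]
print(filter_items(items, bbox=[2.5, 49.5, 6.4, 51.5], year=2018))  # ['point-a']
```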

Data Format

Image files are provided in JPEG format.
Masks are provided in PNG format, where each pixel corresponds to a specific class.
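Because each mask pixel carries a class, a common first check is the share of pixels per class in a mask. The sketch below computes this for a mask given as rows of integer values. It assumes that the pixel value is the class code from classes_dataset.csv, which this README does not state explicitly (the PNG could also use a palette or colour encoding), and the tiny mask shown is made up. Decoding the PNG into rows of values is left to an imaging library of your choice.

```python
from collections import Counter


def class_pixel_fractions(mask_rows):
    """Return {class_code: fraction_of_pixels} for a 2D mask.

    mask_rows is an iterable of rows, each row an iterable of integer
    pixel values. ASSUMPTION: pixel values are class codes.
    """
    counts = Counter(value for row in mask_rows for value in row)
    total = sum(counts.values())
    return {code: n / total for code, n in sorted(counts.items())}


# Made-up 2 x 4 mask with two HYPOTHETICAL class codes, 0 and 3.
mask = [
    [0, 0, 3, 3],
    [0, 3, 3, 3],
]
fractions = class_pixel_fractions(mask)
print(fractions)  # {0: 0.375, 3: 0.625}
```

Summing such fractions over all masks gives a quick view of class imbalance across the 35 classes.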

Data size

Each data folder is approximately 1 GB in size, with the STAC folder being larger.
Exact total size: 3.3 GB

Usage

This dataset can be used for various semantic segmentation tasks, including land cover analysis, environmental monitoring, and urban planning. The unique identifiers in the image names enable geospatial analysis by linking to the LUCAS harmonised database. The ML dataset already includes this link.

Citation

If you use this dataset in your work, please consider citing the following paper:

Andrimont, Raphaël d’, Momchil Yordanov, Laura Martinez-Sanchez, Beatrice Eiselt, Alessandra Palmieri, Paolo Dominici, Javier Gallego, et al. “Harmonised LUCAS In-Situ Land Cover and Use Database for Field Surveys from 2006 to 2018 in the European Union.” Scientific Data 7, no. 1 (December 2020): 352. https://doi.org/10.1038/s41597-020-00675-z

License

The LUCAS Semantic Segmentation Dataset is provided under the CDLA-Permissive-1.0 license.

Authors

Marijn Van der Velde
Laura Martinez-Sanchez
Raphaël d’Andrimont
Elizabeth Kearsley
Koen Hufkens

Release

Version: v1.0
Latest release: 13 September 2023
Previous release: 13 September 2023
Temporal coverage: 2018
Update frequency: 3-yearly for the underlying data
Spatial coverage (geographic area): European Union
Spatial coverage (bounding box): [xmin = -9.56, ymin = 34.7, xmax = 33.4, ymax = 65.8]