High-resolution Sentinel-2 mosaics and AI-derived field-boundary probability maps for Japan, Mexico, Rwanda, South Africa, and Switzerland. This repository contains cloud-optimized Zarr datasets produced on the WherobotsAI platform during the FTW Phase 2 Model Bakeoff. It includes bi-temporal mosaics (planting and harvest seasons) and raw softmax model outputs for agricultural field segmentation.
This dataset contains the raster data generated on the WherobotsAI platform while evaluating models from the FTW Phase 2 Model Bakeoff. It includes both the input imagery (seasonally optimized mosaics) and the raw model outputs (softmax predictions), hosted in cloud-optimized Zarr format.
Read the Blog: Wherobots and Taylor Geospatial Engine Bring Fields-of-the-World Models to Production Scale
The model architecture and training pipeline are fully open-source. You can run inference, reproduce our baselines, or train your own models using the Fields of The World (FTW) repository and the FTW CLI tool.
Installation:
Documentation & Source Code: For tutorials and usage instructions, visit the ftw-baselines repository.
The dataset covers five agricultural systems across two seasons and two years (2023 and 2024), totaling 4.76 million km². It is the inference output of models produced during the FTW Phase 2 Model Bakeoff.
The Bakeoff models include U-Net architectures with EfficientNet encoders, optimized for robust deployment at scale.
The high-throughput inference pipeline runs on the WherobotsAI platform and prioritizes spatial consistency and cloud-native efficiency.
The data is hosted as cloud-optimized Zarr stores. Each store corresponds to a country and contains high-resolution raster arrays with time coordinates for 2023 and 2024:
Sentinel-2 Mosaics:
Model Predictions:
The dataset uses the cloud-native Zarr format, so Python libraries such as xarray and zarr can read just the subset needed for an analysis without downloading the entire dataset.
Define a sample Area of Interest (AOI) over Japan:
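A minimal sketch of such an AOI follows. The `BoundingBox` class and the coordinates are illustrative assumptions, not values from the original example: the box is a small window over the Tokachi plain in Hokkaido, given in longitude/latitude degrees. If the stores use a projected CRS, the bounds would need to be expressed in that CRS instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned AOI in longitude/latitude degrees (EPSG:4326 is assumed)."""

    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def __post_init__(self):
        # Reject empty or inverted boxes early, before they reach a data query.
        if not (self.min_lon < self.max_lon and self.min_lat < self.max_lat):
            raise ValueError("bounding box must have min < max on both axes")

    def as_tuple(self):
        """Return (min_lon, min_lat, max_lon, max_lat)."""
        return (self.min_lon, self.min_lat, self.max_lon, self.max_lat)


# Hypothetical AOI: farmland near Obihiro on the Tokachi plain, Hokkaido.
JAPAN_AOI = BoundingBox(min_lon=143.10, min_lat=42.85, max_lon=143.30, max_lat=43.00)
```

Keeping the AOI small (a fraction of a degree on each side) keeps the amount of data fetched from the store modest while exploring.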
Lazily load data within the AOI for a specific year and visualize inputs alongside model outputs.
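The loading and plotting steps might look like the sketch below. Everything specific to the store layout is an assumption here: the coordinate names `x`, `y`, and `time`, a datetime-like `time` coordinate, a north-up raster with descending `y`, and the `store_url` argument, which stands in for the real store location. The function names are hypothetical helpers, not part of the FTW CLI.

```python
def spatial_indexers(bbox, y_descending=True):
    """Build .sel() slices for an AOI given as (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = bbox
    if min_x >= max_x or min_y >= max_y:
        raise ValueError("bbox must be (min_x, min_y, max_x, max_y)")
    # North-up rasters usually store y from top to bottom,
    # so the label-based slice has to run from max to min.
    y = slice(max_y, min_y) if y_descending else slice(min_y, max_y)
    return {"x": slice(min_x, max_x), "y": y}


def load_aoi(store_url, bbox, year):
    """Open a Zarr store lazily and return the AOI subset for one year."""
    import xarray as xr  # imported here so the helper above has no dependencies

    # Opening the store reads metadata only; pixel chunks are fetched on demand.
    ds = xr.open_zarr(store_url)
    # Assumes a datetime-like `time` coordinate; pass an integer if it is not.
    return ds.sel(time=str(year)).sel(**spatial_indexers(bbox))


def plot_input_and_prediction(rgb, prob):
    """Show an RGB composite next to a probability map.

    `rgb` is an (H, W, 3) float array scaled to [0, 1]; `prob` is (H, W).
    """
    import matplotlib.pyplot as plt

    fig, (ax_in, ax_out) = plt.subplots(1, 2, figsize=(10, 5))
    ax_in.imshow(rgb)
    ax_in.set_title("Sentinel-2 mosaic")
    ax_out.imshow(prob, cmap="viridis", vmin=0.0, vmax=1.0)
    ax_out.set_title("Field-boundary probability")
    return fig
```

Because the subset stays lazy until values are requested, it is worth inspecting the returned dataset (variable names, dimensions, chunking) before calling anything that loads pixels.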
The dataset includes both 2023 and 2024 data. Perform basic change detection to identify where field boundaries shifted or disappeared.
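One simple way to do this is to threshold the boundary probability for each year and compare the two masks. The sketch below assumes two co-registered 2D arrays of boundary probabilities; the 0.5 threshold and the function name `boundary_change` are illustrative choices, not values from the original analysis.

```python
import numpy as np


def boundary_change(prob_before, prob_after, threshold=0.5):
    """Classify per-pixel boundary change between two probability maps.

    Returns an int8 array: 1 = boundary appeared, -1 = boundary disappeared,
    0 = no change (boundary in both years or in neither).
    """
    before = np.asarray(prob_before) >= threshold
    after = np.asarray(prob_after) >= threshold
    if before.shape != after.shape:
        raise ValueError("probability maps must have the same shape")

    change = np.zeros(before.shape, dtype=np.int8)
    change[~before & after] = 1   # boundary present only in the later year
    change[before & ~after] = -1  # boundary present only in the earlier year
    return change


# Tiny synthetic example (made-up probabilities, not real model output).
prob_2023 = np.array([[0.9, 0.1], [0.2, 0.8]])
prob_2024 = np.array([[0.1, 0.1], [0.7, 0.9]])
change = boundary_change(prob_2023, prob_2024)
```

Note that `np.asarray` loads the data into memory, so when working with lazily loaded arrays it is best applied to an AOI-sized subset rather than a whole country. Thresholded differences are also sensitive to small misalignments and seasonal noise, so the result is a starting point for inspection rather than a final change map.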
If you use this data in your research, please cite:
The model checkpoints used to produce these predictions were trained on the full FTW dataset, which includes CC-BY-NC labels, so the predictions are licensed CC-BY-NC. We will release predictions from models trained only on the CC-BY licensed labels in the next few weeks.
License: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)