Global metadata catalog for the Major TOM 10 km grid. Over 5 million tiles enriched with terrain, climate, soil, socioeconomic, and administrative attributes. GeoParquet format.
The Major TOM Index is a global metadata catalog for the Major TOM grid at 10 km resolution. It provides a single entry point to discover, filter, and select tiles across sensors, locations, and time without downloading any imagery.
The index covers over 5 million tiles spanning the entire Earth. Each tile corresponds to a 1056 × 1056 px patch (10.56 × 10.56 km) aligned to Sentinel-2 MGRS tiles at 10 m resolution. Every tile is enriched with terrain, climate, soil, socioeconomic, and administrative attributes derived from public Earth Engine datasets.
What can you do with this index?
Filter tiles by attribute, for example climate:precipitation < 200 and terrain:elevation > 3000.
The land_s2 and land_l8 files include sensor-specific image IDs (s2:id_gee, l8:id_gee) that point directly to the source products in Google Earth Engine.
The elliot.parquet file provides pre-built monotemporal and temporal splits designed for multi-sensor, multi-temporal EO research.
All files are self-contained GeoParquet with ZSTD compression, sorted by majortom:code_1000km → majortom:code_100km → id for efficient spatial predicate pushdown.
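The attribute filter quoted above can be sketched in plain Python. This is only an illustration under stated assumptions: the rows are made-up stand-ins for index records, the helper `matches` is hypothetical, and the units of the two thresholds are not stated in this card. The filters are written as `(column, op, value)` tuples because that is the form `pyarrow.parquet.read_table` accepts through its `filters` argument.

```python
import operator

# Comparison operators for (column, op, value) filter tuples.
OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}

# Thresholds copied from the example expression in the text.
FILTERS = [
    ("climate:precipitation", "<", 200),
    ("terrain:elevation", ">", 3000),
]

def matches(row, filters):
    """Return True when a row (dict of column -> value) passes every filter."""
    return all(OPS[op](row[col], value) for col, op, value in filters)

# Made-up rows standing in for index records (values are illustrative only).
rows = [
    {"id": "a", "climate:precipitation": 120, "terrain:elevation": 4100},
    {"id": "b", "climate:precipitation": 900, "terrain:elevation": 3500},
    {"id": "c", "climate:precipitation": 150, "terrain:elevation": 800},
]

selected = [r["id"] for r in rows if matches(r, FILTERS)]
print(selected)  # ['a']
```

With pyarrow installed, the same `FILTERS` list could be passed as `pyarrow.parquet.read_table(path, filters=FILTERS)`, where `path` points to a local copy of one of the index files. Because the files are sorted, such a filter can let the reader skip row groups whose statistics rule out a match.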
Columns are organized into namespaces. Each namespace groups related attributes.
majortom: namespace. Tile identity and spatial reference within the Major TOM grid system.
stac: namespace. Spatial and temporal reference following STAC conventions. Present in land_s2 and land_l8 only, where it replaces the majortom: grid columns.
s2: namespace. Sensor metadata for the assigned Sentinel-2 image. Present in land_s2 only.
Note on solar vs viewing angles. The sun has a single position relative to the scene, so ESA provides one solar azimuth and one solar zenith averaged across all bands. Viewing angles are different: Sentinel-2 uses a pushbroom sensor where each spectral band has its own detector array in the focal plane, each observing from a slightly different angle. That is why GEE provides per-band viewing angles (MEAN_INCIDENCE_*_ANGLE_B1 through _B12). We use band B8 (NIR, 10 m) as the reference because it is at native 10 m resolution and sits near the center of the focal plane, making it a representative proxy for the viewing geometry of the 10 m and 20 m bands.
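To make the naming pattern in the note concrete, here is a minimal sketch that builds the Earth Engine property names for a given band. The helper `viewing_angle_properties` is hypothetical, and how these properties map onto the s2: columns of the index is not specified in this card.

```python
# Sentinel-2 band suffixes used in Earth Engine image property names.
S2_BANDS = ["B1", "B2", "B3", "B4", "B5", "B6", "B7",
            "B8", "B8A", "B9", "B10", "B11", "B12"]

# Solar geometry: one azimuth and one zenith per image.
SOLAR_PROPERTIES = {
    "azimuth": "MEAN_SOLAR_AZIMUTH_ANGLE",
    "zenith": "MEAN_SOLAR_ZENITH_ANGLE",
}

def viewing_angle_properties(band="B8"):
    """Return the per-band viewing angle property names (hypothetical helper)."""
    if band not in S2_BANDS:
        raise ValueError(f"unknown Sentinel-2 band: {band}")
    return {
        "azimuth": f"MEAN_INCIDENCE_AZIMUTH_ANGLE_{band}",
        "zenith": f"MEAN_INCIDENCE_ZENITH_ANGLE_{band}",
    }

# B8 is the reference band chosen in the note above.
print(viewing_angle_properties("B8"))
```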
l8: namespace. Sensor metadata for the assigned Landsat image. Present in land_l8 only.
terrain: namespace.
climate: namespace.
soil: namespace. Surface-layer soil properties from the OpenLandMap dataset, derived from machine learning predictions on global soil survey data at 250 m resolution.
socio: namespace.
admin: namespace. Human-readable administrative boundary names resolved from rasterized boundary datasets.
The elliot.parquet file contains 279,166 tiles selected for the ELLIOT project multi-temporal dataset extension. Tile locations were sampled using hierarchical spherical k-means (530 × 528 = 279,840 clusters) over AlphaEarth Foundation embeddings to ensure global environmental diversity.
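The sampling idea can be illustrated with a small, pure-Python sketch of two-level spherical k-means. It is not the project's implementation. It assumes that the 530 × 528 figure means 530 coarse clusters each split into 528 sub-clusters, which is one reading of the text, and all function names are hypothetical. Spherical k-means differs from ordinary k-means in that vectors and centroids are kept at unit length and similarity is measured by cosine.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length so it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def assign(points, centroids):
    """Label each point with the centroid of highest cosine similarity."""
    k = len(centroids)
    return [max(range(k), key=lambda c: dot(p, centroids[c])) for p in points]

def spherical_kmeans(points, k, iters=20, seed=0):
    """Cluster vectors by cosine similarity; returns (centroids, labels)."""
    rng = random.Random(seed)
    points = [normalize(p) for p in points]
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return centroids, assign(points, centroids)

def hierarchical_spherical_kmeans(points, k_top, k_sub, seed=0):
    """Two levels: k_top coarse clusters, each split into up to k_sub leaves."""
    _, top_labels = spherical_kmeans(points, k_top, seed=seed)
    leaves = []
    for c in range(k_top):
        members = [p for p, lab in zip(points, top_labels) if lab == c]
        if members:
            sub_centroids, _ = spherical_kmeans(
                members, min(k_sub, len(members)), seed=seed)
            leaves.extend(sub_centroids)
    return leaves

# Cluster count quoted in the text.
assert 530 * 528 == 279_840

# Toy 2-D "embeddings" on a quarter circle, clustered 2 x 2.
toy = [[math.cos(math.radians(a)), math.sin(math.radians(a))]
       for a in range(0, 90, 3)]
leaves = hierarchical_spherical_kmeans(toy, k_top=2, k_sub=2)
```

Note that the file holds 279,166 tiles, slightly fewer than the 279,840 clusters; the card does not explain the difference.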
The split column defines two subsets:
Monotemporal (250,000 tiles). One cloud-free image per sensor per location. Designed for tasks where spatial coverage matters more than temporal depth: land cover classification, feature extraction, or pretraining foundation models on diverse global scenes.
Temporal (29,166 tiles). Multiple observations per location across time. Designed for tasks that require temporal context: change detection, phenology tracking, seasonal compositing, or training models that learn from multi-temporal sequences. This subset is further divided into monthly cadence (12,500 tiles × 12 timesteps) and five-daily cadence (16,666 tiles × 6 timesteps).
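The split sizes above can be checked with a few lines of arithmetic. The counts below are tile-timestep pairs; whether each timestep yields one image per sensor is not stated here, so the numbers should not be read as image counts.

```python
# Tile counts quoted in the text.
monotemporal_tiles = 250_000
temporal = {
    "monthly": (12_500, 12),     # (tiles, timesteps)
    "five_daily": (16_666, 6),
}

temporal_tiles = sum(tiles for tiles, _ in temporal.values())
total_tiles = monotemporal_tiles + temporal_tiles

# Tile-timestep pairs per cadence.
timesteps = {name: tiles * steps for name, (tiles, steps) in temporal.items()}

print(temporal_tiles)  # 29166
print(total_tiles)     # 279166
print(timesteps)       # {'monthly': 150000, 'five_daily': 99996}
```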
License: CC-BY-4.0
The Major TOM Index has been made possible thanks to Asterisk Labs, the ELLIOT project (European Commission, Horizon Europe, Grant 101214398), and the Image and Signal Processing Group (ISP) at Universitat de València.