Major TOM Elliot-Pretrain
A unified multi-modal Earth Observation pre-training dataset combining Sentinel-2, Landsat 8/9, Copernicus DEM, and ESA WorldCover on a global 10 km grid. 250,000 tiles, TACO v3 format.
Major TOM Elliot-Pretrain is the first expansion of Major TOM, focused on fast pre-training of multi-modal AI models on Major TOM data.
It is built on top of the original Major TOM global 10 km grid.
Files come in 10.56 km footprints with harmonised resolution for faster access.
It combines four modalities into a single collection in which every tile contains co-registered optical, thermal, elevation, and land cover data.
The dataset is designed to grow incrementally: it starts with 250,000 monotemporal tiles, and time-series data will be added in the near future. It follows the TACO v3 specification, a format for organizing AI-ready Earth Observation datasets.
| Modality | Source | Resolution | Bands |
| --- | --- | --- | --- |
| Sentinel-2 L1C | ESA Copernicus | 10 m | 13 spectral |
| Landsat 8/9 OLI-TIRS | USGS | 30 m | 11 (9 OLI + 2 TIRS) |
| Copernicus DEM GLO-30 | ESA / TanDEM-X | 30 m | 1 (elevation) |
| ESA WorldCover | ESA | 10 m | 1 (land cover class) |
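To make the footprint and resolution figures concrete, the sketch below works out how many pixels a 10.56 km footprint would span at each listed resolution. It assumes square footprints of exactly 10,560 m per side and that each modality is stored at the resolution given in the table; the page does not state the actual array shapes, so treat the numbers as an illustration rather than a specification. The names `FOOTPRINT_M`, `RESOLUTION_M`, `pixels_per_side`, and `TILE_SIDE_PX` are hypothetical.

```python
# Footprint side length stated in the text, in metres (assumed square).
FOOTPRINT_M = 10_560

# Ground sampling distance per modality, taken from the table above.
RESOLUTION_M = {
    "Sentinel-2 L1C": 10,
    "Landsat 8/9 OLI-TIRS": 30,
    "Copernicus DEM GLO-30": 30,
    "ESA WorldCover": 10,
}


def pixels_per_side(footprint_m: int, resolution_m: int) -> int:
    """Return the number of pixels along one side of a square footprint."""
    if footprint_m % resolution_m != 0:
        raise ValueError("footprint is not a whole number of pixels")
    return footprint_m // resolution_m


# Side length in pixels for every modality (hypothetical helper output).
TILE_SIDE_PX = {
    name: pixels_per_side(FOOTPRINT_M, res) for name, res in RESOLUTION_M.items()
}

for name, side in TILE_SIDE_PX.items():
    print(f"{name}: {side} x {side} px")
```

Under these assumptions the 10 m layers come out at 1056 x 1056 pixels and the 30 m layers at 352 x 352 pixels, so each 30 m pixel would cover exactly a 3 x 3 block of 10 m pixels.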
Quick Start
Pick a tile index and visualize all four modalities:
import numpy as np
import rasterio
import matplotlib.pyplot as plt

BASE = "https://data.source.coop/major-tom/core"
IDX = 42  # <- change this to explore different tiles

MODS = ["s2", "l8", "dem", "wc"]
fig, axes = plt.subplots(
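The Quick Start snippet imports rasterio and matplotlib, but it is cut off before the plotting step. As a rough sketch of one piece that step would likely need, the helper below converts a multi-band array in rasterio's `(bands, rows, cols)` layout into an RGB image that `plt.imshow` can display, using a per-channel percentile stretch. The function name `to_display_rgb` is hypothetical, the band indices `(3, 2, 1)` assume the standard Sentinel-2 L1C order (B01, B02, B03, B04, ...), and the input here is random data, not a real tile.

```python
import numpy as np


def to_display_rgb(bands, rgb_idx=(3, 2, 1), lo=2.0, hi=98.0):
    """Convert a (bands, rows, cols) array to a (rows, cols, 3) image in [0, 1].

    rgb_idx selects the red, green, and blue bands; the default is an
    assumption about band order, not something stated by the dataset page.
    """
    rgb = bands[list(rgb_idx)].astype(np.float32)
    out = np.empty_like(rgb)
    for i, channel in enumerate(rgb):
        # Stretch each channel between its lo-th and hi-th percentiles.
        p_lo, p_hi = np.percentile(channel, [lo, hi])
        scale = max(p_hi - p_lo, 1e-6)  # guard against constant channels
        out[i] = np.clip((channel - p_lo) / scale, 0.0, 1.0)
    # matplotlib expects the channel axis last.
    return np.moveaxis(out, 0, -1)


# Synthetic stand-in for a 13-band Sentinel-2 tile (random values only).
rng = np.random.default_rng(0)
fake_tile = rng.integers(0, 10_000, size=(13, 64, 64), dtype=np.uint16)
image = to_display_rgb(fake_tile)
print(image.shape, image.dtype)
```

With a real tile, the array returned by rasterio's `dataset.read()` could be passed in place of `fake_tile`, and the result handed to `axes[i].imshow(image)`. Single-band layers such as the DEM or the land cover classes would need a different treatment, for example a colormap.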
Reproducible Example
A complete notebook with metadata queries, filtering, and a streaming PyTorch DataLoader with parallel fetching is available here: