This repository hosts the AI-ready datasets for the paper "Global 3D Reconstruction of Clouds & Tropical Cyclones." It contains paired 2D geostationary imagery (GOES, MSG, Himawari) and 3D vertical profiles (CloudSat/CALIPSO) used to train ML models for 3D cloud reconstruction, with a dedicated dataset for tropical cyclones.
This is the GOES (Geostationary Operational Environmental Satellite) imagery subset of the Global 3D Cloud Reconstruction Dataset. It contains multispectral geostationary imagery from GOES-16/ABI for 3D cloud structure reconstruction. Each sample has 20 bands: 16 spectral channels plus satellite and solar angles. Samples are 512x512-pixel patches stored in Cloud-Optimized GeoTIFF format.
Version: 0.1.0
License: CC-BY-4.0
Keywords: cloud microphysics, 3d reconstruction, geostationary satellites, GOES-16, remote sensing, tropical cyclones, deep learning
Tasks: regression, foundation-model
Partitions: 105 files
Spatial coverage: [-120.94, -41.13, -1.41, 55.50] (WGS84)
Temporal coverage: 2018-01-02 to 2024-12-31
Root: FILE (91,423 samples)
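As a quick sanity check before downloading, the spatial and temporal coverage listed above can be tested against a point and date of interest. This is a minimal sketch, not part of the dataset tooling: the function names are hypothetical, and the bounding box order is assumed to be [min_lon, min_lat, max_lon, max_lat], which the metadata does not state explicitly.

```python
from datetime import date

# Values copied from the coverage metadata above.
# ASSUMPTION: bbox order is [min_lon, min_lat, max_lon, max_lat] in WGS84 degrees.
BBOX = (-120.94, -41.13, -1.41, 55.50)
T_START, T_END = date(2018, 1, 2), date(2024, 12, 31)


def in_bbox(lon, lat, bbox=BBOX):
    """Return True if a WGS84 point lies inside the bounding box (edges included)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def in_time_range(day, start=T_START, end=T_END):
    """Return True if a date lies inside the temporal coverage (ends included)."""
    return start <= day <= end


print(in_bbox(-80.2, 25.8))               # Miami: True
print(in_bbox(139.7, 35.7))               # Tokyo: False
print(in_time_range(date(2017, 9, 10)))   # before coverage starts: False
```

A point inside the box is not guaranteed to have samples; the box only bounds where samples can occur.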
NOAA — producer
https://www.noaa.gov
European Space Agency (ESA) — licensor
https://www.esa.int
source.coop — host
https://source.coop
If you use this dataset in your research, please cite:
DOI: 10.48550/arXiv.2511.04773
Ermis, S., Aybar, C., Freischem, L., Girtsou, S., Bintsi, K.-M., Diaz Salas-Porras, E., Eisinger, M., Jones, W., Jungbluth, A., & Tremblay, B. (2025). Global 3D Reconstruction of Clouds & Tropical Cyclones. Tackling Climate Change with Machine Learning Workshop at NeurIPS 2025.
Primary publication describing the dataset and methodology
Generated with ❤️ using TacoToolbox v0.23.0
Field | Type | Description |
--- | --- | --- |
stac:time_middle | timestamp[us] | Midpoint between start and end timestamps (microseconds since Unix epoch, UTC) |
geotiff:stats | list<item: list<item: float>> | Per-band statistics (List[List[Float32]]): categorical mode returns class probabilities, continuous mode returns [min, max, mean, std, valid%, p25, p50, p75, p95] |
cloud3d:satellite | string | Geostationary satellite platform (GOES, HIMAWARI, or MSG) |
cloud3d:cyclone | bool | Whether the sample contains tropical cyclone imagery |
majortom:code | string | MajorTOM spherical grid cell identifier (e.g., 0100km_0003U_0005R); the leading field gives the approximate grid spacing in km |
geoenrich:elevation | float | Mean elevation in meters (GLO-30 DEM) |
geoenrich:precipitation | float | Mean annual precipitation in mm estimated from GPM data |
geoenrich:temperature | float | Mean annual temperature in °C estimated from MODIS LST data |
geoenrich:admin_countries | string | Country name at centroid location |
internal:current_id | int64 | Current sample position at this level (0-indexed). Enables O(1) random access and relational JOINs (ZIP, FOLDER, TACOCAT). |
internal:parent_id | int64 | Foreign key referencing parent sample position in previous level (ZIP, FOLDER, TACOCAT). |
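
Several of the fields above are encoded compactly, so a small decoding layer can make them easier to work with. The sketch below uses only the Python standard library and covers three fields: `stac:time_middle`, `geotiff:stats` (continuous mode), and `majortom:code`. All function names are hypothetical. The MajorTOM pattern is an assumption inferred from the single example code in the table (spacing in km, then a row index with U/D, then a column index with L/R). Depending on the reader you use, `stac:time_middle` may already arrive as a datetime rather than a raw integer.

```python
import re
from datetime import datetime, timedelta, timezone

# Order of continuous-mode statistics, as listed in the geotiff:stats description.
STAT_NAMES = ("min", "max", "mean", "std", "valid_pct", "p25", "p50", "p75", "p95")

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# ASSUMPTION: pattern inferred from the example "0100km_0003U_0005R".
MAJORTOM_RE = re.compile(r"^(\d+)km_(\d+)([UD])_(\d+)([LR])$")


def time_middle_to_datetime(micros):
    """Convert microseconds since the Unix epoch into a timezone-aware UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


def label_band_stats(stats):
    """Turn per-band continuous statistics (list of 9-value lists) into dicts."""
    labelled = []
    for index, band in enumerate(stats):
        if len(band) != len(STAT_NAMES):
            raise ValueError(f"band {index}: expected {len(STAT_NAMES)} values, got {len(band)}")
        labelled.append(dict(zip(STAT_NAMES, band)))
    return labelled


def parse_majortom_code(code):
    """Split a MajorTOM grid code into spacing, row, and column parts."""
    match = MAJORTOM_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognised MajorTOM code: {code!r}")
    spacing, row, row_dir, col, col_dir = match.groups()
    return {
        "spacing_km": int(spacing),
        "row": int(row),
        "row_dir": row_dir,
        "col": int(col),
        "col_dir": col_dir,
    }


# Illustrative values only; they are not taken from the dataset.
print(time_middle_to_datetime(1_514_894_400_000_000).isoformat())  # 2018-01-02T12:00:00+00:00
print(label_band_stats([[0.0, 1.0, 0.5, 0.1, 99.0, 0.2, 0.5, 0.7, 0.9]])[0]["p50"])  # 0.5
print(parse_majortom_code("0100km_0003U_0005R")["spacing_km"])  # 100
```

The length check in `label_band_stats` is deliberate: categorical-mode entries hold class probabilities instead of the nine statistics, so they should fail loudly rather than be mislabelled.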