This repository contains the GlobalBuildingAtlas (GBA) Level of Detail 1 (LoD1) building footprints dataset, converted from the original 922 GeoJSON files into Parquet format for improved performance and accessibility.
Dataset Overview
The GlobalBuildingAtlas LoD1 dataset contains building footprints for approximately 2.75 billion buildings worldwide. Its authors describe it as the first open dataset to offer high-quality, consistent, and complete building data in 2D and 3D at the individual building level on a global scale, and it covers nearly all inhabited areas on Earth.
The dataset combines multiple existing building footprint datasets with new footprints generated using open-source deep learning models applied to Planet Labs satellite imagery.
The original release consists of 922 GeoJSON files totaling 1.1 TB. For details on the conversion process and a technical analysis, see Mark Litwintschik's blog post.
Data Format
The dataset is organized as Parquet files, each containing building footprint data with the following key attributes:
Building footprints: Vector polygons representing building outlines
Building heights: Height information for each building
Geographic coordinates: Precise location data
Metadata: Additional building characteristics
Parquet Structure
Each Parquet file contains:
Building geometry (polygons)
Height attributes
Spatial indexing for efficient querying
Optimized columnar storage for fast analytics
Data Sources and Methodology
The dataset combines multiple building footprint sources:
Google Open Buildings (2023): 1.62 billion buildings
OpenStreetMap (2025): 490 million buildings
Microsoft Building Footprints (2024): 432 million buildings
TUM Deep Learning (ours2): 135 million buildings from Planet Labs satellite imagery
3D Global Footprints: 68 million buildings from previous research
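As a quick sanity check, the per-source counts above can be summed and compared with the stated total of approximately 2.75 billion buildings. The sketch below uses only the figures listed in this section; the variable and function names are illustrative and not part of the dataset.

```python
# Building counts per source, in millions, copied from the list above.
SOURCE_COUNTS_MILLIONS = {
    "Google Open Buildings (2023)": 1620,
    "OpenStreetMap (2025)": 490,
    "Microsoft Building Footprints (2024)": 432,
    "TUM Deep Learning (ours2)": 135,
    "3D Global Footprints": 68,
}


def source_shares(counts):
    """Return each source's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


total_millions = sum(SOURCE_COUNTS_MILLIONS.values())
shares = source_shares(SOURCE_COUNTS_MILLIONS)

print(f"Total: {total_millions / 1000:.3f} billion buildings")
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The listed counts add up to 2.745 billion, which is consistent with the rounded figure of 2.75 billion. Google Open Buildings accounts for roughly 59% of all footprints, and the TUM-generated footprints for about 5%.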
The TUM deep learning component uses open-source computer vision models applied to Planet Labs satellite imagery. Planet Labs operates several constellations totaling hundreds of satellites in low Earth orbit, which together image Earth's entire landmass daily.
Data Limitations
Temporal Accuracy: No timestamps for AI-generated footprints
Data Conflicts: Some footprints may conflict with newer OpenStreetMap data
Construction Areas: Recent construction may not be captured
Validation: Manual verification recommended for critical applications
Usage Recommendations
Map Making: Use as base layer with selective integration of newer OSM data
Construction Areas: Prioritize latest OSM data for areas with recent development
Research: Suitable for large-scale analysis and modeling
Spatial Indexing: H3 and other spatial indices included
Total Size: 210 GB (compressed from 1.1 TB original)
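To put the size figures in perspective, the following back-of-the-envelope calculation derives the compression ratio and the average storage cost per building. It assumes decimal units (1 TB = 1000 GB) and the rounded building count of 2.75 billion; both are assumptions of this sketch, not statements from the dataset authors.

```python
# Figures quoted in this README; decimal units are an assumption.
ORIGINAL_GB = 1100          # 1.1 TB of GeoJSON, taking 1 TB = 1000 GB
PARQUET_GB = 210            # total size of the Parquet version
BUILDING_COUNT = 2.75e9     # approximate number of buildings

# How many times smaller the Parquet version is than the GeoJSON original.
compression_ratio = ORIGINAL_GB / PARQUET_GB

# Fraction of the original size that was saved, as a percentage.
saved_pct = 100 * (1 - PARQUET_GB / ORIGINAL_GB)

# Average bytes per building in each format (1 GB = 1e9 bytes).
geojson_bytes_per_building = ORIGINAL_GB * 1e9 / BUILDING_COUNT
parquet_bytes_per_building = PARQUET_GB * 1e9 / BUILDING_COUNT

print(f"Compression ratio: {compression_ratio:.1f}x")
print(f"Space saved: {saved_pct:.0f}%")
print(f"GeoJSON: {geojson_bytes_per_building:.0f} bytes per building")
print(f"Parquet: {parquet_bytes_per_building:.0f} bytes per building")
```

Under these assumptions the Parquet version is roughly 5.2 times smaller than the GeoJSON original, at about 76 bytes per building compared with about 400.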
Data Quality Notes
The dataset combines multiple data sources as detailed above. The TUM deep learning component (ours2) represents new building footprints generated from Planet Labs satellite imagery, while the other sources provide existing building data from various organizations.
For areas with significant recent construction, consider using the latest OpenStreetMap data or Overture's building dataset, which prioritizes recent OSM updates.
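One way to apply the recommendation above is to treat GBA as the base layer and let newer footprints replace any base footprint they overlap. The sketch below illustrates that rule with axis-aligned bounding boxes only. The `Footprint` type, the function names, and the sample coordinates are all hypothetical; a real workflow would compare actual polygons and use a spatial index rather than a nested loop.

```python
from typing import List, NamedTuple


class Footprint(NamedTuple):
    """Hypothetical simplified footprint: a source label plus a bounding box."""
    source: str
    west: float
    south: float
    east: float
    north: float


def boxes_overlap(a: Footprint, b: Footprint) -> bool:
    """True if the two bounding boxes share interior area."""
    return (a.west < b.east and b.west < a.east
            and a.south < b.north and b.south < a.north)


def merge_preferring_newer(base: List[Footprint],
                           newer: List[Footprint]) -> List[Footprint]:
    """Keep base footprints that overlap no newer footprint, then add all newer ones.

    This is O(len(base) * len(newer)); it is only meant to show the rule.
    """
    kept = [f for f in base if not any(boxes_overlap(f, n) for n in newer)]
    return kept + list(newer)


# Made-up sample data: one GBA footprint conflicts with a newer OSM footprint.
gba = [Footprint("gba", 0.0, 0.0, 1.0, 1.0),
       Footprint("gba", 5.0, 5.0, 6.0, 6.0)]
osm = [Footprint("osm", 0.5, 0.5, 1.5, 1.5)]

merged = merge_preferring_newer(gba, osm)
for f in merged:
    print(f)
```

In this sample the first GBA footprint is dropped because it overlaps the newer OSM footprint, while the second GBA footprint is kept unchanged.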
Citation
If you use this Parquet version of the dataset hosted on Source Cooperative, please cite both the original dataset and this hosted version:
Original Dataset:
@misc{1782307,
  author = {Zhu, Xiao Xiang and Chen, Sining and Zhang, Fahong and Shi, Yilei and Wang, Yuanyuan},
  title = {{GlobalBuildingAtlas: An Open Global and Complete Dataset of Building Polygons, Heights and LoD1 3D Models}},
  type = {Dataset},
  year = {2025},
  type = {Forschungsdaten},
  abstract = {GlobalBuildingAtlas is a dataset providing global and complete coverage of building polygons (GBA.Polygon), heights (GBA.Height) and Level of Detail 1 (LoD1) 3D building models (GBA.LoD1). It is the first open dataset to offer high quality, consistent, and complete building data in 2D and 3D at the individual building level on a global scale. The dataset is delivered in tiles of 5 degree by 5 degree, with GBA.Polygon and GBA.LoD1 in GeoJSON format, and GBA.Height in GeoTiff format. Details see https://github.com/zhu-xlab/GlobalBuildingAtlas.},
  keywords = {Building Footprint; Building Height; 3D Building Models; Remote Sensing; Earth Observation; Deep Learning},
  doi = {10.14459/2025mp1782307}
}
Source Cooperative Hosted Version:
@misc{globalbuildingatlas_lod1_parquet,
  title = {{GlobalBuildingAtlas LoD1 (Parquet Format)}},
  author = {Zhu, Xiao Xiang and Chen, Sining and Zhang, Fahong and Shi, Yilei and Wang, Yuanyuan and {Source Cooperative}},