Demo of creating Icechunk stores for CEFI data
This dataset is a demonstration Icechunk repository built from NOAA Changing Ecosystems and Fisheries Initiative (CEFI) regional MOM6 NetCDF files for the Northeast Pacific. The Icechunk repository stores metadata and virtual chunk references, while the original NetCDF data chunks remain in the public NOAA S3 bucket.
This demo used:
GitHub repo (private): https://github.com/noaa-nwfsc/cefi-icechunks
Visualization: GridLook Viz
The CEFI portal provides access to information about past and future conditions for U.S. coastal regions, including regional ocean model output intended for analysis, visualization, and management-relevant applications.
The Icechunk store does not copy the full model output out of the original NetCDF files. Instead, it stores virtual references that point back to byte ranges in the original public NOAA S3 files.
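To make the idea of a virtual reference concrete, here is a minimal sketch in plain Python. It does not use Icechunk itself: `VirtualChunkRef`, `read_chunk`, and the example URL are hypothetical names chosen for illustration, and an in-memory dictionary stands in for object storage. A real reader would issue a ranged GET against the source file rather than slicing local bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualChunkRef:
    """Hypothetical stand-in for a virtual chunk reference."""

    location: str  # URL of the source file (illustrative only)
    offset: int    # first byte of the chunk within that file
    length: int    # number of bytes in the chunk


def read_chunk(ref: VirtualChunkRef, files: dict[str, bytes]) -> bytes:
    """Return the bytes a reference points to.

    `files` simulates object storage. A real reader would request
    bytes [offset, offset + length) from the remote file instead.
    """
    blob = files[ref.location]
    return blob[ref.offset : ref.offset + ref.length]


# Simulated source file: a 6-byte header followed by two 4-byte chunks.
files = {"s3://example-bucket/example.nc": b"HEADERaaaabbbb"}

# The reference records where the second chunk lives; no data is copied.
ref = VirtualChunkRef("s3://example-bucket/example.nc", offset=10, length=4)
chunk = read_chunk(ref, files)
```

The store only needs to keep the small `(location, offset, length)` record; the chunk bytes are fetched from the source file when they are read.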
That means:
The source archive contains variables with different file layouts. Some variables are stored as one file per year, while others are stored as one full-period file. Because the time chunking differs between these layouts, the files cannot be merged into one Icechunk store. Instead, this Icechunk repository uses separate groups for those layouts.
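The grouping rule described above can be sketched as follows. This is only an illustration of the idea, not the code used to build the repository: the variable names, the time chunk lengths, and the `group_by_time_chunking` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical variables and their time chunk lengths (in time steps).
# These numbers are made up for illustration; they are not read from
# the CEFI files.
variable_time_chunks = {
    "var_a": 12,   # e.g. one file per year
    "var_b": 12,   # e.g. one file per year
    "var_c": 384,  # e.g. one full-period file
}


def group_by_time_chunking(chunks: dict[str, int]) -> dict[int, list[str]]:
    """Collect variables that share the same time chunk length."""
    groups: dict[int, list[str]] = defaultdict(list)
    for name, chunk_len in sorted(chunks.items()):
        groups[chunk_len].append(name)
    return dict(groups)


# Variables with matching time chunking end up in the same group.
groups = group_by_time_chunking(variable_time_chunks)
```

Each resulting bucket corresponds to one group in the repository, so variables within a group share a consistent time layout.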
Group names:
These Icechunk stores use references to public data in other cloud storage, so access to that storage must be explicitly authorized when the repository is opened.
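A minimal sketch of what that authorization can look like, assuming the Icechunk 1.x Python API (`containers_credentials`, `s3_credentials`, and the `authorize_virtual_chunk_access` argument of `Repository.open`); older releases used a different signature. The function name `open_with_virtual_access` and its parameters `storage` and `source_prefix` are placeholders, not values taken from this repository. Pass in the storage object for wherever the Icechunk store is hosted and the S3 URL prefix of the NOAA source files.

```python
def open_with_virtual_access(storage, source_prefix: str):
    """Open an Icechunk repository and allow reads from one source prefix.

    Assumptions (not taken from this repository):
      - `storage` is an Icechunk storage object for the store's location.
      - `source_prefix` is the URL prefix of the virtual chunk container,
        for example "s3://<bucket>/<path>/".
      - The source bucket is public, so anonymous credentials suffice.
    """
    # Imported here so this sketch can be loaded without Icechunk installed.
    import icechunk

    # Map the container prefix to the credentials used to read from it.
    credentials = icechunk.containers_credentials(
        {source_prefix: icechunk.s3_credentials(anonymous=True)}
    )

    # Without this argument, reads that follow virtual references into
    # the source bucket are not authorized.
    return icechunk.Repository.open(
        storage,
        authorize_virtual_chunk_access=credentials,
    )
```

From the returned repository, a read-only session on a branch (for example `repo.readonly_session("main")`) exposes a `store` that can be opened with xarray's `open_zarr`, passing the group name and `consolidated=False`.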
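When subsetting, selecting a range with slice() is a bit more memory-safe for large datasets, because only the chunks that overlap the requested range need to be read.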
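The arithmetic behind that can be sketched in plain Python. This is an illustration under assumed numbers, not a description of Icechunk internals: `chunks_for_slice` is a hypothetical helper, and the axis length and chunk length are made up.

```python
def chunks_for_slice(start: int, stop: int, chunk_len: int) -> range:
    """Indices of the chunks overlapping the half-open range [start, stop)."""
    if stop <= start:
        return range(0)
    first = start // chunk_len
    last = (stop - 1) // chunk_len
    return range(first, last + 1)


# Assumed layout: 120 time steps stored in chunks of 12 steps each.
n_steps = 120
chunk_len = 12
total_chunks = -(-n_steps // chunk_len)  # ceiling division

# Selecting steps 24..47 touches only two of the chunks.
needed = list(chunks_for_slice(24, 48, chunk_len))
```

With these assumed numbers, the selection touches only the chunks listed in `needed` rather than all `total_chunks` chunks, which is why a contiguous range keeps reads and memory use small.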
See the example notebooks cefi_nep_monthly.ipynb and cefi_nep_daily.ipynb for the code that created the Icechunk stores.
Please cite the original NOAA CEFI regional MOM6 data product when using the data scientifically. This Icechunk repository is a derived access layer for demonstration and teaching purposes; the underlying data are from NOAA.
Useful project links:
See the license and use constraints for the original NOAA CEFI regional MOM6 data. This repository provides virtual access to that public source data and does not modify the original NetCDF files.