Unprecedented volumes of Earth observation data are continually collected around the world, but high-quality labels remain scarce given the effort required to make physical measurements and observations. This has led to considerable investment in bespoke modeling efforts translating sparse labels into maps. Here we introduce AlphaEarth Foundations, an embedding field model yielding a highly general, geospatial representation that assimilates spatial, temporal, and measurement contexts across multiple sources, enabling accurate and efficient production of maps and monitoring systems from local to global scales. The embeddings generated by AlphaEarth Foundations are the only ones to consistently outperform a suite of other well-known, widely accepted featurization approaches tested on a diverse set of mapping evaluations without re-training. We have released a dataset of global, annual, analysis-ready embedding field layers from 2017 through 2024.
Notes: 275000000000000 FLOP / sec / TPU v4 * 512 TPUs * 56 hours * 3600 sec / hour * 0.3 [assumed utilization] = 8.515584e+21 FLOP
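A minimal Python sketch of this estimate; the 275 TFLOP/s peak per TPU v4 chip and the 0.3 utilization are the assumptions stated above, not figures reported by the paper:

```python
# Back-of-the-envelope training compute for AEF, using the figures quoted above.
# Assumptions: 275 TFLOP/s peak per TPU v4 chip and 30% hardware utilization.
peak_flop_per_sec = 275e12   # assumed TPU v4 peak throughput, FLOP/s
num_tpus = 512               # devices reported for training
train_hours = 56             # reported wall-clock training time
utilization = 0.3            # assumed utilization

total_flop = peak_flop_per_sec * num_tpus * train_hours * 3600 * utilization
print(f"{total_flop:.6e} FLOP")  # 8.515584e+21 FLOP
```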
Size Notes: [videoframes] "AEF was trained over 8,412,511 video sequences containing interleaved, time-stamped frames from the sources and metadata listed in supplemental materials S1. Each frame covered a 1.28 km x 1.28 km (128 x 128 pixel) area projected into the UTM zone of the area’s centroid and were not limited in length: all available data was used totalling 3,047,520,515 frames." "AEF was trained for 56 hours on 512 TPU v4 devices over 100k steps in batches of 256 video sequences. "
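As a cross-check of the quoted dataset figures, a short sketch of the scale they imply; the frames-per-sequence and meters-per-pixel values are derived here from the quoted numbers, not stated in the paper:

```python
# Scale implied by the quoted dataset figures (derived, not quoted).
sequences = 8_412_511            # video sequences used for training
frames = 3_047_520_515           # total time-stamped frames
frame_extent_km = 1.28           # per-frame footprint edge length
frame_extent_px = 128            # per-frame edge length in pixels

frames_per_sequence = frames / sequences                  # ~362 frames per sequence on average
pixel_size_m = frame_extent_km * 1000 / frame_extent_px   # 10 m per pixel
print(f"~{frames_per_sequence:.0f} frames/sequence, {pixel_size_m:.0f} m/pixel")
```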
Notes: "We trained ∼1B and ∼480M parameter variants of AEF, and ultimately proceeded with the smaller variant for improved inference efficiency."