In digital pathology, whole-slide images (WSIs) are difficult to handle due to their gigapixel scale, so most approaches train patch encoders via self-supervised learning (SSL) and then aggregate the patch-level embeddings for downstream tasks via multiple instance learning (MIL) or slide encoders. However, patch-level SSL may overlook complex domain-specific features that are essential for biomarker prediction (e.g., mutation status and molecular characteristics), because SSL methods rely only on basic augmentations designed for natural images and operate over small patch-level regions. Moreover, SSL methods remain less data-efficient than fully supervised approaches, requiring extensive computational resources and large datasets to achieve competitive performance. To address these limitations, we present EXAONE Path 2.0, a pathology foundation model that learns patch-level representations under direct slide-level supervision. Using only 37k WSIs for training, EXAONE Path 2.0 achieves state-of-the-art average performance across 10 biomarker prediction tasks, demonstrating remarkable data efficiency.
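The aggregation step mentioned above is commonly implemented with attention-based MIL pooling (in the style of Ilse et al.): each patch embedding receives a learned attention weight, and the slide-level embedding is their weighted sum. The sketch below is a minimal numpy illustration under assumed shapes and random weights; it is not EXAONE Path 2.0's actual aggregator, and the names (`attention_mil_pool`, `W`, `v`) are hypothetical.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_mil_pool(patch_embeddings, W, v):
    """Attention-based MIL pooling (illustrative weights).

    patch_embeddings: (N, D) patch-level features from the encoder
    W: (D, H) projection, v: (H,) attention vector
    Returns a (D,) slide-level embedding.
    """
    scores = np.tanh(patch_embeddings @ W) @ v   # (N,) unnormalized attention
    alpha = softmax(scores)                      # weights sum to 1 over patches
    return alpha @ patch_embeddings              # weighted sum -> (D,)

# Toy usage: 8 patches with 16-dim embeddings, 4-dim attention space
rng = np.random.default_rng(0)
patches = rng.standard_normal((8, 16))
W = 0.1 * rng.standard_normal((16, 4))
v = rng.standard_normal(4)
slide_vec = attention_mil_pool(patches, W, v)
print(slide_vec.shape)  # (16,)
```

In a trained model, `W` and `v` would be learned end-to-end from the slide-level labels rather than sampled randomly.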
Size notes:
- Trained on 37,195 Formalin-Fixed, Paraffin-Embedded (FFPE), Hematoxylin and Eosin (H&E) stained WSIs (gigapixel scale).
- These WSIs yield 144,450 image-label pairs across 16 training tasks; each WSI contributes multiple labels corresponding to different prediction objectives, including cancer subtyping, tissue classification, and biomarker prediction.
- First curriculum stage: 256×256 DINO loss applied to the first-stage ViT and 1024×1024 DINO loss to the second-stage ViT.
- Number of training epochs/steps: not reported.
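The notes mention DINO losses at two patch scales. For reference, the core DINO objective is a cross-entropy between a sharpened, centered teacher distribution and the student distribution over projection-head outputs. The numpy sketch below illustrates that objective only; the temperatures, centering value, and shapes are generic DINO-style assumptions, not EXAONE Path 2.0's reported settings.

```python
import numpy as np

def log_softmax(x):
    # stable log-softmax along the last axis
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def dino_loss(student_logits, teacher_logits, t_s=0.1, t_t=0.04, center=0.0):
    """Cross-entropy H(teacher, student) as in DINO.

    The teacher distribution is centered and sharpened with a lower
    temperature (t_t < t_s); in real training the teacher is an EMA of
    the student and receives no gradient.
    """
    teacher_probs = np.exp(log_softmax((teacher_logits - center) / t_t))
    ce = -(teacher_probs * log_softmax(student_logits / t_s)).sum(axis=-1)
    return float(ce.mean())

# Toy usage: 4 samples, 32 prototype dimensions, identical views
rng = np.random.default_rng(1)
logits = rng.standard_normal((4, 32))
loss = dino_loss(logits, logits)
```

Even with identical student and teacher logits the loss is positive, since the cross-entropy of two non-degenerate distributions is bounded below by the teacher's entropy.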
Parameter notes: ~175M parameters (estimated from Figure 1).