Model Details

Domain:

Task:

Model Access:

Open weights (restricted use)

AI Tools Usage

This model is commonly used behind the scenes in AI tools.

Introduction

MedSigLIP is a variant of SigLIP (Sigmoid Loss for Language Image Pre-training) that is trained to encode medical images and text into a common embedding space. Developers can use MedSigLIP to accelerate building healthcare AI applications. MedSigLIP was trained on a variety of de-identified medical image and text pairs, including chest X-rays, dermatology images, ophthalmology images, histopathology slides, and slices of CT and MRI volumes, along with associated descriptions or reports. MedSigLIP contains a 400M-parameter vision encoder and a 400M-parameter text encoder. It supports 448x448 image resolution and up to 64 text tokens. MedSigLIP is recommended for medical image interpretation applications that do not need text generation, such as data-efficient classification, zero-shot classification, and semantic image retrieval. For medical applications that require text generation, MedGemma is recommended.
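To illustrate how a shared embedding space enables zero-shot classification, here is a minimal sketch in plain Python. It does not load MedSigLIP. The short vectors are made-up stand-ins for the image and text embeddings that the encoders would produce, and the function names (`l2_normalize`, `cosine_similarity`, `zero_shot_classify`) are hypothetical helpers written for this example. Ranking by cosine similarity is an assumption chosen for simplicity, not necessarily the exact scoring used with the model.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def zero_shot_classify(image_embedding, label_embeddings):
    """Pick the label whose text embedding is closest to the image embedding.

    label_embeddings maps a text prompt to its embedding vector.
    Returns the best prompt and the score for every prompt.
    """
    scores = {
        label: cosine_similarity(image_embedding, emb)
        for label, emb in label_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy 4-dimensional embeddings (hypothetical values, for illustration only).
label_embeddings = {
    "a chest X-ray with pleural effusion": [0.9, 0.1, 0.0, 0.2],
    "a normal chest X-ray": [0.1, 0.8, 0.3, 0.0],
}
image_embedding = [0.85, 0.2, 0.05, 0.1]

best_label, scores = zero_shot_classify(image_embedding, label_embeddings)
print(best_label)
```

The same similarity scores can drive semantic image retrieval: embed a text query once, score it against a collection of stored image embeddings, and sort the images by score.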

Training

Training Code Accessibility:

The use of MedSigLIP is governed by the Health AI Developer Foundations terms of use: https://huggingface.co/google/medsiglip-448