Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, that is, the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to its expectations by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet.
Notes: Numbers from section 3.2; the authors specifically mention using mixed precision training. 6125 ms/batch * 114,890 batches = 8.14 days (they round to 8 in the text). 512 GPUs * 8.14 days * 24 h/day * 3600 s/h * 125 TFLOP/s * 0.4 (assumed utilization) = 1.800e22 FLOP. "on 512 V100 32GB NVIDIA GPUs. Training this model on 1 billion images requires 114,890 training iterations for a batch size of 8,704 images, summing to 8 days of training over 512 GPUs."
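A minimal sketch of the compute arithmetic above, assuming (as stated in the note, not the paper) a 125 TFLOP/s V100 mixed-precision peak and 40% utilization; the variable names are ours:

```python
# Back-of-the-envelope training-compute estimate for SEER, following the note above.

MS_PER_ITERATION = 6125      # reported time per batch, in milliseconds
ITERATIONS = 114_890         # training iterations for 1B images at batch size 8,704
NUM_GPUS = 512               # V100 32GB GPUs
PEAK_FLOPS = 125e12          # assumed V100 tensor-core peak, FLOP/s
UTILIZATION = 0.4            # assumed fraction of peak actually achieved

wall_clock_seconds = MS_PER_ITERATION / 1000 * ITERATIONS
wall_clock_days = wall_clock_seconds / 86_400          # ~8.14 days, rounded to 8 in the paper

total_flop = NUM_GPUS * wall_clock_seconds * PEAK_FLOPS * UTILIZATION

print(f"Wall-clock time: {wall_clock_days:.2f} days")          # ~8.14
print(f"Estimated training compute: {total_flop:.3e} FLOP")    # ~1.800e+22
```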
Size Notes: "Overall, we train on 1B images for a total of 122K iterations."
Notes: From abstract: "Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters..."