ImageNet classifier and general-purpose backbone.
ViT (Vision Transformer) is a machine learning model that classifies images from the ImageNet dataset. It can also serve as a backbone for building more complex models for specific use cases.
Model checkpoint: ImageNet
Input resolution: 224x224
Number of parameters: 86.6M
Model size (float): 330 MB
Model size (w8a16): 86.2 MB
Model size (w8a8): 83.2 MB
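The listed sizes can be roughly sanity-checked from the parameter count. The sketch below is only a back-of-the-envelope estimate. It assumes that "float" means 32-bit weights, that the "w8" in w8a16 and w8a8 means 8-bit weights, and that "MB" is counted in units of 2**20 bytes; none of these is stated on the card. The helper name `estimated_size_mib` is hypothetical and is not part of any library.

```python
# Parameter count taken from the model card above.
NUM_PARAMS = 86.6e6


def estimated_size_mib(num_params: float, bits_per_weight: int) -> float:
    """Estimate raw weight storage in MiB (2**20 bytes).

    Assumption: every parameter is stored at the same bit width, with no
    file metadata and no tensors kept at higher precision.
    """
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**20


# Assumed 32-bit floating-point weights.
float_mib = estimated_size_mib(NUM_PARAMS, 32)
# Assumed 8-bit weights (the "w8" in w8a16 and w8a8).
int8_mib = estimated_size_mib(NUM_PARAMS, 8)

print(f"float32 estimate: {float_mib:.1f} MiB")
print(f"8-bit estimate:   {int8_mib:.1f} MiB")
```

Under these assumptions the float estimate comes out at about 330, in line with the listed value. The 8-bit estimate comes out slightly below the listed 83.2 MB and 86.2 MB. That gap would be consistent with file overhead or some tensors being kept at higher precision, but the card does not say which.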
Medical Imaging
Anomaly Detection
Inventory Management
Source Model: BSD-3-CLAUSE
Deployable Model: AI-HUB-MODELS-LICENSE