metadata
license: mit
pipeline_tag: image-classification
library_name: transformers
tags:
- PyTorch
- Mamba
- SSM
VMamba: Visual State Space Model
VMamba is a bidirectional state-space model fine-tuned on the ImageNet dataset. It was introduced in the paper VMamba: Visual State Space Model and was first released in this repo.
Disclaimer: This is not the official implementation; please refer to the official repo.
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from PIL import Image
import torchvision.transforms as T
from transformers import AutoConfig, AutoModelForImageClassification

# trust_remote_code=True is required because the model code lives in the Hub repo.
config = AutoConfig.from_pretrained('saurabhati/VMamba_ImageNet_83.6', trust_remote_code=True)
vmamba_model = AutoModelForImageClassification.from_pretrained('saurabhati/VMamba_ImageNet_83.6', trust_remote_code=True)
vmamba_model.eval()

# Standard ImageNet evaluation preprocessing: resize, center crop, normalize.
preprocess = T.Compose([
    T.Resize(224, interpolation=T.InterpolationMode.BICUBIC),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(
        mean=[0.4850, 0.4560, 0.4060],
        std=[0.2290, 0.2240, 0.2250]
    )])

# Replace this path with your own image; convert to RGB in case the file is grayscale or CMYK.
input_image = Image.open('/data/sls/scratch/sbhati/data/Imagenet/train/n02009912/n02009912_16160.JPEG').convert('RGB')
input_image = preprocess(input_image)

with torch.no_grad():
    # unsqueeze(0) adds the batch dimension: (3, 224, 224) -> (1, 3, 224, 224)
    logits = vmamba_model(input_image.unsqueeze(0)).logits

predicted_label = vmamba_model.config.id2label[logits.argmax().item()]
predicted_label
'crane'
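The example above reports only the single best class. If you also want to see the runner-up classes and their probabilities, a small helper like the one below can be applied to the logits. This is a minimal sketch and not part of the model repository: the function name `top_k_labels` is hypothetical, it expects the logits as a plain Python list, and the toy logits and labels are made up for illustration.

```python
import math

def top_k_labels(logits, id2label, k=5):
    """Return the k most likely (label, probability) pairs, best first.

    Hypothetical helper, not part of the model repository.
    """
    # Numerically stable softmax: subtract the largest logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank class indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]

# Toy example with made-up logits and labels (not real model output).
toy_logits = [0.5, 2.0, -1.0]
toy_labels = {0: "crane", 1: "heron", 2: "stork"}
top2 = top_k_labels(toy_logits, toy_labels, k=2)
for label, prob in top2:
    print(f"{label}: {prob:.3f}")
```

With the model loaded as above, the call would look like `top_k_labels(logits[0].tolist(), vmamba_model.config.id2label)`, assuming `id2label` is keyed by integer class index as is usual for Transformers configs.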
Citation
@article{liu2024vmamba,
  title={VMamba: Visual State Space Model},
  author={Liu, Yue and Tian, Yunjie and Zhao, Yuzhong and Yu, Hongtian and Xie, Lingxi and Wang, Yaowei and Ye, Qixiang and Liu, Yunfan},
  journal={arXiv preprint arXiv:2401.10166},
  year={2024}
}