Transformers documentation
Inference on Specialized Hardware
You are viewing version v4.22.2. A newer version, v5.8.1, is available.
This document will be completed soon with information on how to run inference on specialized hardware. In the meantime, you can check out the guide for inference on CPUs.