fix: allow passing images_kwargs to the image processor
Why do we need this?
(1) From a design standpoint, I see no reason why image_processor should not accept images_kwargs.
(2) This leads to a critical bug: an image with height=3 and certain widths crashes the vLLM service immediately. A height=3 image resolves to different dimensions during embedding calculation than during token calculation, so the computed token count does not match the number of embeddings actually produced. Without support for **images_kwargs, there is no client-side workaround.
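To make the failure mode in (2) concrete, here is a minimal sketch of how two resize paths can disagree on a very short image. Both functions, the patch size of 14, and the rounding rules are assumptions made up for illustration; they are not the model's or vLLM's actual code, and the real cause of the divergence may differ.

```python
PATCH = 14  # hypothetical patch size, not read from any model config


def grid_round(height: int, width: int, patch: int = PATCH) -> tuple[int, int]:
    """Hypothetical path A: round each side to whole patches, at least one."""
    rows = max(1, round(height / patch))
    cols = max(1, round(width / patch))
    return rows, cols


def grid_upscale_first(height: int, width: int, patch: int = PATCH) -> tuple[int, int]:
    """Hypothetical path B: stretch a too-short side to one patch while
    keeping the aspect ratio, then round to whole patches."""
    if height < patch:
        width = round(width * patch / height)
        height = patch
    if width < patch:
        height = round(height * patch / width)
        width = patch
    return grid_round(height, width, patch)


def token_count(grid: tuple[int, int]) -> int:
    """One token per patch in the grid."""
    rows, cols = grid
    return rows * cols


if __name__ == "__main__":
    for height, width in [(280, 420), (3, 100)]:
        a = token_count(grid_round(height, width))
        b = token_count(grid_upscale_first(height, width))
        status = "ok" if a == b else "MISMATCH"
        print(f"{height}x{width}: path A={a}, path B={b} -> {status}")
```

For an ordinary image the two paths agree, but for the 3x100 case they produce different counts. A server that sizes its buffers from one path and fills them from the other would hit exactly this kind of length mismatch.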
I will submit a PR to vLLM to fix this issue on the vLLM side, but that fix requires image_processor to accept externally passed **kwargs.
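The requested behavior can be sketched as follows. SketchImageProcessor and SketchProcessor are hypothetical stand-ins written for this example, and the min_pixels and max_pixels option names are assumptions; the sketch only shows the forwarding pattern, not the actual processor code in this repository.

```python
from typing import Any, Optional


class SketchImageProcessor:
    """Hypothetical image processor that accepts per-call overrides."""

    def __init__(self, min_pixels: int = 1024, max_pixels: int = 1_000_000) -> None:
        self.min_pixels = min_pixels
        self.max_pixels = max_pixels

    def __call__(self, images: list, **kwargs: Any) -> dict:
        # Per-call kwargs take precedence over the constructor defaults.
        min_pixels = kwargs.get("min_pixels", self.min_pixels)
        max_pixels = kwargs.get("max_pixels", self.max_pixels)
        return {
            "num_images": len(images),
            "min_pixels": min_pixels,
            "max_pixels": max_pixels,
        }


class SketchProcessor:
    """Hypothetical wrapper that forwards images_kwargs instead of dropping them."""

    def __init__(self, image_processor: SketchImageProcessor) -> None:
        self.image_processor = image_processor

    def __call__(self, images: list, images_kwargs: Optional[dict] = None) -> dict:
        # Forward caller-supplied options to the image processor unchanged.
        return self.image_processor(images, **(images_kwargs or {}))


if __name__ == "__main__":
    processor = SketchProcessor(SketchImageProcessor())
    print(processor(["img"]))
    print(processor(["img"], images_kwargs={"min_pixels": 4096}))
```

With forwarding in place, a caller such as vLLM can pin the resize options for a single request so that both the token calculation and the embedding calculation see the same settings.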