Run OCR
This guide shows how to deploy the LightOnOCR-2-1B model on Exoscale Dedicated Inference and use it to extract text from images such as invoices, receipts, forms, and scanned documents.
LightOnOCR-2-1B is a lightweight multilingual vision-language model purpose-built for OCR. It supports 11 languages and handles tables, multi-column layouts, and math notation.
Prerequisites
- Exoscale CLI (exo) installed and configured
- An API key with compute and AI service access
- Sufficient GPU quota in your Exoscale organization
Step 1: Create the Model
Download and register the LightOnOCR model in your zone:
exo dedicated-inference model create lightonai/LightOnOCR-2-1B -z at-vie-2

Monitor creation progress:

exo dedicated-inference model list -z at-vie-2

Wait until the status shows created before proceeding.
Step 2: Deploy the Model
Create a deployment on a single A5000 GPU. LightOnOCR-2-1B is a 1B-parameter model (roughly 1.9 GiB) and fits comfortably on a single GPU.
Important: Use inference engine version 0.15.1. Later versions of vLLM have a regression that prevents multimodal image requests from reaching the model.
exo dedicated-inference deployment create ocr \
--model-name lightonai/LightOnOCR-2-1B \
--gpu-type gpua5000 \
--gpu-count 1 \
--replicas 1 \
--inference-engine-version 0.15.1 \
  -z at-vie-2

Monitor the deployment:

exo dedicated-inference deployment show ocr -z at-vie-2

Wait until the status is ready (typically 2-3 minutes).
Step 3: Get the Endpoint URL and API Key
Read the endpoint URL from the deployment details, then reveal the API key:

exo dedicated-inference deployment show ocr -z at-vie-2
exo dedicated-inference deployment reveal-api-key ocr -z at-vie-2

Export both for use in subsequent commands:
export ENDPOINT_URL="https://<your-deployment-id>.inference.at-vie-2.exoscale-cloud.com/v1"
export API_KEY="<your-api-key>"

Step 4: Run OCR on an Image
LightOnOCR uses the OpenAI-compatible chat completions API. Pass the image as a base64-encoded data URI. The model does not require a text prompt; it automatically extracts all text from the provided image.
Prepare the Request
Encode the image and build a JSON request file:
BASE64_IMG=$(base64 -i document.png)   # macOS syntax; on Linux use: base64 -w0 document.png
cat > request.json << EOF
{
"model": "lightonai/LightOnOCR-2-1B",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "data:image/png;base64,${BASE64_IMG}"
}
}
]
}
],
"max_tokens": 4096,
"temperature": 0.2,
"top_p": 0.9
}
EOF

Note: The request body must be passed as a file (-d @request.json) rather than inline, because base64 image data can interfere with shell escaping.
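The same request body can be assembled in Python using only the standard library. This is a sketch of the payload shape shown above: build_payload and extract_text are illustrative helper names (not part of any SDK), and the dummy bytes in the demo stand in for a real image file.

```python
import base64
import json


def build_payload(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build an OpenAI-compatible chat payload with a base64 data URI."""
    data_uri = "data:%s;base64,%s" % (
        mime,
        base64.b64encode(image_bytes).decode("ascii"),
    )
    return {
        "model": "lightonai/LightOnOCR-2-1B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_uri}},
                ],
            }
        ],
        "max_tokens": 4096,
        "temperature": 0.2,
        "top_p": 0.9,
    }


def extract_text(response: dict) -> str:
    """Pull the OCR text out of a chat completions response."""
    return response["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Dummy bytes stand in for the contents of document.png.
    payload = build_payload(b"\x89PNG dummy bytes")
    print(json.dumps(payload)[:80])
```

Writing the payload to a file with json.dump sidesteps the shell-escaping issue entirely, since no base64 data ever passes through the shell.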
Send the Request
curl -X POST "$ENDPOINT_URL/chat/completions" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $API_KEY" \
  -d @request.json

Example Response
For an invoice image, the model returns structured text with tables formatted as HTML:
{
"choices": [
{
"message": {
"content": "INVOICE #12345\nDate: 2026-03-26\n\n<table>\n <tr>\n <th>Item</th>\n <th>Qty</th>\n <th>Price</th>\n </tr>\n <tr>\n <td>Widget A</td>\n <td>2</td>\n <td>19.99</td>\n </tr>\n</table>\n\nTotal: 272.69"
}
}
]
}

Image Recommendations
For best OCR accuracy:
- Resolution: Render PDFs at 200 DPI
- Size: Target the longest dimension at around 1540 pixels
- Aspect ratio: Maintain the original aspect ratio to preserve text geometry
- Format: PNG or JPEG both work; use JPEG for smaller payloads on large documents
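The sizing advice above can be expressed as a small helper that computes target dimensions before encoding. This is a sketch: target_size is an illustrative name, and the 1540-pixel default comes from the recommendation above.

```python
def target_size(width: int, height: int, longest: int = 1540) -> tuple:
    """Scale (width, height) so the longest side is `longest` pixels,
    preserving the aspect ratio. Images already smaller than the target
    are left unchanged, since upscaling adds no detail."""
    scale = longest / max(width, height)
    if scale >= 1.0:
        return (width, height)
    return (round(width * scale), round(height * scale))
```

For example, an A4 page rendered at 200 DPI is about 1654x2339 pixels; target_size(1654, 2339) scales it to roughly 1089x1540 while keeping the page geometry intact.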
Supported Languages
LightOnOCR-2-1B supports: English, French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, and Korean.
Clean Up
Scale to zero to stop GPU billing while preserving the endpoint:
exo dedicated-inference deployment scale ocr 0 -z at-vie-2

Or delete the deployment entirely:

exo dedicated-inference deployment delete ocr -z at-vie-2

To also remove the model from Object Storage:
exo dedicated-inference model delete <model-id> -z at-vie-2