Run OCR

This guide deploys LightOnOCR-2-1B on Dedicated Inference and uses it to pull text out of images: invoices, receipts, forms, and scanned documents.

LightOnOCR-2-1B is a small multilingual vision-language model built for OCR. It covers 11 languages and handles tables, multi-column layouts, and math notation.

Prerequisites

Exoscale CLI (exo) installed and configured
An API key with compute and ai access
Enough GPU quota in your organization

Set these variables for the examples:

export ZONE="at-vie-2"
export MODEL="lightonai/LightOnOCR-2-1B"

Step 1. Create the Model

exo dedicated-inference model create "$MODEL" -z "$ZONE"

Wait until the model status shows created:

exo dedicated-inference model list -z "$ZONE"

Step 2. Deploy the Model

LightOnOCR-2-1B has 1B parameters and weighs about 1.9 GiB, so it fits comfortably on a single A5000 GPU (24 GB of memory).

exo dedicated-inference deployment create ocr \
  --model-name "$MODEL" \
  --gpu-type gpua5000 \
  --gpu-count 1 \
  --replicas 1 \
  --inference-engine-version 0.15.1 \
  -z "$ZONE"

Wait until the deployment is ready, usually 2 to 3 minutes:

exo dedicated-inference deployment show ocr -z "$ZONE"
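
Rather than re-running the status commands by hand, the waiting in Steps 1 and 2 can be scripted. The following is a minimal sketch, not part of the exo CLI: `wait_for` is a hypothetical helper, and it assumes the status words from this guide ("created", "ready") appear literally in the command output. If you have several models, the match may come from a different row, so check the output yourself when in doubt.

```shell
# wait_for PATTERN TRIES DELAY COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until its output contains PATTERN
# (case-insensitive). Gives up after TRIES attempts, sleeping DELAY seconds
# between attempts. Returns 0 on a match, 1 on timeout.
wait_for() {
  pattern=$1
  tries=$2
  delay=$3
  shift 3
  attempt=1
  while [ "$attempt" -le "$tries" ]; do
    if "$@" 2>/dev/null | grep -qi -- "$pattern"; then
      return 0
    fi
    attempt=$((attempt + 1))
    if [ "$attempt" -le "$tries" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage (status strings are assumptions about the CLI output):
#   wait_for created 60 10 exo dedicated-inference model list -z "$ZONE"
#   wait_for ready 30 10 exo dedicated-inference deployment show ocr -z "$ZONE"
```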

Step 3. Get the Endpoint and Key

Read the endpoint URL from the deployment details, then reveal the API key:

exo dedicated-inference deployment show ocr -z "$ZONE"
exo dedicated-inference deployment reveal-api-key ocr -z "$ZONE"

Export both for the next steps:

export ENDPOINT_URL="https://<your-deployment-id>.inference.at-vie-2.exoscale-cloud.com/v1"
export API_KEY="<your-api-key>"
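
A quick sanity check on the two variables can catch a leftover placeholder or a missing /v1 suffix before any request is sent. This is a sketch only: `check_inference_env` is a hypothetical helper, and the rules it applies (https scheme, URL ending in /v1, no angle-bracket placeholders) are inferred from the example values above.

```shell
# check_inference_env: returns 0 when ENDPOINT_URL and API_KEY look usable,
# otherwise prints a hint to stderr and returns 1. Hypothetical helper.
check_inference_env() {
  case "${ENDPOINT_URL:-}" in
    *"<"*|*">"*)
      echo "ENDPOINT_URL still contains a placeholder" >&2
      return 1 ;;
    https://*/v1)
      ;;
    *)
      echo "ENDPOINT_URL should start with https:// and end with /v1" >&2
      return 1 ;;
  esac
  case "${API_KEY:-}" in
    ""|*"<"*|*">"*)
      echo "API_KEY is empty or still a placeholder" >&2
      return 1 ;;
  esac
  return 0
}

# Usage:
#   check_inference_env && echo "environment looks fine"
```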

Step 4. Run OCR on an Image

The deployment exposes an OpenAI-compatible chat completions API. Pass the image as a base64 data URI. No text prompt is needed: the model extracts all text from the image on its own.

Encode the image and build the request file:

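# Strip newlines: GNU base64 wraps its output at 76 columns, which would break the JSON string
BASE64_IMG=$(base64 < document.png | tr -d '\n')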

cat > request.json << EOF
{
  "model": "lightonai/LightOnOCR-2-1B",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": { "url": "data:image/png;base64,${BASE64_IMG}" }
        }
      ]
    }
  ],
  "max_tokens": 4096,
  "temperature": 0.2,
  "top_p": 0.9
}
EOF
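
Since PNG and JPEG are both accepted, the request body can be generated per file with the matching MIME type in the data URI. The sketch below wraps the steps above in two hypothetical helpers, `mime_for` and `build_request`. It picks the type from the file extension only, which is an assumption of this example, and falls back to the model name used in this guide when MODEL is unset.

```shell
# mime_for FILE: print the MIME type implied by the file extension.
mime_for() {
  case "$1" in
    *.png|*.PNG) echo "image/png" ;;
    *.jpg|*.JPG|*.jpeg|*.JPEG) echo "image/jpeg" ;;
    *) echo "unsupported image extension: $1" >&2; return 1 ;;
  esac
}

# build_request FILE: print a chat completions request body for FILE.
build_request() {
  mime=$(mime_for "$1") || return 1
  # Read from stdin so the call works with both GNU and macOS base64,
  # and strip the line wrapping that GNU base64 adds by default.
  img=$(base64 < "$1" | tr -d '\n')
  cat << EOF
{
  "model": "${MODEL:-lightonai/LightOnOCR-2-1B}",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "image_url", "image_url": { "url": "data:${mime};base64,${img}" } }
      ]
    }
  ],
  "max_tokens": 4096,
  "temperature": 0.2,
  "top_p": 0.9
}
EOF
}

# Usage:
#   build_request receipt.jpg > request.json
```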

Pass the body as a file with -d @request.json. Base64 image data is large enough to exceed command-line length limits when passed inline.

curl -X POST "$ENDPOINT_URL/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d @request.json

For an invoice, the model returns the text with tables as HTML:

{
  "choices": [
    {
      "message": {
        "content": "INVOICE #12345\nDate: 2026-03-26\n\n<table>\n  <tr><th>Item</th><th>Qty</th><th>Price</th></tr>\n  <tr><td>Widget A</td><td>2</td><td>19.99</td></tr>\n</table>\n\nTotal: 272.69"
      }
    }
  ]
}
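
To keep only the recognized text, the content field can be pulled out of the JSON response. The sketch below assumes the response shape shown above and that either jq or python3 is installed. `extract_text` is a hypothetical helper name.

```shell
# extract_text: read a chat completions response on stdin and print the
# text of the first choice. Uses jq when available, otherwise python3.
extract_text() {
  if command -v jq >/dev/null 2>&1; then
    jq -r '.choices[0].message.content'
  else
    python3 -c 'import json, sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
  fi
}

# Usage:
#   curl -s -X POST "$ENDPOINT_URL/chat/completions" \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer $API_KEY" \
#     -d @request.json | extract_text > document.txt
```

The escaped \n sequences in the JSON string come out as real line breaks, so the HTML tables in the result can be passed on to other tools as-is.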

Get Better Results

Image quality drives accuracy more than anything else:

Render PDFs at 200 DPI.
Aim for about 1540 pixels on the longest side.
Keep the original aspect ratio so text geometry stays intact.
PNG and JPEG both work. Use JPEG to keep large documents smaller.
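
The sizing guideline comes down to simple arithmetic: set the longest side to 1540 pixels and scale the other side by the same factor. The sketch below shows that calculation in a hypothetical helper, `scaled_size`. The commented commands assume poppler-utils (pdftoppm) and ImageMagick 7 (magick) are installed. They are common tools for this job, not something this guide prescribes.

```shell
# scaled_size WIDTH HEIGHT [TARGET]: print "W H" with the longest side set
# to TARGET pixels (default 1540), keeping the aspect ratio. The shorter
# side is rounded to the nearest pixel.
scaled_size() {
  w=$1
  h=$2
  target=${3:-1540}
  if [ "$w" -ge "$h" ]; then
    new_w=$target
    new_h=$(( (h * target + w / 2) / w ))
  else
    new_h=$target
    new_w=$(( (w * target + h / 2) / h ))
  fi
  echo "$new_w $new_h"
}

# Example: an A4 page rendered at 200 DPI is 1654 x 2339 pixels.
#   scaled_size 1654 2339    # prints "1089 1540"
#
# Assumed tooling for the actual conversion:
#   pdftoppm -r 200 -png input.pdf page           # writes page-1.png, page-2.png, ...
#   magick page-1.png -resize 1540x1540 out.png   # fits within the box, keeps aspect ratio
```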

Supported languages: English, French, German, Spanish, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, and Korean.

Clean Up

Scale to zero to stop GPU billing but keep the endpoint and key:

exo dedicated-inference deployment scale ocr 0 -z "$ZONE"

Delete the deployment when you’re done with the endpoint:

exo dedicated-inference deployment delete ocr -z "$ZONE"

Then remove the model files from Object Storage if you no longer need them. Look up the model ID with the model list command from Step 1:

exo dedicated-inference model delete <model-id> -z "$ZONE"

Last updated on June 15, 2026