How-To
Learn how to deploy gated models from Hugging Face that require authentication and license acceptance, including obtaining and using access tokens.
Monitor deployment health, diagnose issues, interpret logs, and resolve common problems with Dedicated Inference deployments.
Learn strategies to minimize Dedicated Inference costs while maintaining performance through smart scaling, resource selection, and lifecycle management.
Improve inference latency and throughput through context length tuning, quantization, KV-cache optimization, and speculative decoding.
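The gated-model workflow summarized above hinges on passing a Hugging Face access token with each request. A minimal sketch of that pattern, assuming the token is supplied via an `HF_TOKEN` environment variable (the repo URL below is a placeholder, not a specific gated model):

```python
import os
import urllib.request

def build_authenticated_request(url: str) -> urllib.request.Request:
    """Attach a bearer token from HF_TOKEN, if one is set.

    Hypothetical sketch: the token is read from the environment rather
    than hard-coded, which is the usual practice for access tokens.
    """
    token = os.environ.get("HF_TOKEN", "")
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return urllib.request.Request(url, headers=headers)

# Placeholder repo id, for illustration only.
req = build_authenticated_request(
    "https://huggingface.co/api/models/example-org/example-gated-model"
)
```

The same `Authorization: Bearer <token>` header works whether you call the Hub API directly or let a client library send it for you.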