Dedicated Inference Service Level Agreement (SLA)

NOTE
Dedicated Inference is currently in Private Preview. SLA commitments, including availability targets and service credits, will only apply once the service reaches General Availability (GA). No SLA guarantees are provided during the Private Preview phase.

This product-specific SLA defines the Service Availability Target for Dedicated Inference. It applies to each Client using the Service. Capitalized terms used but not defined herein have the meanings set forth in the Exoscale Terms and Conditions or the EUSA, whichever applies.

SLA

| Metric | Target | Definition |
| --- | --- | --- |
| Dedicated Inference Endpoint Availability | >= 99.95% Monthly Uptime Percentage | The percentage of total minutes in a calendar month during which the Exoscale Dedicated Inference Endpoint is available and operational for use. |
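
As an illustration of the definition above, the following minimal sketch computes a Monthly Uptime Percentage from a number of unavailable minutes and derives the downtime that the 99.95% target leaves room for. The function names are hypothetical, and the sketch assumes downtime is counted in minutes over a calendar month of fixed-length 24-hour days; how Exoscale actually measures unavailability is not specified in this document.

```python
import calendar

# Availability target taken from the SLA table above.
TARGET_PERCENT = 99.95


def total_minutes_in_month(year: int, month: int) -> int:
    """Total minutes in the given calendar month (assumes 24-hour days)."""
    days = calendar.monthrange(year, month)[1]
    return days * 24 * 60


def monthly_uptime_percentage(year: int, month: int, downtime_minutes: float) -> float:
    """Share of the month's minutes during which the endpoint was available."""
    total = total_minutes_in_month(year, month)
    if not 0 <= downtime_minutes <= total:
        raise ValueError("downtime_minutes must be between 0 and the month length")
    return (total - downtime_minutes) / total * 100


def downtime_budget_minutes(year: int, month: int, target: float = TARGET_PERCENT) -> float:
    """Minutes of downtime that still meet the target (illustrative only)."""
    return total_minutes_in_month(year, month) * (100 - target) / 100
```

Under these assumptions, a 30-day month has 43,200 minutes, so the 99.95% target corresponds to roughly 21.6 minutes of downtime.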

Client Responsibilities

While Exoscale is responsible for operating and maintaining the Dedicated Inference endpoint, Clients remain responsible for the following:

  • Selection, deployment, and configuration of models and inference workloads.
  • Management of input and output data, including data classification, retention, and deletion.
  • Access control to inference endpoints, including API keys, credentials, and network policies.
  • Validation of inference results and model behavior.
  • Backup and lifecycle management of models and related artifacts.
  • Compliance of datasets, prompts, outputs, and workloads with applicable laws and regulations.
  • Cost management, usage monitoring, and scaling configuration.

The Service Level Agreement applies solely to Dedicated Inference Endpoint Availability and does not cover model performance, inference accuracy, or Client workload behavior. For a complete explanation of the shared responsibility framework between Exoscale and Clients, please see the Exoscale Shared Responsibility Model documentation.

Exclusions

No compensation shall be granted to the Client if the failure to comply with the Dedicated Inference Service Level Objective is due to any of the following reasons:

  • Client-side configurations, including model deployment parameters, scaling settings, network rules, access policies, or quota limitations.
  • Model behavior, inference quality, output correctness, or data loss.

For the complete list of SLA exclusions, please refer to the General Exoscale Terms and Conditions.

Retribution

Service credits apply only to Dedicated Inference Endpoint Availability as defined above and do not apply to model performance, inference latency, throughput, or Client workload behavior. The standard retribution and SLA credit schemes apply to Dedicated Inference as described in the General Exoscale Terms and Conditions or the EUSA, whichever applies. Service Credits are the sole remedy if the SLA is not met. If the Monthly Uptime Percentage for the Dedicated Inference Endpoint falls below the committed Service Level Objective, the Client is eligible for service credits as follows:

| Monthly Uptime Percentage | Service Credit Percentage of Monthly Service Fees for Affected Resources |
| --- | --- |
| From 99.95% to 98.3% | 50% |
| Below 98.3% | 100% |
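
The credit tiers above can be sketched as a small lookup. This is an illustrative example, not Exoscale's billing logic: the function names are hypothetical, and the handling of the boundary values is an assumption (an uptime of exactly 99.95% is treated as meeting the target, and exactly 98.3% as falling in the 50% tier).

```python
from decimal import Decimal

# Thresholds taken from the service credit table above.
TARGET = Decimal("99.95")
LOWER_TIER = Decimal("98.3")


def service_credit_percent(uptime_percent) -> int:
    """Return the credit percentage for a given Monthly Uptime Percentage.

    Boundary handling is an assumption: >= 99.95 earns no credit,
    98.3 up to (but excluding) 99.95 earns 50%, below 98.3 earns 100%.
    """
    uptime = Decimal(str(uptime_percent))
    if not Decimal(0) <= uptime <= Decimal(100):
        raise ValueError("uptime_percent must be between 0 and 100")
    if uptime >= TARGET:
        return 0
    if uptime >= LOWER_TIER:
        return 50
    return 100


def service_credit_amount(uptime_percent, monthly_fee) -> Decimal:
    """Credit owed on the monthly fees for the affected resources."""
    fee = Decimal(str(monthly_fee))
    return fee * service_credit_percent(uptime_percent) / 100
```

For example, with monthly fees of 200 for the affected resources and a Monthly Uptime Percentage of 99.0%, this sketch yields a credit of 100.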

Claim Service Credit

To claim a Service Credit, open a support ticket in the Exoscale Portal within 10 business days of the incident.
