Create Deployment Request
Deploys an AI model onto a set of GPUs.
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| gpu-count | integer | yes | Number of GPUs (1-8) |
| gpu-type | string | yes | GPU type family (e.g., gpua5000, gpu3080ti) |
| model | Model Ref | yes | |
| name | string | yes | Deployment name |
| replicas | integer | yes | Number of replicas (>=1) |
| inference-engine-parameters | array[string] | no | Optional extra inference engine server CLI args |
| inference-engine-version | string | no | Allowed values: 0.12.0, 0.15.1, 0.16.0, 0.17.0 |
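
As an illustration of the properties above, the following Python sketch assembles a request body and checks the documented constraints (GPU count of 1 to 8, at least one replica, an allowed engine version) before serializing it to JSON. The helper name `build_deployment_request`, the shape of the `model` value, and the sample CLI argument are assumptions made for this example; the Model Ref schema and the endpoint are not described in this section, so only the body is shown.

```python
import json

# Allowed values taken from the property table above.
ALLOWED_ENGINE_VERSIONS = {"0.12.0", "0.15.1", "0.16.0", "0.17.0"}


def build_deployment_request(name, model, gpu_type, gpu_count, replicas,
                             engine_version=None, engine_parameters=None):
    """Build a request body dict (hypothetical helper, not part of any SDK).

    Raises ValueError when a documented constraint is violated.
    """
    if not 1 <= gpu_count <= 8:
        raise ValueError("gpu-count must be between 1 and 8")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if engine_version is not None and engine_version not in ALLOWED_ENGINE_VERSIONS:
        raise ValueError(f"unsupported inference-engine-version: {engine_version}")

    # Required properties.
    body = {
        "gpu-count": gpu_count,
        "gpu-type": gpu_type,
        "model": model,
        "name": name,
        "replicas": replicas,
    }
    # Optional properties are included only when supplied.
    if engine_parameters:
        body["inference-engine-parameters"] = list(engine_parameters)
    if engine_version is not None:
        body["inference-engine-version"] = engine_version
    return body


request_body = build_deployment_request(
    name="example-deployment",
    model={"name": "example-model"},       # assumed placeholder; Model Ref shape is not documented here
    gpu_type="gpua5000",
    gpu_count=2,
    replicas=1,
    engine_version="0.17.0",
    engine_parameters=["--example-flag"],  # placeholder, not a real engine flag
)
print(json.dumps(request_body, indent=2))
```

Validating on the client side is optional; it only surfaces the constraints from the table earlier than a server-side rejection would.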