Create Deployment Request
Deploy an AI model onto a set of GPUs.
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| gpu-count | integer | yes | Number of GPUs (1-8) |
| gpu-type | string | yes | GPU type family (e.g., gpua5000, gpu3080ti) |
| replicas | integer | yes | Number of replicas (>=1) |
| inference-engine-parameters | array[string] | no | Optional extra CLI arguments for the inference engine server |
| model | Model Ref | no | |
| name | string | no | Deployment name |
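
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the helper name `build_deployment_request` is hypothetical, and it assumes the request body is serialized as a JSON object keyed by the property names listed above. The shape of a Model Ref is not described in this section, so the `model` value is passed through unchanged.

```python
import json
from typing import Any, Optional


def build_deployment_request(
    gpu_count: int,
    gpu_type: str,
    replicas: int,
    inference_engine_parameters: Optional[list] = None,
    model: Optional[Any] = None,
    name: Optional[str] = None,
) -> dict:
    """Build a request body using the property names from the table.

    Hypothetical helper: only the ranges documented in the table are enforced.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(gpu_count, bool) or not isinstance(gpu_count, int):
        raise ValueError("gpu-count must be an integer")
    if not 1 <= gpu_count <= 8:
        raise ValueError("gpu-count must be between 1 and 8")
    if not isinstance(gpu_type, str) or not gpu_type:
        raise ValueError("gpu-type must be a non-empty string")
    if isinstance(replicas, bool) or not isinstance(replicas, int) or replicas < 1:
        raise ValueError("replicas must be an integer >= 1")

    body = {
        "gpu-count": gpu_count,
        "gpu-type": gpu_type,
        "replicas": replicas,
    }

    # Optional properties are included only when the caller supplies them.
    if inference_engine_parameters is not None:
        if not all(isinstance(arg, str) for arg in inference_engine_parameters):
            raise ValueError("inference-engine-parameters must be an array of strings")
        body["inference-engine-parameters"] = list(inference_engine_parameters)
    if model is not None:
        # Assumption: the Model Ref structure is undocumented here; pass it through.
        body["model"] = model
    if name is not None:
        body["name"] = name
    return body


# Example: two GPUs of the gpua5000 family, one replica, and a deployment name.
request_body = build_deployment_request(2, "gpua5000", 1, name="demo")
print(json.dumps(request_body, indent=2))
```

Omitting optional properties rather than sending them as null keeps the body limited to what the table marks as required, which is a design choice of this sketch rather than documented behavior.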