Create Deployment Request

Deploys an AI model onto a set of GPUs.

Properties

Property	Type	Required	Description
`gpu-count`	integer	yes	Number of GPUs (1-8)
`gpu-type`	string	yes	GPU type family (e.g., gpua5000, gpu3080ti)
`model`	Model Ref	yes
`name`	string	yes	Deployment name
`replicas`	integer	yes	Number of replicas (>=1)
`inference-engine-parameters`	array[string]	no	Extra command-line arguments passed to the inference engine server
`inference-engine-version`	string	no	Inference engine version. Allowed values: `0.12.0`, `0.15.1`, `0.16.0`, `0.17.0`.
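
Example

The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch in Python, not part of the API itself: the function name `validate_deployment_request` is hypothetical, the shape of the `model` value (a Model Ref) is not specified in this section and is shown only as a placeholder, and the engine argument is a made-up illustration.

```python
import json

# Allowed values and required properties, taken from the table above.
ALLOWED_ENGINE_VERSIONS = {"0.12.0", "0.15.1", "0.16.0", "0.17.0"}
REQUIRED = ("gpu-count", "gpu-type", "model", "name", "replicas")


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_deployment_request(payload: dict) -> list[str]:
    """Return a list of problems found in payload (empty if none).

    Hypothetical helper: it only mirrors the documented constraints
    and does not inspect the contents of the `model` reference.
    """
    errors = []
    for key in REQUIRED:
        if key not in payload:
            errors.append(f"missing required property: {key}")

    if "gpu-count" in payload:
        count = payload["gpu-count"]
        if not (_is_int(count) and 1 <= count <= 8):
            errors.append("gpu-count must be an integer from 1 to 8")

    if "replicas" in payload:
        replicas = payload["replicas"]
        if not (_is_int(replicas) and replicas >= 1):
            errors.append("replicas must be an integer >= 1")

    for key in ("gpu-type", "name"):
        if key in payload and not isinstance(payload[key], str):
            errors.append(f"{key} must be a string")

    if "inference-engine-parameters" in payload:
        params = payload["inference-engine-parameters"]
        if not (isinstance(params, list) and all(isinstance(p, str) for p in params)):
            errors.append("inference-engine-parameters must be an array of strings")

    if "inference-engine-version" in payload:
        if payload["inference-engine-version"] not in ALLOWED_ENGINE_VERSIONS:
            errors.append("inference-engine-version is not an allowed value")

    return errors


request_body = {
    "gpu-count": 2,
    "gpu-type": "gpua5000",
    "model": {"id": "example-model"},  # assumption: the Model Ref shape is not documented here
    "name": "example-deployment",
    "replicas": 1,
    "inference-engine-parameters": ["--example-flag=value"],  # placeholder argument
    "inference-engine-version": "0.17.0",
}

problems = validate_deployment_request(request_body)
print(json.dumps(request_body, indent=2) if not problems else problems)
```

The helper returns every problem it finds rather than stopping at the first one, so a caller can report all invalid properties at once.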

Last updated on March 20, 2026