exo dedicated-inference deployment create

Description

This command creates an AI deployment on dedicated inference servers.

exo dedicated-inference deployment create [NAME] [flags]

Options

Option                               Description
--gpu-count                          Number of GPUs (1-8)
--gpu-type                           GPU type family (e.g., gpua5000, gpu3080ti)
--help, -h                           Help for create
--inference-engine-parameter-help    Show inference engine parameters help
--inference-engine-params            Space-separated inference engine server CLI arguments (e.g., "--gpu-memory-usage=0.8 --max-tokens=4096")
--inference-engine-version           Inference engine version
--model-id                           Model ID (UUID)
--model-name                         Model name (as created)
--replicas                           Number of replicas (>=1)
--zone, -z                           Zone
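As an illustration, a typical invocation might look like the following. The deployment name, model name, and zone are placeholders, and the engine parameters reuse the example values above; substitute values that match your account's resources:

```shell
# Illustrative sketch: "my-deployment", "my-model", and "ch-gva-2" are
# placeholders, not real resources. The quoted engine parameters are passed
# through verbatim to the inference engine server.
exo dedicated-inference deployment create my-deployment \
  --model-name my-model \
  --gpu-type gpua5000 \
  --gpu-count 2 \
  --replicas 1 \
  --zone ch-gva-2 \
  --inference-engine-params "--gpu-memory-usage=0.8 --max-tokens=4096"
```

Note that either `--model-id` or `--model-name` identifies the model to deploy; `--inference-engine-parameter-help` lists the parameters accepted by the engine.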

Options inherited from parent commands

Option                   Description
--config, -C             Specify an alternate config file [env EXOSCALE_CONFIG]
--output-format, -O      Output format (table|json|text); see "exo output --help" for more information
--output-template        Template to use if output format is "text"
--quiet, -Q              Quiet mode (disable non-essential command output)
--use-account, -A        Account to use in config file [env EXOSCALE_ACCOUNT]
