API Reference
Packages
vllm.ai/v1alpha1
Package v1alpha1 contains API Schema definitions for the v1alpha1 API group.
Resource Types
Decision
Decision defines a routing decision based on rule combinations
Appears in: IntelligentRouteSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the unique identifier for this decision |  | MaxLength: 100 MinLength: 1 Required: {} |
| priority integer | Priority defines the priority of this decision (higher values = higher priority). Used when strategy is "priority". | 0 | Maximum: 1000 Minimum: 0 |
| description string | Description provides a human-readable description of this decision |  | MaxLength: 500 |
| signals SignalCombination | Signals defines the signal combination logic |  | Required: {} |
| modelRefs ModelRef array | ModelRefs defines the model references for this decision (currently only one model is supported) |  | MaxItems: 1 MinItems: 1 Required: {} |
| plugins DecisionPlugin array | Plugins defines the plugins to apply for this decision |  | MaxItems: 10 |
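
The fields above can be sketched as a single entry in a decisions list. This is an illustrative fragment only: the decision name, signal name, and model name are hypothetical, and the referenced keyword signal and model are assumed to be defined elsewhere.

```yaml
# Hypothetical entry under spec.decisions of an IntelligentRoute.
- name: urgent-requests            # required, 1-100 characters
  priority: 100                    # 0-1000; used when strategy is "priority"
  description: Route urgent requests to the larger model
  signals:                         # required SignalCombination
    operator: OR
    conditions:
      - type: keyword
        name: urgent               # assumed to name a keyword signal defined in spec.signals
  modelRefs:                       # exactly one entry (MinItems: 1, MaxItems: 1)
    - model: large-model           # hypothetical; must exist in the IntelligentPool
```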
DecisionPlugin
DecisionPlugin defines a plugin configuration for a decision
Appears in: Decision
| Field | Description | Default | Validation |
|---|---|---|---|
| type string | Type is the plugin type (semantic-cache, jailbreak, pii, system_prompt, header_mutation) |  | Enum: [semantic-cache jailbreak pii system_prompt header_mutation] Required: {} |
| configuration RawExtension | Configuration is the plugin-specific configuration as a raw JSON object |  | Schemaless: {} |
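
A rough sketch of a plugins list follows. Because configuration is schemaless, this reference does not define its keys; the key shown under configuration is a placeholder invented for illustration, not a documented option of any plugin.

```yaml
# Hypothetical plugins block inside a decision (at most 10 entries).
plugins:
  - type: system_prompt            # must be one of the enum values in the table
    configuration:                 # raw JSON object; schema is plugin-specific
      example_key: example-value   # placeholder key, not a documented setting
  - type: pii                      # configuration omitted; the table does not mark it as required
```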
DomainSignal
DomainSignal defines a domain category for classification
Appears in: Signals
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the unique identifier for this domain |  | MaxLength: 100 MinLength: 1 Required: {} |
| description string | Description provides a human-readable description of this domain |  | MaxLength: 500 |
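
For illustration, a domains list might look like the fragment below. The domain names are assumptions; this reference does not enumerate the accepted category names.

```yaml
# Hypothetical domains block inside spec.signals.
domains:
  - name: math                     # assumed category name
    description: Mathematics and quantitative reasoning questions
  - name: law                      # description is optional
```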
EmbeddingSignal
EmbeddingSignal defines an embedding-based signal extraction rule
Appears in: Signals
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the unique identifier for this signal |  | MaxLength: 100 MinLength: 1 Required: {} |
| threshold float | Threshold is the similarity threshold for matching (0.0-1.0) |  | Maximum: 1 Minimum: 0 Required: {} |
| candidates string array | Candidates is the list of candidate phrases for semantic matching |  | MaxItems: 100 MinItems: 1 Required: {} |
| aggregationMethod string | AggregationMethod defines how to aggregate multiple candidate similarities | max | Enum: [mean max any] |
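
As a sketch, an embedding signal with several candidate phrases could be written as follows. The signal name, phrases, and threshold value are made up for the example.

```yaml
# Hypothetical embeddings block inside spec.signals.
embeddings:
  - name: billing-question
    threshold: 0.75                # required, between 0 and 1
    candidates:                    # required, 1-100 phrases
      - "how do I update my payment method"
      - "why was I charged twice"
    aggregationMethod: mean        # one of mean, max, any; defaults to max when omitted
```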
IntelligentPool
IntelligentPool defines a pool of models with their configurations
Appears in: IntelligentPoolList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | vllm.ai/v1alpha1 |  |  |
| kind string | IntelligentPool |  |  |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |  |  |
| spec IntelligentPoolSpec |  |  |  |
| status IntelligentPoolStatus |  |  |  |
IntelligentPoolList
IntelligentPoolList contains a list of IntelligentPool
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | vllm.ai/v1alpha1 |  |  |
| kind string | IntelligentPoolList |  |  |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. |  |  |
| items IntelligentPool array |  |  |  |
IntelligentPoolSpec
IntelligentPoolSpec defines the desired state of IntelligentPool
Appears in: IntelligentPool
| Field | Description | Default | Validation |
|---|---|---|---|
| defaultModel string | DefaultModel specifies the default model to use when no specific model is selected |  | MaxLength: 100 MinLength: 1 Required: {} |
| models ModelConfig array | Models defines the list of available models in this pool |  | MaxItems: 100 MinItems: 1 Required: {} |
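
Putting the resource and its spec together, a minimal IntelligentPool manifest might look like the sketch below. The metadata and model names are hypothetical, and the example assumes that defaultModel should name one of the entries in models, which this reference does not state explicitly.

```yaml
apiVersion: vllm.ai/v1alpha1
kind: IntelligentPool
metadata:
  name: example-pool               # hypothetical resource name
spec:
  defaultModel: small-model        # required; assumed to match an entry in models
  models:                          # required, 1-100 entries
    - name: small-model
    - name: large-model
```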
IntelligentPoolStatus
IntelligentPoolStatus defines the observed state of IntelligentPool
Appears in: IntelligentPool
| Field | Description | Default | Validation |
|---|---|---|---|
| conditions Condition array | Conditions represent the latest available observations of the IntelligentPool's state |  |  |
| observedGeneration integer | ObservedGeneration reflects the generation of the most recently observed IntelligentPool |  |  |
| modelCount integer | ModelCount indicates the number of models in the pool |  |  |
IntelligentRoute
IntelligentRoute defines intelligent routing rules and decisions
Appears in: IntelligentRouteList
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | vllm.ai/v1alpha1 |  |  |
| kind string | IntelligentRoute |  |  |
| metadata ObjectMeta | Refer to Kubernetes API documentation for fields of metadata. |  |  |
| spec IntelligentRouteSpec |  |  |  |
| status IntelligentRouteStatus |  |  |  |
IntelligentRouteList
IntelligentRouteList contains a list of IntelligentRoute
| Field | Description | Default | Validation |
|---|---|---|---|
| apiVersion string | vllm.ai/v1alpha1 |  |  |
| kind string | IntelligentRouteList |  |  |
| metadata ListMeta | Refer to Kubernetes API documentation for fields of metadata. |  |  |
| items IntelligentRoute array |  |  |  |
IntelligentRouteSpec
IntelligentRouteSpec defines the desired state of IntelligentRoute
Appears in: IntelligentRoute
| Field | Description | Default | Validation |
|---|---|---|---|
| signals Signals | Signals defines signal extraction rules for routing decisions |  |  |
| decisions Decision array | Decisions defines the routing decisions based on signal combinations |  | MaxItems: 100 MinItems: 1 Required: {} |
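
To show how signals and decisions fit together in one resource, here is a minimal end-to-end sketch. All names are hypothetical, and the model is assumed to be defined in a separate IntelligentPool.

```yaml
apiVersion: vllm.ai/v1alpha1
kind: IntelligentRoute
metadata:
  name: example-route              # hypothetical resource name
spec:
  signals:                         # optional signal definitions
    domains:
      - name: math                 # assumed category name
  decisions:                       # required, 1-100 entries
    - name: math-questions
      signals:
        operator: OR
        conditions:
          - type: domain           # refers to the domain signal declared above
            name: math
      modelRefs:
        - model: large-model       # hypothetical; must exist in the IntelligentPool
```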
IntelligentRouteStatus
IntelligentRouteStatus defines the observed state of IntelligentRoute
Appears in: IntelligentRoute
| Field | Description | Default | Validation |
|---|---|---|---|
| conditions Condition array | Conditions represent the latest available observations of the IntelligentRoute's state |  |  |
| observedGeneration integer | ObservedGeneration reflects the generation of the most recently observed IntelligentRoute |  |  |
| statistics RouteStatistics | Statistics provides statistics about configured decisions and signals |  |  |
KeywordSignal
KeywordSignal defines a keyword-based signal extraction rule
Appears in: Signals
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the unique identifier for this rule (also used as category name) |  | MaxLength: 100 MinLength: 1 Required: {} |
| operator string | Operator defines the logical operator for keywords (AND/OR) |  | Enum: [AND OR] Required: {} |
| keywords string array | Keywords is the list of keywords to match |  | MaxItems: 100 MinItems: 1 Required: {} |
| caseSensitive boolean | CaseSensitive specifies whether keyword matching is case-sensitive | false |  |
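
The two operators can be contrasted with a short sketch. The rule names and keywords are hypothetical, and the comments describe the presumed semantics (OR matches any keyword, AND requires all of them), which this reference does not spell out.

```yaml
# Hypothetical keywords block inside spec.signals.
keywords:
  - name: urgent
    operator: OR                   # presumably matches when any keyword is present
    keywords: ["urgent", "asap", "immediately"]
    caseSensitive: false           # the default, shown here for clarity
  - name: refund-order
    operator: AND                  # presumably matches only when all keywords are present
    keywords: ["refund", "order"]
```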
LoRAConfig
LoRAConfig defines a LoRA adapter configuration
Appears in: ModelConfig
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the unique identifier for this LoRA adapter |  | MaxLength: 100 MinLength: 1 Required: {} |
| description string | Description provides a human-readable description of this LoRA adapter |  | MaxLength: 500 |
ModelConfig
ModelConfig defines the configuration for a single model
Appears in: IntelligentPoolSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| name string | Name is the unique identifier for this model |  | MaxLength: 100 MinLength: 1 Required: {} |
| reasoningFamily string | ReasoningFamily specifies the reasoning syntax family (e.g., "qwen3", "deepseek"). Must be defined in the global static configuration's ReasoningFamilies. |  | MaxLength: 50 |
| pricing ModelPricing | Pricing defines the cost structure for this model |  |  |
| loras LoRAConfig array | LoRAs defines the list of LoRA adapters available for this model |  | MaxItems: 50 |
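
A fuller model entry, combining the optional fields, might look like the sketch below. The model and adapter names and the prices are illustrative; the reference does not specify a currency or unit beyond "per token".

```yaml
# Hypothetical entry under spec.models of an IntelligentPool.
models:
  - name: large-model
    reasoningFamily: qwen3         # must be defined in the global static configuration's ReasoningFamilies
    pricing:
      inputTokenPrice: 0.000001    # illustrative value; must be >= 0
      outputTokenPrice: 0.000002   # illustrative value; must be >= 0
    loras:                         # at most 50 adapters
      - name: legal-adapter
        description: Adapter tuned for legal text
```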
ModelPricing
ModelPricing defines the pricing structure for a model
Appears in: ModelConfig
| Field | Description | Default | Validation |
|---|---|---|---|
| inputTokenPrice float | InputTokenPrice is the cost per input token |  | Minimum: 0 |
| outputTokenPrice float | OutputTokenPrice is the cost per output token |  | Minimum: 0 |
ModelRef
ModelRef defines a model reference without score
Appears in: Decision
| Field | Description | Default | Validation |
|---|---|---|---|
| model string | Model is the name of the model (must exist in IntelligentPool) |  | MaxLength: 100 MinLength: 1 Required: {} |
| loraName string | LoRAName is the name of the LoRA adapter to use (must exist in the model's LoRAs) |  | MaxLength: 100 |
| useReasoning boolean | UseReasoning specifies whether to enable reasoning mode for this model | false |  |
| reasoningDescription string | ReasoningDescription provides context for when to use reasoning |  | MaxLength: 500 |
| reasoningEffort string | ReasoningEffort defines the reasoning effort level (low/medium/high) |  | Enum: [low medium high] |
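
As an illustration, a model reference that selects a LoRA adapter and enables reasoning could be written as follows. The model and adapter names are hypothetical and are assumed to be declared in the referenced IntelligentPool.

```yaml
# Hypothetical modelRefs block inside a decision (exactly one entry).
modelRefs:
  - model: large-model             # must exist in the IntelligentPool
    loraName: legal-adapter        # must exist in that model's loras list
    useReasoning: true             # defaults to false
    reasoningDescription: Multi-step contract analysis benefits from reasoning
    reasoningEffort: high          # one of low, medium, high
```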
RouteStatistics
RouteStatistics provides statistics about the IntelligentRoute configuration
Appears in: IntelligentRouteStatus
| Field | Description | Default | Validation |
|---|---|---|---|
| decisions integer | Decisions indicates the number of decisions |  |  |
| keywords integer | Keywords indicates the number of keyword signals |  |  |
| embeddings integer | Embeddings indicates the number of embedding signals |  |  |
| domains integer | Domains indicates the number of domain signals |  |  |
SignalCombination
SignalCombination defines how to combine multiple signals
Appears in: Decision
| Field | Description | Default | Validation |
|---|---|---|---|
| operator string | Operator defines the logical operator for combining conditions (AND/OR) |  | Enum: [AND OR] Required: {} |
| conditions SignalCondition array | Conditions defines the list of signal conditions |  | MaxItems: 50 MinItems: 1 Required: {} |
SignalCondition
SignalCondition defines a single signal condition
Appears in: SignalCombination
| Field | Description | Default | Validation |
|---|---|---|---|
| type string | Type defines the type of signal (keyword/embedding/domain) |  | Enum: [keyword embedding domain] Required: {} |
| name string | Name is the name of the signal to reference |  | MaxLength: 100 MinLength: 1 Required: {} |
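
A combination that mixes signal types can be sketched as below. The signal names are hypothetical and are assumed to be declared under spec.signals with matching types; with AND, the decision presumably applies only when every listed condition matches.

```yaml
# Hypothetical signals block inside a decision.
signals:
  operator: AND                    # AND or OR
  conditions:                      # 1-50 conditions
    - type: keyword
      name: refund-order           # assumed keyword signal
    - type: embedding
      name: billing-question       # assumed embedding signal
```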
Signals
Signals defines signal extraction rules
Appears in: IntelligentRouteSpec
| Field | Description | Default | Validation |
|---|---|---|---|
| keywords KeywordSignal array | Keywords defines keyword-based signal extraction rules |  | MaxItems: 100 |
| embeddings EmbeddingSignal array | Embeddings defines embedding-based signal extraction rules |  | MaxItems: 100 |
| domains DomainSignal array | Domains defines MMLU domain categories for classification |  | MaxItems: 14 |