vllm.entrypoints.openai.engine.protocol ¶
AnyResponseFormat module-attribute ¶
AnyResponseFormat: TypeAlias = (
ResponseFormat
| StructuralTagResponseFormat
| LegacyStructuralTagResponseFormat
)
AnyStructuralTagResponseFormat module-attribute ¶
AnyStructuralTagResponseFormat: TypeAlias = (
LegacyStructuralTagResponseFormat
| StructuralTagResponseFormat
)
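These aliases let request payloads carry any of the supported response-format shapes through a single Pydantic union. A minimal, self-contained sketch of how such a TypeAlias union validates; the Fmt* models below are simplified stand-ins, not the real vLLM classes:

```python
from typing import Literal, TypeAlias

from pydantic import BaseModel, TypeAdapter


class FmtJson(BaseModel):
    """Stand-in for ResponseFormat (hypothetical simplified shape)."""

    type: Literal["json_schema"]


class FmtTag(BaseModel):
    """Stand-in for StructuralTagResponseFormat (hypothetical)."""

    type: Literal["structural_tag"]


# Mirrors the AnyResponseFormat pattern above: one alias, many shapes.
AnyFmt: TypeAlias = FmtJson | FmtTag

adapter = TypeAdapter(AnyFmt)
print(type(adapter.validate_python({"type": "structural_tag"})).__name__)
# -> FmtTag
```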
DeltaFunctionCall ¶
Bases: BaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
DeltaMessage ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
reasoning_content class-attribute instance-attribute ¶
reasoning_content: str | None = None
Deprecated: use reasoning instead.
tool_calls class-attribute instance-attribute ¶
tool_calls: list[DeltaToolCall] = Field(
default_factory=list
)
handle_deprecated_reasoning_content ¶
Copy reasoning to reasoning_content for backward compatibility.
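A self-contained sketch of the copy described above, using a simplified stand-in for DeltaMessage: the validator mirrors the new reasoning field into the deprecated reasoning_content so older clients keep working:

```python
from pydantic import BaseModel, model_validator


class Delta(BaseModel):
    """Stand-in for DeltaMessage (simplified)."""

    reasoning: str | None = None
    reasoning_content: str | None = None  # deprecated mirror of `reasoning`

    @model_validator(mode="after")
    def handle_deprecated_reasoning_content(self) -> "Delta":
        # Backward compatibility: expose `reasoning` under the old name too.
        if self.reasoning_content is None:
            self.reasoning_content = self.reasoning
        return self


print(Delta(reasoning="step 1").reasoning_content)  # -> step 1
```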
DeltaToolCall ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
ErrorInfo ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
ErrorResponse ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
ExtractedToolCallInformation ¶
Bases: BaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
FunctionCall ¶
FunctionDefinition ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
GenerateRequest ¶
Bases: BaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
cache_salt class-attribute instance-attribute ¶
cache_salt: str | None = Field(
default=None,
description="If specified, the prefix cache will be salted with the provided string to prevent an attacker to guess prompts in multi-user environments. The salt should be random, protected from access by 3rd parties, and long enough to be unpredictable (e.g., 43 characters base64-encoded, corresponding to 256 bit).",
)
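The description above pins down the recommended salt shape; a short sketch of producing one (32 random bytes, URL-safe base64 without padding, which is exactly 43 characters and 256 bits of entropy):

```python
import base64
import os

# 32 bytes = 256 bits; base64 without '=' padding yields 43 characters.
cache_salt = base64.urlsafe_b64encode(os.urandom(32)).rstrip(b"=").decode()
assert len(cache_salt) == 43
```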
features class-attribute instance-attribute ¶
features: str | None = None
The processed multi-modal (MM) inputs for the model.
kv_transfer_params class-attribute instance-attribute ¶
kv_transfer_params: dict[str, Any] | None = Field(
default=None,
description="KVTransfer parameters used for disaggregated serving.",
)
priority class-attribute instance-attribute ¶
priority: int = Field(
default=0,
description="The priority of the request (lower means earlier handling; default: 0). Any priority other than 0 will raise an error if the served model does not use priority scheduling.",
)
request_id class-attribute instance-attribute ¶
request_id: str = Field(
default_factory=random_uuid,
description="The request_id related to this request. If the caller does not set it, a random_uuid will be generated. This id is used through out the inference process and return in response.",
)
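A self-contained sketch of the default_factory behavior above: when the caller omits request_id, a fresh id is minted per instance. The random_uuid function here is a simplified stand-in for vLLM's helper of the same name:

```python
import uuid

from pydantic import BaseModel, Field


def random_uuid() -> str:
    """Stand-in for vLLM's random_uuid helper (assumed behavior)."""
    return str(uuid.uuid4())


class Req(BaseModel):
    """Stand-in for GenerateRequest (only the id field is modeled)."""

    request_id: str = Field(default_factory=random_uuid)


print(Req().request_id)                    # auto-generated per request
print(Req(request_id="req-1").request_id)  # caller-supplied id wins
```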
sampling_params instance-attribute ¶
sampling_params: SamplingParams
The sampling parameters for the model.
JsonSchemaResponseFormat ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
json_schema class-attribute instance-attribute ¶
LegacyStructuralTag ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
LegacyStructuralTagResponseFormat ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
LogitsProcessorConstructor ¶
Bases: BaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
ModelCard ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
created class-attribute instance-attribute ¶
permission class-attribute instance-attribute ¶
permission: list[ModelPermission] = Field(
default_factory=list
)
ModelList ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
ModelPermission ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
created class-attribute instance-attribute ¶
id class-attribute instance-attribute ¶
id: str = Field(
default_factory=lambda: f"modelperm-{random_uuid()}"
)
OpenAIBaseModel ¶
Bases: BaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
__log_extra_fields__ classmethod ¶
Source code in vllm/entrypoints/openai/engine/protocol.py
PromptTokenUsageInfo ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
RequestResponseMetadata ¶
Bases: BaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
ResponseFormat ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
json_schema class-attribute instance-attribute ¶
json_schema: JsonSchemaResponseFormat | None = None
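The json_schema field above carries an OpenAI-style structured-output schema. A sketch of the wire payload it models; the "type" discriminator and the field names inside "json_schema" follow the OpenAI structured-outputs convention and are assumptions here, not shown in this excerpt:

```python
# Hypothetical request fragment for illustration only.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "answer",
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "required": ["answer"],
        },
    },
}
```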
StreamOptions ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
StructuralTagResponseFormat ¶
ToolCall ¶
UsageInfo ¶
Bases: OpenAIBaseModel
Source code in vllm/entrypoints/openai/engine/protocol.py
prompt_tokens_details class-attribute instance-attribute ¶
prompt_tokens_details: PromptTokenUsageInfo | None = None
get_logits_processors ¶
get_logits_processors(
processors: LogitsProcessors | None, pattern: str | None
) -> list[Any] | None
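Based only on the signature above, this helper appears to resolve caller-supplied logits processors while gating which names are allowed via pattern. A self-contained, simplified stand-in of that behavior (not vLLM's actual implementation):

```python
import re
from importlib import import_module
from typing import Any


def get_logits_processors_sketch(
    processors: list[str] | None, pattern: str | None
) -> list[Any] | None:
    """Resolve dotted names to callables, allowing only pattern matches."""
    if processors is None:
        return None
    if pattern is None:
        raise ValueError("logits processors are not enabled on this server")
    resolved: list[Any] = []
    for qualname in processors:
        if not re.fullmatch(pattern, qualname):
            raise ValueError(f"{qualname!r} is not allowed by the pattern")
        module_name, _, attr = qualname.rpartition(".")
        resolved.append(getattr(import_module(module_name), attr))
    return resolved


print(get_logits_processors_sketch(["math.isnan"], r"math\..*"))
```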