pymllm.server.launch¶
pymllm HTTP server – RESTful API entry point.
This module implements a FastAPI-based HTTP server that wraps the pymllm
Engine and exposes OpenAI-compatible and native REST endpoints.
Endpoints¶
GET /health– liveness probeGET /v1/models– list served models (OpenAI-compatible)POST /generate– native generate (streaming via SSE)POST /v1/completions– OpenAI-compatible completionsPOST /v1/chat/completions– OpenAI-compatible chat completionsGET /model_info– model metadataGET /server_info– runtime config dumpPOST /flush_cache– flush internal cachesPOST /abort_request– cancel a running request
Attributes¶
Classes¶
Body for |
|
!!! abstract "Usage Documentation" |
|
!!! abstract "Usage Documentation" |
|
!!! abstract "Usage Documentation" |
|
!!! abstract "Usage Documentation" |
|
!!! abstract "Usage Documentation" |
|
!!! abstract "Usage Documentation" |
|
OpenAI |
|
OpenAI |
|
!!! abstract "Usage Documentation" |
Functions¶
|
Startup / shutdown hooks for the FastAPI app. |
|
|
|
Liveness / readiness probe. Returns 503 if subprocesses died. |
Return basic model metadata. |
|
Dump runtime server configuration (sensitive fields redacted). |
|
OpenAI-compatible model listing. |
|
|
OpenAI-compatible single model retrieval. |
|
Native generation endpoint. Supports SSE streaming. |
|
OpenAI-compatible text completion endpoint. |
|
OpenAI-compatible chat completion endpoint with reasoning & tool-call parsing. |
Cache flush (not yet implemented). |
|
|
Abort a running request by rid. |
Launch the pymllm Engine then start the uvicorn HTTP server. |
|
|
CLI entry point. |
Module Contents¶
- pymllm.server.launch.logger¶
- class pymllm.server.launch.GenerateRequest(/, **data)¶
Bases:
pydantic.BaseModelBody for
POST /generate.- Parameters:
data (Any)
- text: List[str] | str | None = None¶
- input_ids: List[List[int]] | List[int] | None = None¶
- sampling_params: List[Dict[str, Any]] | Dict[str, Any] | None = None¶
- image_data: Any | None = None¶
- audio_data: Any | None = None¶
- video_data: Any | None = None¶
- return_logprob: List[bool] | bool | None = None¶
- logprob_start_len: List[int] | int | None = None¶
- top_logprobs_num: List[int] | int | None = None¶
- lora_path: List[str | None] | str | None = None¶
- session_params: List[Dict[str, Any]] | Dict[str, Any] | None = None¶
- stream: bool = False¶
- rid: List[str] | str | None = None¶
- model_config¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pymllm.server.launch.ImageUrl(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- url: str¶
- detail: str | None = 'auto'¶
- class pymllm.server.launch.ContentPart(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- type: str¶
- text: str | None = None¶
- class pymllm.server.launch.ChatMessage(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- role: str¶
- content: str | List[ContentPart] | None = None¶
- name: str | None = None¶
- tool_calls: List[Any] | None = None¶
- tool_call_id: str | None = None¶
- model_config¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pymllm.server.launch.StreamOptions(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- include_usage: bool | None = False¶
- continuous_usage_stats: bool | None = False¶
- class pymllm.server.launch.ToolFunction(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- name: str¶
- description: str | None = None¶
- parameters: Dict[str, Any] | None = None¶
- class pymllm.server.launch.Tool(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- type: str = 'function'¶
- function: ToolFunction¶
- class pymllm.server.launch.ChatCompletionRequest(/, **data)¶
Bases:
pydantic.BaseModelOpenAI
POST /v1/chat/completionsbody.- Parameters:
data (Any)
- model: str = ''¶
- messages: List[ChatMessage]¶
- temperature: float | None = None¶
- top_p: float | None = None¶
- top_k: int | None = None¶
- max_tokens: int | None = None¶
- max_completion_tokens: int | None = None¶
- stream: bool = False¶
- stream_options: StreamOptions | None = None¶
- stop: str | List[str] | None = None¶
- n: int = 1¶
- frequency_penalty: float | None = None¶
- presence_penalty: float | None = None¶
- repetition_penalty: float | None = None¶
- seed: int | None = None¶
- logprobs: bool | None = None¶
- top_logprobs: int | None = None¶
- user: str | None = None¶
- tool_choice: str | Dict[str, Any] | None = None¶
- separate_reasoning: bool = True¶
- stream_reasoning: bool = True¶
- chat_template_kwargs: Dict[str, Any] | None = None¶
- model_config¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pymllm.server.launch.CompletionRequest(/, **data)¶
Bases:
pydantic.BaseModelOpenAI
POST /v1/completionsbody.- Parameters:
data (Any)
- model: str = ''¶
- prompt: str | List[str]¶
- temperature: float | None = None¶
- top_p: float | None = None¶
- top_k: int | None = None¶
- max_tokens: int | None = None¶
- stream: bool = False¶
- stream_options: StreamOptions | None = None¶
- stop: str | List[str] | None = None¶
- n: int = 1¶
- frequency_penalty: float | None = None¶
- presence_penalty: float | None = None¶
- repetition_penalty: float | None = None¶
- seed: int | None = None¶
- echo: bool = False¶
- logprobs: int | None = None¶
- user: str | None = None¶
- model_config¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class pymllm.server.launch.AbortRequest(/, **data)¶
Bases:
pydantic.BaseModel- !!! abstract “Usage Documentation”
[Models](../concepts/models.md)
A base class for creating Pydantic models.
- Parameters:
data (Any)
- __class_vars__¶
The names of the class variables defined on the model.
- __private_attributes__¶
Metadata about the private attributes of the model.
- __signature__¶
The synthesized __init__ [Signature][inspect.Signature] of the model.
- __pydantic_complete__¶
Whether model building is completed, or if there are still undefined fields.
- __pydantic_core_schema__¶
The core schema of the model.
- __pydantic_custom_init__¶
Whether the model has a custom __init__ function.
- __pydantic_decorators__¶
Metadata containing the decorators defined on the model. This replaces Model.__validators__ and Model.__root_validators__ from Pydantic V1.
- __pydantic_generic_metadata__¶
Metadata for generic models; contains data used for a similar purpose to __args__, __origin__, __parameters__ in typing-module generics. May eventually be replaced by these.
- __pydantic_parent_namespace__¶
Parent namespace of the model, used for automatic rebuilding of models.
- __pydantic_post_init__¶
The name of the post-init method for the model, if defined.
- __pydantic_root_model__¶
Whether the model is a [RootModel][pydantic.root_model.RootModel].
- __pydantic_serializer__¶
The pydantic-core SchemaSerializer used to dump instances of the model.
- __pydantic_validator__¶
The pydantic-core SchemaValidator used to validate instances of the model.
- __pydantic_fields__¶
A dictionary of field names and their corresponding [FieldInfo][pydantic.fields.FieldInfo] objects.
- __pydantic_computed_fields__¶
A dictionary of computed field names and their corresponding [ComputedFieldInfo][pydantic.fields.ComputedFieldInfo] objects.
- __pydantic_extra__¶
A dictionary containing extra values, if [extra][pydantic.config.ConfigDict.extra] is set to ‘allow’.
- __pydantic_fields_set__¶
The names of fields explicitly set during instantiation.
- __pydantic_private__¶
Values of private attributes set on the model instance.
- rid: str | None = None¶
- async pymllm.server.launch.lifespan(app)¶
Startup / shutdown hooks for the FastAPI app.
- Parameters:
app (fastapi.FastAPI)
- pymllm.server.launch.app¶
- async pymllm.server.launch.http_exception_handler(request, exc)¶
- Parameters:
request (fastapi.Request)
exc (fastapi.HTTPException)
- async pymllm.server.launch.health()¶
Liveness / readiness probe. Returns 503 if subprocesses died.
- async pymllm.server.launch.model_info()¶
Return basic model metadata.
- async pymllm.server.launch.server_info()¶
Dump runtime server configuration (sensitive fields redacted).
- async pymllm.server.launch.list_models()¶
OpenAI-compatible model listing.
- async pymllm.server.launch.retrieve_model(model_id)¶
OpenAI-compatible single model retrieval.
- Parameters:
model_id (str)
- async pymllm.server.launch.generate(obj, request)¶
Native generation endpoint. Supports SSE streaming.
- Parameters:
obj (GenerateRequest)
request (fastapi.Request)
- async pymllm.server.launch.openai_completions(obj, request)¶
OpenAI-compatible text completion endpoint.
- Parameters:
obj (CompletionRequest)
request (fastapi.Request)
- async pymllm.server.launch.openai_chat_completions(obj, request)¶
OpenAI-compatible chat completion endpoint with reasoning & tool-call parsing.
- Parameters:
obj (ChatCompletionRequest)
request (fastapi.Request)
- async pymllm.server.launch.flush_cache()¶
Cache flush (not yet implemented).
- async pymllm.server.launch.abort_request(obj)¶
Abort a running request by rid.
- Parameters:
obj (AbortRequest)
- pymllm.server.launch.launch_server()¶
Launch the pymllm Engine then start the uvicorn HTTP server.
It first boots all engine subprocesses (tokenizer, scheduler, model-runner, detokenizer) and then hands off to uvicorn to serve HTTP traffic.
- pymllm.server.launch.main()¶
CLI entry point.