pymllm.engine.launch

Module Contents

pymllm.engine.launch.HAS_BANNER_LIBS = True
pymllm.engine.launch.logger
class pymllm.engine.launch.Engine
property is_healthy: bool

True if the engine and all of its subprocesses are alive.

Return type:

bool

launch()
Return type:

None
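The launch-then-health-check lifecycle above can be sketched with a minimal stand-in. `pymllm` may not be installed in every environment, and the real `Engine` constructor arguments are not documented here, so a stub class mirrors just the surface shown above (`launch()` and the `is_healthy` property); it is an illustration of the call order, not the real implementation.

```python
class StubEngine:
    """Stand-in mirroring the Engine surface documented above."""

    def __init__(self):
        self._alive = False

    def launch(self) -> None:
        # The real launch() starts the engine subprocesses; the stub just
        # flips a flag so the health check below has something to report.
        self._alive = True

    @property
    def is_healthy(self) -> bool:
        # True if the engine and all subprocesses are alive.
        return self._alive


engine = StubEngine()
engine.launch()
assert engine.is_healthy
```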

generate(prompt=None, sampling_params=None, input_ids=None, image_data=None, audio_data=None, video_data=None, return_logprob=None, logprob_start_len=None, top_logprobs_num=None, lora_path=None, session_params=None, stream=False, rid=None, **kwargs)

Synchronous, non-streaming generation entry point.

Accepts a single prompt (str) or a batch (List[str]). Returns a single result dict for a single input and a list of result dicts, in input order, for a batch input.

Parameters:
  • prompt (Optional[Union[List[str], str]])

  • sampling_params (Optional[Union[List[Dict[str, Any]], Dict[str, Any]]])

  • input_ids (Optional[Union[List[List[int]], List[int]]])

  • image_data (Optional[Any])

  • audio_data (Optional[Any])

  • video_data (Optional[Any])

  • return_logprob (Optional[Union[List[bool], bool]])

  • logprob_start_len (Optional[Union[List[int], int]])

  • top_logprobs_num (Optional[Union[List[int], int]])

  • lora_path (Optional[Union[List[Optional[str]], str]])

  • session_params (Optional[Union[List[Dict[str, Any]], Dict[str, Any]]])

  • stream (bool)

  • rid (Optional[Union[List[str], str]])

Return type:

Union[Dict[str, Any], List[Dict[str, Any]]]
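The single-versus-batch return convention above (one dict for a `str` prompt, a list of dicts in input order for a `List[str]`) can be sketched with a toy function. The `"text"` result key and the uppercasing are illustrative assumptions, not part of the documented API; only the input/output shapes match the signature above.

```python
from typing import Any, Dict, List, Union


def toy_generate(
    prompt: Union[str, List[str]],
) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Mimic generate()'s return convention: a dict for a single prompt,
    a list of dicts (input order preserved) for a batch."""

    def one(p: str) -> Dict[str, Any]:
        # "text" is an illustrative key, not taken from the docs above.
        return {"text": p.upper()}

    if isinstance(prompt, str):
        return one(prompt)
    return [one(p) for p in prompt]


assert toy_generate("hi") == {"text": "HI"}
assert toy_generate(["a", "b"]) == [{"text": "A"}, {"text": "B"}]
```

Callers can therefore branch on `isinstance(result, list)` to handle both shapes uniformly.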

async generate_async(prompt=None, sampling_params=None, input_ids=None, image_data=None, audio_data=None, video_data=None, return_logprob=None, logprob_start_len=None, top_logprobs_num=None, lora_path=None, session_params=None, stream=False, rid=None, **kwargs)

Asynchronous generation entry point.

For a single request with stream=False, the iterator yields one final result dict; with stream=True, it yields incremental chunks.

For a batch request, the iterator yields each sub-request's final result as it completes (completion order is not guaranteed); in streaming mode, incremental chunks from all sub-requests are interleaved.

Parameters:
  • prompt (Optional[Union[List[str], str]])

  • sampling_params (Optional[Union[List[Dict[str, Any]], Dict[str, Any]]])

  • input_ids (Optional[Union[List[List[int]], List[int]]])

  • image_data (Optional[Any])

  • audio_data (Optional[Any])

  • video_data (Optional[Any])

  • return_logprob (Optional[Union[List[bool], bool]])

  • logprob_start_len (Optional[Union[List[int], int]])

  • top_logprobs_num (Optional[Union[List[int], int]])

  • lora_path (Optional[Union[List[Optional[str]], str]])

  • session_params (Optional[Union[List[Dict[str, Any]], Dict[str, Any]]])

  • stream (bool)

  • rid (Optional[Union[List[str], str]])

Return type:

AsyncIterator[Dict[str, Any]]
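The `AsyncIterator[Dict[str, Any]]` shape above is consumed with `async for`. A toy async generator below mimics the documented stream=True/False behavior for a single request; the `"chunk"` and `"text"` keys are illustrative assumptions, not documented result fields.

```python
import asyncio
from typing import Any, AsyncIterator, Dict


async def toy_generate_async(
    prompt: str, stream: bool = False
) -> AsyncIterator[Dict[str, Any]]:
    """Mimic generate_async()'s shape: with stream=True yield incremental
    chunks; otherwise yield one final result dict."""
    if stream:
        for tok in prompt.split():
            yield {"chunk": tok}  # illustrative key only
    else:
        yield {"text": prompt}


async def main() -> list:
    chunks = []
    async for item in toy_generate_async("hello async world", stream=True):
        chunks.append(item)
    return chunks


chunks = asyncio.run(main())
assert [c["chunk"] for c in chunks] == ["hello", "async", "world"]
```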

shutdown()

Terminate all subprocesses.

Return type:

None
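Because shutdown() terminates subprocesses, a common pattern is to guarantee it runs even if generation raises. Sketched below with a stand-in class (the real `Engine` constructor arguments are not documented above, so a stub is used):

```python
class FakeEngine:
    """Stand-in for Engine, tracking only alive/terminated state."""

    def __init__(self):
        self.alive = False

    def launch(self):
        self.alive = True

    def shutdown(self):
        # The real shutdown() terminates all subprocesses.
        self.alive = False


engine = FakeEngine()
engine.launch()
try:
    pass  # generate(...) calls would go here
finally:
    engine.shutdown()  # cleanup runs even if generation raises

assert not engine.alive
```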