MLLM
2.0.0 documentation
pymllm.executor

Executor module: model loading, forward pass, and sampling.

Submodules

  • pymllm.executor.cuda_graph_runner
  • pymllm.executor.model_runner
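The load → forward → sample responsibility described above can be illustrated with a minimal, framework-agnostic sketch. Every name below (`ToyModelRunner`, `greedy_sample`, `generate`) is hypothetical and exists only for illustration — this is not pymllm's actual API; see `pymllm.executor.model_runner` for the real implementation.

```python
# Conceptual sketch of an executor: load weights, run a forward pass,
# sample the next token. All names are hypothetical, NOT pymllm's API.
import math
import random

class ToyModelRunner:
    """Hypothetical stand-in for a model runner."""

    def __init__(self, vocab_size: int, seed: int = 0):
        # "Model loading": initialise a tiny random bigram table as weights.
        rng = random.Random(seed)
        self.vocab_size = vocab_size
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(vocab_size)]
            for _ in range(vocab_size)
        ]

    def forward(self, token_id: int) -> list[float]:
        # "Forward pass": look up the logits row for the current token
        # and normalise it with a numerically stable softmax.
        logits = self.weights[token_id]
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

def greedy_sample(probs: list[float]) -> int:
    # "Sampling": greedy argmax over the probability distribution.
    return max(range(len(probs)), key=probs.__getitem__)

def generate(runner: ToyModelRunner, start: int, steps: int) -> list[int]:
    # Autoregressive decode loop driving the three stages together.
    tokens = [start]
    for _ in range(steps):
        probs = runner.forward(tokens[-1])
        tokens.append(greedy_sample(probs))
    return tokens
```

In a real executor the forward pass would run on an accelerator, and `pymllm.executor.cuda_graph_runner` would additionally capture the decode step into a CUDA graph so repeated steps replay without per-kernel launch overhead.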
Copyright © 2024-2025, MLLM Contributors