mllm API

The mllm.hpp header is the main entry point of the library: it pulls in all essential MLLM components and provides core functionality for model loading, context management, asynchronous execution, and general utilities.

#include "mllm.hpp"

Core Functions

void mllm::initializeContext()

Initialize the MLLM context, register backends, and set up memory management.

void mllm::shutdownContext()

Shutdown the MLLM context and clean up resources.
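
A minimal lifecycle sketch. The body between the two calls is a placeholder; the MLLM_MAIN macro described below can generate this boilerplate instead of a hand-written main().

#include "mllm.hpp"

int main() {
  mllm::initializeContext();  // register backends and set up memory management

  // ... load parameters, build modules, run inference ...

  mllm::shutdownContext();    // release context-owned resources before exit
  return 0;
}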

void mllm::setRandomSeed(uint64_t seed)

Set the random seed for reproducible results.

Parameters:

seed – Random seed value

void mllm::setMaximumNumThreads(uint32_t num_threads)

Set the maximum number of threads for parallel execution.

Parameters:

num_threads – Maximum number of threads
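
Typical configuration at start-up; the concrete values below are placeholders.

mllm::setRandomSeed(42);        // fixed seed for reproducible results
mllm::setMaximumNumThreads(8);  // cap parallel execution at 8 threads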

void mllm::setPrintPrecision(int precision)

Set the floating-point precision for printing tensors.

Parameters:

precision – Number of decimal places

void mllm::setPrintMaxElementsPerDim(int max_elements)

Set the maximum number of elements to print per dimension.

Parameters:

max_elements – Maximum elements per dimension
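
Example print configuration; the values are arbitrary and apply to subsequent tensor printing.

mllm::setPrintPrecision(4);          // print tensor values with 4 decimal places
mllm::setPrintMaxElementsPerDim(6);  // show at most 6 elements per dimension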

void mllm::memoryReport()

Print a memory usage report.

bool mllm::isOpenCLAvailable()

Check if OpenCL backend is available.

Returns:

True if OpenCL is available, false otherwise

bool mllm::isQnnAvailable()

Check if QNN backend is available.

Returns:

True if QNN is available, false otherwise
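
A simple backend-selection sketch built on the two availability checks; the branch bodies are placeholders.

if (mllm::isOpenCLAvailable()) {
  // prefer the OpenCL backend (e.g. a mobile GPU)
} else if (mllm::isQnnAvailable()) {
  // fall back to the QNN backend (Qualcomm NPU)
} else {
  // stay on the CPU backend
}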

void mllm::cleanThisThread()

Clean up thread-local resources.

SessionTCB::ptr_t mllm::thisThread()

Get the current thread’s session context.

Returns:

Shared pointer to SessionTCB
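
A thread-local usage sketch. The members of SessionTCB are outside the scope of this header reference, so the handle is only obtained here and the thread-local state released at the end.

auto session = mllm::thisThread();  // SessionTCB::ptr_t for the calling thread

// ... per-thread work using the session ...

mllm::cleanThisThread();            // release thread-local resources before the thread exits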

Parameter File Functions

ParameterFile::ptr_t mllm::load(const std::string &file_name, ModelFileVersion version = ModelFileVersion::kV1, DeviceTypes map_2_device = kCPU)

Load a parameter file.

Parameters:
  • file_name – Path to the parameter file

  • version – Model file version (default: kV1)

  • map_2_device – Target device for loading (default: kCPU)

Returns:

Shared pointer to ParameterFile

void mllm::save(const std::string &file_name, const ParameterFile::ptr_t &parameter_file, ModelFileVersion version = ModelFileVersion::kV1, DeviceTypes map_2_device = kCPU)

Save parameters to a file.

Parameters:
  • file_name – Path to save the parameter file

  • parameter_file – ParameterFile to save

  • version – Model file version (default: kV1)

  • map_2_device – Target device for saving (default: kCPU)
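
A round-trip sketch using the defaults (kV1, kCPU); the file names are placeholders.

auto params = mllm::load("model.mllm");  // ParameterFile::ptr_t
mllm::save("model_copy.mllm", params);   // write the same parameters back out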

Utility Functions

template<typename ...Args>
void mllm::print(const Args&... args)

Print arguments to stdout with automatic formatting.

Parameters:

args – Arguments to print
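
Example call; any mix of printable values can be passed.

mllm::print("generated tokens:", 128, "latency (ms):", 42.7);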

Testing Functions

mllm::test::AllCloseResult mllm::test::allClose(const Tensor &a, const Tensor &b, float rtol = 1e-5, float atol = 1e-5, bool equal_nan = false)

Check if two tensors are close within tolerance.

Parameters:
  • a – First tensor

  • b – Second tensor

  • rtol – Relative tolerance (default: 1e-5)

  • atol – Absolute tolerance (default: 1e-5)

  • equal_nan – Whether NaNs should be considered equal (default: false)

Returns:

AllCloseResult containing comparison results

class mllm::test::AllCloseResult

Result structure returned by the allClose function.

bool mllm::test::AllCloseResult::is_close

True if tensors are close within tolerance

size_t mllm::test::AllCloseResult::total_elements

Total number of elements compared

size_t mllm::test::AllCloseResult::mismatched_elements

Number of elements that don’t match within tolerance

float mllm::test::AllCloseResult::max_absolute_diff

Maximum absolute difference

float mllm::test::AllCloseResult::max_relative_diff

Maximum relative difference
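
A comparison sketch. Here a and b are assumed to be previously computed Tensor values (for example, a reference output and a backend output).

auto result = mllm::test::allClose(a, b, /*rtol=*/1e-5f, /*atol=*/1e-5f);
if (!result.is_close) {
  mllm::print("mismatched", result.mismatched_elements, "of", result.total_elements,
              "max abs diff:", result.max_absolute_diff,
              "max rel diff:", result.max_relative_diff);
}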

Async Execution Functions

template<typename __Module, typename ...__Args>
std::pair<TaskResult::sender_t, Task::ptr_t> mllm::async::fork(__Module &module, __Args&&... args)

Fork a task for asynchronous execution.

Parameters:
  • module – Module to execute

  • args – Arguments for module execution

Returns:

Pair of sender and task pointer

std::vector<Tensor> mllm::async::wait(std::pair<TaskResult::sender_t, Task::ptr_t> &sender)

Wait for a single asynchronous task to complete.

Parameters:

sender – Sender-task pair

Returns:

Output tensors

template<typename ...__Args>
std::array<std::vector<Tensor>, sizeof...(__Args)> mllm::async::wait(__Args&&... args)

Wait for multiple asynchronous tasks to complete.

Parameters:

args – Sender-task pairs

Returns:

Array of output tensor vectors, one per task
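
A fork/wait sketch. Here module_a, module_b, and input are placeholders for already constructed modules and an input Tensor, and the array entries are assumed to follow the order of the arguments passed to wait.

auto task_a = mllm::async::fork(module_a, input);
auto task_b = mllm::async::fork(module_b, input);

// Block until both tasks finish; each array entry holds one task's output tensors.
auto results = mllm::async::wait(task_a, task_b);
std::vector<mllm::Tensor> out_a = results[0];
std::vector<mllm::Tensor> out_b = results[1];

For a single task, the first overload of wait returns the output tensors directly.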

Signal Handling

void mllm::__setup_signal_handler()

Set up signal handlers for graceful shutdown on interruption.

void mllm::__signal_handler(int signal)

Signal handler function.

Parameters:

signal – Signal number

template<typename Func>
int mllm::__mllm_exception_main(Func &&func)

Exception-safe main function wrapper.

Parameters:

func – User function to execute

Returns:

Exit code

const char *mllm::signal_description(int signal)

Get human-readable description of a signal.

Parameters:

signal – Signal number

Returns:

Description string

Macros

MLLM_MAIN(...)

Main-function macro that sets up signal handlers, initializes the MLLM context, and provides exception safety.
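
A usage sketch; the brace-block form accepted by this variadic macro is an assumption based on __mllm_exception_main taking a callable, not a documented guarantee.

// Assumed usage: the macro wraps the program body, so no hand-written main(),
// context initialization, or signal setup is needed.
MLLM_MAIN({
  mllm::print("hello from mllm");
});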

Performance Functions

void mllm::perf::warmup(const ParameterFile::ptr_t &params)

Warm up the model with the given parameters.

Parameters:

params – Parameters for warmup
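
A warmup sketch before benchmarking; the file name is a placeholder.

auto params = mllm::load("model.mllm");
mllm::perf::warmup(params);  // warm up with the loaded parameters before measuring performance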