runner
======

.. py:module:: runner


Classes
-------

.. autoapisummary::

   runner.Qwen3Quantizer


Functions
---------

.. autoapisummary::

   runner.freeze_qwen3_rmsnorm_weight
   runner.freeze_qwen3_linear_weight
   runner.disable_qdq_observer
   runner.enable_qdq_observer
   runner.convert_weight


Module Contents
---------------

.. py:function:: freeze_qwen3_rmsnorm_weight(m)

.. py:function:: freeze_qwen3_linear_weight(m)

.. py:function:: disable_qdq_observer(m)

.. py:function:: enable_qdq_observer(m)

.. py:function:: convert_weight(m)

.. py:class:: Qwen3Quantizer(model_path, mllm_qualcomm_max_length=2048)

   .. py:attribute:: tokenizer


   .. py:attribute:: model


   .. py:attribute:: mllm_qualcomm_max_length
      :value: 2048


   .. py:method:: freeze_activation()


   .. py:method:: enable_activation_update()


   .. py:method:: compile()


   .. py:method:: infer(prompt)


   .. py:method:: calibrate(num_samples=64, max_seq_length=512)

      Perform calibration using Wikipedia dataset (PTQ)

      :param num_samples: Number of samples for calibration
      :param max_seq_length: Maximum length for each sample (not exceeding mllm_qualcomm_max_length)


   .. py:method:: convert()
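
The ``calibrate`` parameters state that ``max_seq_length`` should not exceed ``mllm_qualcomm_max_length``. The following is a minimal sketch of how that constraint could be applied to a batch of tokenized samples. The helpers ``effective_calibration_length`` and ``truncate_samples`` are hypothetical names introduced here for illustration; they are not part of ``runner``, and the actual module may enforce the limit differently.

.. code-block:: python

   # Hypothetical helpers, not part of the runner module.
   # Defaults mirror the documented signatures of Qwen3Quantizer and calibrate().

   def effective_calibration_length(max_seq_length=512,
                                    mllm_qualcomm_max_length=2048):
       """Cap the per-sample length at the model-wide maximum."""
       if max_seq_length <= 0:
           raise ValueError("max_seq_length must be positive")
       return min(max_seq_length, mllm_qualcomm_max_length)


   def truncate_samples(token_ids_per_sample, num_samples=64,
                        max_seq_length=512, mllm_qualcomm_max_length=2048):
       """Keep at most num_samples samples, each cut to the capped length."""
       limit = effective_calibration_length(max_seq_length,
                                            mllm_qualcomm_max_length)
       return [ids[:limit] for ids in token_ids_per_sample[:num_samples]]

With these assumptions, a requested ``max_seq_length`` larger than ``mllm_qualcomm_max_length`` falls back to the model limit instead of raising an error; a stricter implementation could raise instead.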