pymllm.backends.qualcomm.transformers.core.qdq¶
Classes¶
- ActivationQDQ: General activation Quantization-DeQuantization (QDQ) module.
Module Contents¶
- class pymllm.backends.qualcomm.transformers.core.qdq.ActivationQDQ(bits=8, qscheme=torch.per_tensor_affine)¶
Bases: torch.nn.Module

General activation Quantization-DeQuantization (QDQ) module. Supports both symmetric and asymmetric (affine) quantization. Uses torch.qint32 as a unified storage type to support various bit-widths.
- bits = 8¶
- qscheme¶
- dtype¶
- fake_quant¶
- forward(x)¶
- enable_observer()¶
Enable tracking of min/max values to update scale and zero_point.
- disable_observer()¶
Freeze scale and zero_point calculation.
- enable_fakequant()¶
Enable simulation of quantization error.
- disable_fakequant()¶
Disable quantization simulation (act as identity).
- extra_repr()¶
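
The fake-quant step described above (quantize, then immediately dequantize, so downstream layers see the rounding error that real integer inference would introduce) can be sketched in plain Python. This is a hypothetical, stdlib-only illustration of the affine and symmetric schemes the class supports, not the actual ActivationQDQ implementation; the `fake_quantize` helper and its signature are assumptions for this example.

```python
# Hypothetical sketch of fake quantization (quantize-dequantize), stdlib only.
# Illustrates the affine (asymmetric) and symmetric schemes; NOT the real
# ActivationQDQ code, which wraps PyTorch fake-quant modules.

def fake_quantize(values, bits=8, symmetric=False):
    """Quantize then dequantize, returning values carrying quantization error."""
    if symmetric:
        # Symmetric: zero_point fixed at 0, range centered on zero.
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        max_abs = max(abs(v) for v in values) or 1.0
        scale = max_abs / qmax
        zero_point = 0
    else:
        # Affine: scale and zero_point derived from the observed min/max,
        # analogous to what enable_observer() tracks.
        qmin, qmax = 0, 2 ** bits - 1
        lo, hi = min(values), max(values)
        scale = (hi - lo) / (qmax - qmin) or 1.0
        zero_point = round(qmin - lo / scale)

    def clamp(q):
        return max(qmin, min(qmax, q))

    # Quantize (round + clamp to the integer grid), then dequantize.
    return [(clamp(round(v / scale) + zero_point) - zero_point) * scale
            for v in values]

acts = [0.0, 0.1, 0.5, 1.0]
dq = fake_quantize(acts, bits=8)
# Each dequantized value lies within half a quantization step of its input.
```

With the observer disabled and fake-quant disabled, the real module acts as an identity, which is why `disable_fakequant()` is documented as turning the module into a pass-through.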