pymllm.configs.quantization_config
==================================

.. py:module:: pymllm.configs.quantization_config

.. autoapi-nested-parse::

   Quantization settings for model weights and KV cache.

Classes
-------

.. autoapisummary::

   pymllm.configs.quantization_config.QuantizationConfig

Module Contents
---------------

.. py:class:: QuantizationConfig

   Quantization configuration for weights and KV cache.

   .. py:attribute:: method
      :type: Optional[str]
      :value: None

   .. py:attribute:: kv_cache_dtype
      :type: Literal['auto', 'float16', 'bfloat16', 'fp8_e4m3', 'fp8_e5m2']
      :value: 'auto'
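The shape of ``QuantizationConfig`` documented above can be sketched as a plain dataclass. This is an illustrative standalone sketch, not the actual ``pymllm`` implementation: the dataclass decorator, the ``__post_init__`` validation, and the example ``method`` strings in the comments are assumptions; only the two attribute names, their types, and their defaults come from the stub above.

```python
# Illustrative sketch of the QuantizationConfig shape documented above.
# The real class lives in pymllm.configs.quantization_config; this
# standalone version only mirrors the documented fields and defaults.
from dataclasses import dataclass
from typing import Literal, Optional

# The allowed KV-cache dtypes, as listed in the attribute's Literal type.
KVCacheDtype = Literal['auto', 'float16', 'bfloat16', 'fp8_e4m3', 'fp8_e5m2']


@dataclass
class QuantizationConfig:
    # Weight quantization method name; None (the default) means no
    # weight quantization. Which method strings are accepted is up to
    # the library (e.g. "awq" or "gptq" are common names elsewhere,
    # but that is an assumption here).
    method: Optional[str] = None

    # Data type for the KV cache; 'auto' (the default) presumably
    # defers to the model's own dtype.
    kv_cache_dtype: KVCacheDtype = 'auto'

    def __post_init__(self) -> None:
        # Literal annotations are not enforced at runtime, so a
        # hypothetical sketch like this would validate explicitly.
        allowed = ('auto', 'float16', 'bfloat16', 'fp8_e4m3', 'fp8_e5m2')
        if self.kv_cache_dtype not in allowed:
            raise ValueError(
                f"kv_cache_dtype must be one of {allowed}, "
                f"got {self.kv_cache_dtype!r}"
            )


# Defaults: no weight quantization, KV-cache dtype follows the model.
default_cfg = QuantizationConfig()

# An FP8 KV cache with otherwise-default settings (assumed usage;
# consult the pymllm documentation for the exact semantics).
fp8_cfg = QuantizationConfig(kv_cache_dtype='fp8_e4m3')
```

The FP8 options (``fp8_e4m3`` vs ``fp8_e5m2``) trade mantissa precision for exponent range, which is why both commonly appear as KV-cache dtype choices.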