pymllm.configs.quantization_config
==================================

.. py:module:: pymllm.configs.quantization_config

.. autoapi-nested-parse::

   Quantization settings for model weights and KV cache.

Classes
-------

.. autoapisummary::

   pymllm.configs.quantization_config.QuantizationConfig

Module Contents
---------------

.. py:class:: QuantizationConfig

   Quantization configuration for weights and KV cache.

   .. py:attribute:: method
      :type: Optional[str]
      :value: None

   .. py:attribute:: kv_cache_dtype
      :type: Literal['auto', 'float16', 'bfloat16', 'fp8_e4m3', 'fp8_e5m2']
      :value: 'auto'
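The shape of ``QuantizationConfig`` documented above can be sketched as a plain dataclass. This is an illustrative standalone sketch, not the actual ``pymllm`` implementation: the dataclass decorator, the ``__post_init__`` validation, and the example ``method`` strings in the comments are assumptions; only the two attribute names, their types, and their defaults come from the stub above.

```python
# Illustrative sketch of the QuantizationConfig shape documented above.
# The real class lives in pymllm.configs.quantization_config; this
# standalone version only mirrors the documented fields and defaults.
from dataclasses import dataclass
from typing import Literal, Optional

# The allowed KV-cache dtypes, as listed in the attribute's Literal type.
KVCacheDtype = Literal['auto', 'float16', 'bfloat16', 'fp8_e4m3', 'fp8_e5m2']


@dataclass
class QuantizationConfig:
    # Weight quantization method name; None (the default) means no
    # weight quantization. Which method strings are accepted is up to
    # the library (e.g. "awq" or "gptq" are common names elsewhere,
    # but that is an assumption here).
    method: Optional[str] = None

    # Data type for the KV cache; 'auto' (the default) presumably
    # defers to the model's own dtype.
    kv_cache_dtype: KVCacheDtype = 'auto'

    def __post_init__(self) -> None:
        # Literal annotations are not enforced at runtime, so a
        # hypothetical sketch like this would validate explicitly.
        allowed = ('auto', 'float16', 'bfloat16', 'fp8_e4m3', 'fp8_e5m2')
        if self.kv_cache_dtype not in allowed:
            raise ValueError(
                f"kv_cache_dtype must be one of {allowed}, "
                f"got {self.kv_cache_dtype!r}"
            )


# Defaults: no weight quantization, KV-cache dtype follows the model.
default_cfg = QuantizationConfig()

# An FP8 KV cache with otherwise-default settings (assumed usage;
# consult the pymllm documentation for the exact semantics).
fp8_cfg = QuantizationConfig(kv_cache_dtype='fp8_e4m3')
```

The FP8 options (``fp8_e4m3`` vs ``fp8_e5m2``) trade mantissa precision for exponent range, which is why both commonly appear as KV-cache dtype choices.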