test_compressed_tensors_runtime
===============================

.. py:module:: test_compressed_tensors_runtime


Functions
---------

.. autoapisummary::

   test_compressed_tensors_runtime.test_create_weights_registers_checkpoint_parameter_names
   test_compressed_tensors_runtime.test_process_and_apply_use_gptq_repack_and_uint4b8
   test_compressed_tensors_runtime.test_w8a8_create_weights_registers_weight_and_scale
   test_compressed_tensors_runtime.test_w8a8_process_weights_transposes_and_flattens_scales
   test_compressed_tensors_runtime.test_w8a8_apply_matches_reference_for_large_m
   test_compressed_tensors_runtime.test_w8a8_apply_supports_small_m_by_padding
   test_compressed_tensors_runtime.test_w8a8_apply_uses_triton_quant_and_torch_int_mm


Module Contents
---------------

.. py:function:: test_create_weights_registers_checkpoint_parameter_names()

.. py:function:: test_process_and_apply_use_gptq_repack_and_uint4b8(monkeypatch)

.. py:function:: test_w8a8_create_weights_registers_weight_and_scale()

.. py:function:: test_w8a8_process_weights_transposes_and_flattens_scales()

.. py:function:: test_w8a8_apply_matches_reference_for_large_m()

.. py:function:: test_w8a8_apply_supports_small_m_by_padding()

.. py:function:: test_w8a8_apply_uses_triton_quant_and_torch_int_mm(monkeypatch)

   Verify the W8A8 forward path uses Triton activation quant + torch._int_mm.
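The W8A8 tests above check a forward path that quantizes activations to int8, multiplies them against pre-quantized int8 weights with integer accumulation, and rescales the result to float. The sketch below is a minimal pure-Python reference for that pattern, not the module's actual API: the names ``quantize_per_tensor``, ``int8_matmul_reference``, and ``w8a8_apply`` are illustrative stand-ins for the Triton quant kernel and ``torch._int_mm``, and symmetric per-tensor scaling is assumed.

```python
# Hedged sketch of the W8A8 reference path these tests exercise.
# All function names here are hypothetical; the real implementation uses a
# Triton activation-quant kernel and torch._int_mm.

def quantize_per_tensor(x, num_bits=8):
    """Symmetric per-tensor quantization: returns (int matrix, scale)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    amax = max(abs(v) for row in x for v in row) or 1.0
    scale = amax / qmax
    q = [[max(-qmax - 1, min(qmax, round(v / scale))) for v in row]
         for row in x]
    return q, scale

def int8_matmul_reference(a, b):
    """Integer matmul with wide (int32-style) accumulation."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def w8a8_apply(x, w_q, w_scale):
    """Quantize activations, run the int matmul, rescale back to float."""
    x_q, x_scale = quantize_per_tensor(x)
    acc = int8_matmul_reference(x_q, w_q)
    return [[v * x_scale * w_scale for v in row] for row in acc]
```

A "matches reference" test in the style of ``test_w8a8_apply_matches_reference_for_large_m`` would compare this path against a plain float matmul within a quantization-error tolerance; the small-``m`` padding test exists because the real integer-matmul kernel imposes minimum shape constraints that the reference path does not.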