test_compressed_tensors_runtime
===============================

.. py:module:: test_compressed_tensors_runtime


Functions
---------

.. autoapisummary::

   test_compressed_tensors_runtime.test_create_weights_registers_checkpoint_parameter_names
   test_compressed_tensors_runtime.test_process_and_apply_use_gptq_repack_and_uint4b8
   test_compressed_tensors_runtime.test_w8a8_create_weights_registers_weight_and_scale
   test_compressed_tensors_runtime.test_w8a8_process_weights_transposes_and_flattens_scales
   test_compressed_tensors_runtime.test_w8a8_apply_matches_reference_for_large_m
   test_compressed_tensors_runtime.test_w8a8_apply_supports_small_m_by_padding
   test_compressed_tensors_runtime.test_w8a8_apply_uses_triton_quant_and_torch_int_mm


Module Contents
---------------

.. py:function:: test_create_weights_registers_checkpoint_parameter_names()

.. py:function:: test_process_and_apply_use_gptq_repack_and_uint4b8(monkeypatch)

.. py:function:: test_w8a8_create_weights_registers_weight_and_scale()

.. py:function:: test_w8a8_process_weights_transposes_and_flattens_scales()

.. py:function:: test_w8a8_apply_matches_reference_for_large_m()

.. py:function:: test_w8a8_apply_supports_small_m_by_padding()

.. py:function:: test_w8a8_apply_uses_triton_quant_and_torch_int_mm(monkeypatch)

   Verify the W8A8 forward path uses Triton activation quant + torch._int_mm.
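The W8A8 tests above check a forward path that quantizes activations to int8, multiplies them against pre-quantized int8 weights with integer accumulation, and rescales the result to float. The sketch below is a minimal pure-Python reference for that pattern, not the module's actual API: the names ``quantize_per_tensor``, ``int8_matmul_reference``, and ``w8a8_apply`` are illustrative stand-ins for the Triton quant kernel and ``torch._int_mm``, and symmetric per-tensor scaling is assumed.

```python
# Hedged sketch of the W8A8 reference path these tests exercise.
# All function names here are hypothetical; the real implementation uses a
# Triton activation-quant kernel and torch._int_mm.

def quantize_per_tensor(x, num_bits=8):
    """Symmetric per-tensor quantization: returns (int matrix, scale)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    amax = max(abs(v) for row in x for v in row) or 1.0
    scale = amax / qmax
    q = [[max(-qmax - 1, min(qmax, round(v / scale))) for v in row]
         for row in x]
    return q, scale

def int8_matmul_reference(a, b):
    """Integer matmul with wide (int32-style) accumulation."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def w8a8_apply(x, w_q, w_scale):
    """Quantize activations, run the int matmul, rescale back to float."""
    x_q, x_scale = quantize_per_tensor(x)
    acc = int8_matmul_reference(x_q, w_q)
    return [[v * x_scale * w_scale for v in row] for row in acc]
```

A "matches reference" test in the style of ``test_w8a8_apply_matches_reference_for_large_m`` would compare this path against a plain float matmul within a quantization-error tolerance; the small-``m`` padding test exists because the real integer-matmul kernel imposes minimum shape constraints that the reference path does not.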