test_compressed_tensors_runtime

Functions

Module Contents

test_compressed_tensors_runtime.test_create_weights_registers_checkpoint_parameter_names()
test_compressed_tensors_runtime.test_process_and_apply_use_gptq_repack_and_uint4b8(monkeypatch)
Parameters:

monkeypatch (pytest.MonkeyPatch)

test_compressed_tensors_runtime.test_w8a8_create_weights_registers_weight_and_scale()
test_compressed_tensors_runtime.test_w8a8_process_weights_transposes_and_flattens_scales()
test_compressed_tensors_runtime.test_w8a8_apply_matches_reference_for_large_m()
test_compressed_tensors_runtime.test_w8a8_apply_supports_small_m_by_padding()
test_compressed_tensors_runtime.test_w8a8_apply_uses_triton_quant_and_torch_int_mm(monkeypatch)

Verify that the W8A8 forward path quantizes activations with the Triton kernel and then performs the integer matmul with torch._int_mm.

Parameters:

monkeypatch (pytest.MonkeyPatch)