test_compressed_tensors_runtime
Functions
test_w8a8_apply_uses_triton_quant_and_torch_int_mm(monkeypatch) | Verify the W8A8 forward path uses Triton activation quant + torch._int_mm.
Module Contents
- test_compressed_tensors_runtime.test_create_weights_registers_checkpoint_parameter_names()
- test_compressed_tensors_runtime.test_process_and_apply_use_gptq_repack_and_uint4b8(monkeypatch)
- Parameters:
monkeypatch (pytest.MonkeyPatch)
- test_compressed_tensors_runtime.test_w8a8_create_weights_registers_weight_and_scale()
- test_compressed_tensors_runtime.test_w8a8_process_weights_transposes_and_flattens_scales()
- test_compressed_tensors_runtime.test_w8a8_apply_matches_reference_for_large_m()
- test_compressed_tensors_runtime.test_w8a8_apply_supports_small_m_by_padding()
- test_compressed_tensors_runtime.test_w8a8_apply_uses_triton_quant_and_torch_int_mm(monkeypatch)
Verify the W8A8 forward path uses Triton activation quant + torch._int_mm.
- Parameters:
monkeypatch (pytest.MonkeyPatch)
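
The call-path check described for the last test can be illustrated with a small spy pattern. This is only a sketch under stated assumptions, not the test's actual body: `kernels`, `quantize_activations`, `int_mm`, `apply_w8a8`, and `check_call_path` are hypothetical pure-Python stand-ins for the Triton activation quantizer, `torch._int_mm`, and the W8A8 apply method, and plain attribute swapping stands in for what `monkeypatch.setattr` would do in a real pytest test.

```python
from types import SimpleNamespace


def quantize_activations(x):
    """Per-row symmetric int8 quantization (hypothetical stand-in for a Triton kernel)."""
    q_rows, scales = [], []
    for row in x:
        amax = max(abs(v) for v in row) or 1.0  # avoid a zero scale for all-zero rows
        scale = amax / 127.0
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def int_mm(a, b):
    """Integer matmul of an M x K by a K x N matrix (hypothetical stand-in for torch._int_mm)."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical namespace that the forward path looks its kernels up in.
kernels = SimpleNamespace(quantize_activations=quantize_activations, int_mm=int_mm)


def apply_w8a8(x, w_q, w_scales):
    """Quantize activations, multiply in integers, then rescale per row and per column."""
    x_q, x_scales = kernels.quantize_activations(x)
    acc = kernels.int_mm(x_q, w_q)
    return [
        [acc[i][j] * x_scales[i] * w_scales[j] for j in range(len(w_scales))]
        for i in range(len(acc))
    ]


def check_call_path():
    """Swap in recording wrappers, run the forward path, and restore the originals."""
    calls = []
    orig_quant, orig_mm = kernels.quantize_activations, kernels.int_mm

    def spy_quant(x):
        calls.append("quant")
        return orig_quant(x)

    def spy_mm(a, b):
        calls.append("int_mm")
        return orig_mm(a, b)

    kernels.quantize_activations = spy_quant
    kernels.int_mm = spy_mm
    try:
        x = [[1.0, -2.0], [0.5, 0.25]]
        w_q = [[127, 0], [0, 127]]            # int8 identity scaled by 127
        w_scales = [1.0 / 127.0, 1.0 / 127.0]  # per-output-channel weight scales
        out = apply_w8a8(x, w_q, w_scales)
    finally:
        # Restore the originals, as pytest's monkeypatch would do at teardown.
        kernels.quantize_activations = orig_quant
        kernels.int_mm = orig_mm
    return calls, out
```

A check of this kind would assert that `calls` equals `["quant", "int_mm"]`, so that a silent fallback to a different kernel fails the test even when the numerical output is still close to the reference.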