test_model_runner_memory_pool

Module Contents

test_model_runner_memory_pool.test_profile_max_num_tokens_treats_mem_fraction_static_as_static_pool_fraction(_cuda_available, monkeypatch)
test_model_runner_memory_pool.test_profile_max_num_tokens_clamps_negative_capacity_and_records_diagnostics(_cuda_available, monkeypatch)
test_model_runner_memory_pool.test_kv_cache_memory_error_includes_profile_details_and_server_hint()
test_model_runner_memory_pool.test_profile_max_num_tokens_caps_user_max_total_tokens_to_profiled_capacity(_cuda_available, monkeypatch)
test_model_runner_memory_pool.test_profile_max_num_tokens_uses_user_limit_when_below_profiled_capacity(_cuda_available, monkeypatch)
test_model_runner_memory_pool.test_available_gpu_memory_uses_system_memory_for_integrated_gpu(monkeypatch)
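
Judging only by the test names above, several of these tests appear to check how a profiled token capacity is combined with a user-supplied limit: negative capacity is clamped, a user limit above the profiled capacity is capped, and a user limit below it is used as-is. The sketch below illustrates that rule under those assumptions. The function `resolve_max_total_tokens` and its parameters are hypothetical names chosen for illustration; they are not taken from the module under test.

```python
from typing import Optional


def resolve_max_total_tokens(
    profiled_capacity: int,
    user_max_total_tokens: Optional[int] = None,
) -> int:
    """Hypothetical helper: combine a profiled capacity with a user limit.

    This is an illustrative assumption inferred from the test names,
    not the actual implementation exercised by the tests.
    """
    # Clamp a negative profiled capacity to zero.
    capacity = max(0, profiled_capacity)

    # With no user limit, fall back to the (clamped) profiled capacity.
    if user_max_total_tokens is None:
        return capacity

    # Otherwise take the smaller of the two: the user limit is capped at
    # the profiled capacity, and used unchanged when it is already lower.
    return min(user_max_total_tokens, capacity)
```

For example, under these assumptions `resolve_max_total_tokens(1000, 2000)` is capped to the profiled capacity, while `resolve_max_total_tokens(1000, 500)` keeps the user limit.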