test_model_runner_memory_pool
=============================

.. py:module:: test_model_runner_memory_pool

Functions
---------

.. autoapisummary::

   test_model_runner_memory_pool.test_profile_max_num_tokens_treats_mem_fraction_static_as_static_pool_fraction
   test_model_runner_memory_pool.test_profile_max_num_tokens_clamps_negative_capacity_and_records_diagnostics
   test_model_runner_memory_pool.test_kv_cache_memory_error_includes_profile_details_and_server_hint
   test_model_runner_memory_pool.test_profile_max_num_tokens_caps_user_max_total_tokens_to_profiled_capacity
   test_model_runner_memory_pool.test_profile_max_num_tokens_uses_user_limit_when_below_profiled_capacity
   test_model_runner_memory_pool.test_available_gpu_memory_uses_system_memory_for_integrated_gpu

Module Contents
---------------

.. py:function:: test_profile_max_num_tokens_treats_mem_fraction_static_as_static_pool_fraction(_cuda_available, monkeypatch)

.. py:function:: test_profile_max_num_tokens_clamps_negative_capacity_and_records_diagnostics(_cuda_available, monkeypatch)

.. py:function:: test_kv_cache_memory_error_includes_profile_details_and_server_hint()

.. py:function:: test_profile_max_num_tokens_caps_user_max_total_tokens_to_profiled_capacity(_cuda_available, monkeypatch)

.. py:function:: test_profile_max_num_tokens_uses_user_limit_when_below_profiled_capacity(_cuda_available, monkeypatch)

.. py:function:: test_available_gpu_memory_uses_system_memory_for_integrated_gpu(monkeypatch)