pymllm.orchestrator.parallel_state
==================================

.. py:module:: pymllm.orchestrator.parallel_state

.. autoapi-nested-parse::

   Minimal parallel state for single-GPU serving.

   pymllm targets single-GPU, high-concurrency inference. This module keeps
   the TP / DP / PP scaffolding so the rest of the codebase can query ranks
   and groups uniformly, but the default (and expected) case is
   ``world_size == 1``.

Attributes
----------

.. autoapisummary::

   pymllm.orchestrator.parallel_state.logger

Functions
---------

.. autoapisummary::

   pymllm.orchestrator.parallel_state.initialize_model_parallel
   pymllm.orchestrator.parallel_state.get_tp_group
   pymllm.orchestrator.parallel_state.get_dp_group
   pymllm.orchestrator.parallel_state.get_pp_group
   pymllm.orchestrator.parallel_state.get_tensor_model_parallel_rank
   pymllm.orchestrator.parallel_state.get_tensor_model_parallel_world_size
   pymllm.orchestrator.parallel_state.get_data_parallel_rank
   pymllm.orchestrator.parallel_state.get_data_parallel_world_size
   pymllm.orchestrator.parallel_state.get_pipeline_model_parallel_rank
   pymllm.orchestrator.parallel_state.get_pipeline_model_parallel_world_size
   pymllm.orchestrator.parallel_state.model_parallel_is_initialized
   pymllm.orchestrator.parallel_state.tensor_model_parallel_all_reduce
   pymllm.orchestrator.parallel_state.tensor_model_parallel_all_gather
   pymllm.orchestrator.parallel_state.data_parallel_all_reduce

Module Contents
---------------

.. py:data:: logger

.. py:function:: initialize_model_parallel(tensor_model_parallel_size = 1, data_parallel_size = 1, pipeline_model_parallel_size = 1, backend = 'nccl')

.. py:function:: get_tp_group()

.. py:function:: get_dp_group()

.. py:function:: get_pp_group()

.. py:function:: get_tensor_model_parallel_rank()

.. py:function:: get_tensor_model_parallel_world_size()

.. py:function:: get_data_parallel_rank()

.. py:function:: get_data_parallel_world_size()

.. py:function:: get_pipeline_model_parallel_rank()

.. py:function:: get_pipeline_model_parallel_world_size()

.. py:function:: model_parallel_is_initialized()

.. py:function:: tensor_model_parallel_all_reduce(tensor)

.. py:function:: tensor_model_parallel_all_gather(tensor, dim = 0)

.. py:function:: data_parallel_all_reduce(tensor)
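The ``world_size == 1`` fast path described above can be sketched as follows. This is a minimal, hypothetical mock of the documented API, not the actual implementation: the real module presumably wraps ``torch.distributed`` process groups (hence the ``backend='nccl'`` default), which is omitted here so the sketch stays self-contained. Only a subset of the documented functions is shown.

```python
from dataclasses import dataclass

@dataclass
class _Group:
    """Stand-in for a process group: a single local rank (assumption)."""
    rank: int = 0
    world_size: int = 1

_TP = _DP = _PP = None  # module-level state, mirroring the documented getters

def initialize_model_parallel(tensor_model_parallel_size=1,
                              data_parallel_size=1,
                              pipeline_model_parallel_size=1,
                              backend="nccl"):
    """Single-GPU serving: every parallel dimension collapses to size 1."""
    global _TP, _DP, _PP
    assert (tensor_model_parallel_size == data_parallel_size
            == pipeline_model_parallel_size == 1), \
        "this sketch supports world_size == 1 only"
    _TP, _DP, _PP = _Group(), _Group(), _Group()

def model_parallel_is_initialized():
    return _TP is not None

def get_tensor_model_parallel_rank():
    return _TP.rank

def get_tensor_model_parallel_world_size():
    return _TP.world_size

def tensor_model_parallel_all_reduce(tensor):
    # With a single rank, an all-reduce is the identity.
    return tensor

def tensor_model_parallel_all_gather(tensor, dim=0):
    # With a single rank, an all-gather returns the local shard unchanged.
    return tensor
```

The point of keeping this scaffolding is that model code can call ``get_tensor_model_parallel_world_size()`` and the collective wrappers unconditionally; on a single GPU they degenerate to cheap no-ops, and no branching on "distributed vs. not" leaks into the layers.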