pymllm.layers.embedding

Classes

VocabParallelEmbedding

Embedding layer with vocabulary parallelism.

Module Contents

class pymllm.layers.embedding.VocabParallelEmbedding(num_embeddings, embedding_dim, padding_idx=None)

Bases: pymllm.layers.base.MllmBaseLayer

Embedding layer with vocabulary parallelism.

This layer shards the embedding table along the vocabulary dimension for tensor parallelism.

Parameters:
  • num_embeddings (int) – Size of the vocabulary.

  • embedding_dim (int) – Size of the embedding vector.

  • padding_idx (int, optional) – Index of the padding token, if any. Defaults to None.

tp_rank = 0
  Tensor-parallel rank of the current process.
tp_size = 1
  Number of processes in the tensor-parallel group.
num_embeddings
  Total vocabulary size.
embedding_dim
  Size of each embedding vector.
padding_idx = None
  Index of the padding token, if any.
num_embeddings_per_partition
  Number of vocabulary entries held in this rank's shard.
weight
  This rank's shard of the embedding table.
vocab_start_index
  First vocabulary index (inclusive) covered by this rank.
vocab_end_index
  Last vocabulary index (exclusive) covered by this rank.
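The partition attributes above follow directly from the rank and group size. A minimal sketch of how they relate (assuming the vocabulary divides evenly across ranks; the real layer may pad the vocabulary instead, which this doc does not specify):

```python
def partition_bounds(num_embeddings: int, tp_rank: int, tp_size: int):
    """Compute this rank's slice of the vocabulary.

    Simplified sketch: assumes num_embeddings % tp_size == 0.
    """
    num_embeddings_per_partition = num_embeddings // tp_size
    vocab_start_index = tp_rank * num_embeddings_per_partition
    vocab_end_index = vocab_start_index + num_embeddings_per_partition
    return vocab_start_index, vocab_end_index

# Rank 1 of 4 over a (padded) vocabulary of 50304 entries.
bounds = partition_bounds(50304, tp_rank=1, tp_size=4)
print(bounds)  # (12576, 25152)
```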
weight_loader(param, loaded_weight)

Slice this rank's vocabulary shard out of the full checkpoint tensor and copy it into the parameter.

Parameters:
  • param (torch.nn.Parameter) – The parameter to load weights into.

  • loaded_weight (torch.Tensor) – The weight tensor loaded from checkpoint (full size).
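The key point is that `loaded_weight` is the full-vocabulary table while `param` only holds this rank's rows. A plain-Python sketch of the copy (lists stand in for torch tensors; the real method would presumably slice `loaded_weight` and use something like `param.data.copy_(...)` — an assumption, since the doc does not show the body):

```python
def weight_loader(param_rows, loaded_rows, tp_rank, tp_size):
    """Copy this rank's vocabulary shard out of the full weight.

    Sketch with plain lists; `param_rows` is mutated in place, mirroring
    an in-place copy into param.data.
    """
    shard = len(loaded_rows) // tp_size          # rows per partition
    start = tp_rank * shard                      # vocab_start_index
    param_rows[:] = loaded_rows[start:start + shard]

full = [[float(i)] * 2 for i in range(8)]  # full 8-row checkpoint table
mine = [None] * 4                          # this rank's 4-row parameter
weight_loader(mine, full, tp_rank=1, tp_size=2)
print(mine[0])  # [4.0, 4.0]
```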

forward(x)

Forward pass of the embedding layer with tensor-parallel support.

Parameters:
  • x (torch.Tensor) – Input tensor of token ids.

Returns:
  Embedded representation (all-reduced across the TP group when tp_size > 1).

Return type:
  torch.Tensor
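Why the all-reduce recovers the full lookup: each rank embeds only the token ids that fall inside its [vocab_start_index, vocab_end_index) range and emits zeros for the rest, so summing the partial outputs across ranks yields the same result as a lookup in the unsharded table. A plain-Python sketch of this scheme (the real layer would use torch indexing and a distributed all-reduce; the helper name below is hypothetical):

```python
def shard_forward(token_ids, shard, start, end):
    """One rank's lookup: tokens outside [start, end) yield zero rows,
    so an element-wise sum over ranks reconstructs the full result."""
    dim = len(shard[0])
    out = []
    for t in token_ids:
        if start <= t < end:
            out.append(list(shard[t - start]))  # offset into local shard
        else:
            out.append([0.0] * dim)             # masked: not in this shard
    return out

# Two ranks, vocabulary of 4, embedding_dim of 2.
table = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
rank0 = shard_forward([0, 3], table[0:2], 0, 2)
rank1 = shard_forward([0, 3], table[2:4], 2, 4)
# "All-reduce": element-wise sum of the partial results.
full_out = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(rank0, rank1)]
print(full_out)  # [[1.0, 1.0], [4.0, 4.0]]
```

The summed result matches a direct lookup of tokens 0 and 3 in the unsharded table, which is the invariant the all-reduce in `forward` relies on.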