Supported MLLM aops Operations

The following table lists the operations currently supported in MLLM aops:

Operation Name	Header File	Description
FlashAttention2Op	aops/FlashAttention2Op.hpp	Flash Attention 2 operation
LayerNormOp	aops/LayerNormOp.hpp	Layer Normalization operation
MatMulOp	aops/MatMulOp.hpp	Matrix Multiplication operation
RMSNormOp	aops/RMSNormOp.hpp	Root Mean Square Normalization operation
GraphBeginOp	aops/GraphOps.hpp	Marks the beginning of a computation graph
GraphEndOp	aops/GraphOps.hpp	Marks the end of a computation graph
SplitOp	aops/SplitOp.hpp	Splits a tensor along a given dimension
TransposeOp	aops/TransposeOp.hpp	Transposes a tensor by swapping two dimensions
CausalMaskOp	aops/CausalMaskOp.hpp	Creates a causal mask for attention operations
SliceOp	aops/SliceOp.hpp	Slices a tensor along specified dimensions
SiLUOp	aops/SiLUOp.hpp	Sigmoid Linear Unit activation function
ViewOp	aops/ViewOp.hpp	Reshapes a tensor without copying data
RepeatOp	aops/RepeatOp.hpp	Repeats a tensor along a given dimension
SoftmaxOp	aops/SoftmaxOp.hpp	Softmax activation function
CopyOp	aops/CopyOp.hpp	Copies tensor data
ParamOp	aops/ParamOp.hpp	Parameter operation for model weights
CloneOp	aops/CloneOp.hpp	Clones a tensor
VisionRoPEOp	aops/VisionRoPEOp.hpp	Rotary Position Embedding (RoPE) operation for vision inputs
ReLUOp	aops/ReLUOp.hpp	Rectified Linear Unit activation function
ContiguousOp	aops/ContiguousOp.hpp	Makes tensor data contiguous in memory
GELUOp	aops/GELUOp.hpp	Gaussian Error Linear Unit activation function
ReshapeOp	aops/ReshapeOp.hpp	Reshapes a tensor
X2XOp	aops/X2XOp.hpp	Transfers tensor data between devices
PermuteOp	aops/PermuteOp.hpp	Permutes tensor dimensions
CastTypeOp	aops/CastTypeOp.hpp	Casts tensor data to a different type
ConcatOp	aops/ConcatOp.hpp	Concatenates tensors along a given dimension
Conv3DOp	aops/Conv3DOp.hpp	3D Convolution operation
ElewiseOps	aops/ElewiseOps.hpp	Element-wise operations (Add, Sub, Mul, Div, etc.)
EmbeddingOp	aops/EmbeddingOp.hpp	Embedding lookup operation
FillOp	aops/FillOp.hpp	Fills a tensor with a specific value
LinearOp	aops/LinearOp.hpp	Linear (fully connected) operation
MultimodalRoPEOp	aops/MultimodalRoPEOp.hpp	Multimodal Rotary Position Embedding (RoPE) operation
QuickGELUOp	aops/QuickGELUOp.hpp	Quick GELU activation function
ReduceOps	aops/ReduceOps.hpp	Reduction operations (Sum, Mean, Max, Min, etc.)
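
Several of the normalization and activation entries above are named after standard formulas. The sketch below writes those formulas out as plain C++ functions for reference. It does not use the MLLM aops classes or their API: the function names (`silu`, `quick_gelu`, `gelu`, `rms_norm`, `layer_norm`), the single-vector signatures, and the `eps` defaults are all assumptions made for illustration, and the kernels behind SiLUOp, QuickGELUOp, GELUOp, RMSNormOp, and LayerNormOp may differ in details such as the GELU variant or the epsilon value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone reference formulas. These are NOT the MLLM aops classes;
// all names and default values here are hypothetical.

// SiLU: x * sigmoid(x)
inline float silu(float x) { return x / (1.0f + std::exp(-x)); }

// QuickGELU: x * sigmoid(1.702 * x), a cheap approximation of GELU
inline float quick_gelu(float x) { return x / (1.0f + std::exp(-1.702f * x)); }

// GELU, exact erf form: 0.5 * x * (1 + erf(x / sqrt(2)))
inline float gelu(float x) {
    return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f)));
}

// RMSNorm over one vector: y[i] = x[i] / sqrt(mean(x^2) + eps) * weight[i]
std::vector<float> rms_norm(const std::vector<float>& x,
                            const std::vector<float>& weight,
                            float eps = 1e-6f) {
    assert(!x.empty() && x.size() == weight.size());
    float sum_sq = 0.0f;
    for (float v : x) sum_sq += v * v;
    const float inv_rms =
        1.0f / std::sqrt(sum_sq / static_cast<float>(x.size()) + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] * inv_rms * weight[i];
    return y;
}

// LayerNorm over one vector:
// y[i] = (x[i] - mean) / sqrt(var + eps) * gamma[i] + beta[i]
std::vector<float> layer_norm(const std::vector<float>& x,
                              const std::vector<float>& gamma,
                              const std::vector<float>& beta,
                              float eps = 1e-5f) {
    assert(!x.empty() && x.size() == gamma.size() && x.size() == beta.size());
    const float n = static_cast<float>(x.size());
    float mean = 0.0f;
    for (float v : x) mean += v;
    mean /= n;
    float var = 0.0f;
    for (float v : x) var += (v - mean) * (v - mean);
    var /= n;  // biased (population) variance
    const float inv_std = 1.0f / std::sqrt(var + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = (x[i] - mean) * inv_std * gamma[i] + beta[i];
    }
    return y;
}
```

The two normalizations differ in one step: `layer_norm` subtracts the mean before scaling and adds a bias afterwards, while `rms_norm` only rescales by the root mean square. Likewise, `quick_gelu` replaces the `erf` call in `gelu` with a single sigmoid, which is why it appears as a separate operation in the table.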