Supported MLLM aops Operations

The following table lists the operations currently supported in MLLM aops:

=================  ==========================  ==================================================
Operation Name     Header File                 Description
=================  ==========================  ==================================================
FlashAttention2Op  aops/FlashAttention2Op.hpp  Implements Flash Attention 2 operation
LayerNormOp        aops/LayerNormOp.hpp        Layer Normalization operation
MatMulOp           aops/MatMulOp.hpp           Matrix Multiplication operation
RMSNormOp          aops/RMSNormOp.hpp          Root Mean Square Normalization operation
GraphBeginOp       aops/GraphOps.hpp           Marks the beginning of a computation graph
GraphEndOp         aops/GraphOps.hpp           Marks the end of a computation graph
SplitOp            aops/SplitOp.hpp            Splits a tensor along a given dimension
TransposeOp        aops/TransposeOp.hpp        Transposes a tensor by swapping two dimensions
CausalMaskOp       aops/CausalMaskOp.hpp       Creates a causal mask for attention operations
SliceOp            aops/SliceOp.hpp            Slices a tensor along specified dimensions
SiLUOp             aops/SiLUOp.hpp             Sigmoid Linear Unit activation function
ViewOp             aops/ViewOp.hpp             Reshapes a tensor without copying data
RepeatOp           aops/RepeatOp.hpp           Repeats a tensor along a given dimension
SoftmaxOp          aops/SoftmaxOp.hpp          Softmax activation function
CopyOp             aops/CopyOp.hpp             Copies tensor data
ParamOp            aops/ParamOp.hpp            Parameter operation for model weights
CloneOp            aops/CloneOp.hpp            Clones a tensor
VisionRoPEOp       aops/VisionRoPEOp.hpp       Vision Rotary Positional Encoding operation
ReLUOp             aops/ReLUOp.hpp             Rectified Linear Unit activation function
ContiguousOp       aops/ContiguousOp.hpp       Makes tensor data contiguous in memory
GELUOp             aops/GELUOp.hpp             Gaussian Error Linear Unit activation function
ReshapeOp          aops/ReshapeOp.hpp          Reshapes a tensor
X2XOp              aops/X2XOp.hpp              Transfers tensor data between devices
PermuteOp          aops/PermuteOp.hpp          Permutes tensor dimensions
CastTypeOp         aops/CastTypeOp.hpp         Casts tensor data to a different type
ConcatOp           aops/ConcatOp.hpp           Concatenates tensors along a given dimension
Conv3DOp           aops/Conv3DOp.hpp           3D Convolution operation
ElewiseOps         aops/ElewiseOps.hpp         Element-wise operations (Add, Sub, Mul, Div, etc.)
EmbeddingOp        aops/EmbeddingOp.hpp        Embedding lookup operation
FillOp             aops/FillOp.hpp             Fills a tensor with a specific value
LinearOp           aops/LinearOp.hpp           Linear (fully connected) operation
MultimodalRoPEOp   aops/MultimodalRoPEOp.hpp   Multimodal Rotary Positional Encoding operation
QuickGELUOp        aops/QuickGELUOp.hpp        Quick GELU activation function
ReduceOps          aops/ReduceOps.hpp          Reduction operations (Sum, Mean, Max, Min, etc.)
=================  ==========================  ==================================================
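
The descriptions above name the math but do not spell it out. As a rough reference, the sketch below gives plain C++ versions of the usual textbook definitions behind RMSNormOp, SiLUOp, QuickGELUOp, SoftmaxOp, and CausalMaskOp. Everything in it is an assumption made for illustration: the ``ref`` namespace, the function names, and the signatures are hypothetical and are not the MLLM aops API, and the actual operators may differ in details such as epsilon placement, the QuickGELU constant, or tensor layout.

```cpp
// Reference semantics only: textbook definitions for a few table entries.
// The namespace, names, and signatures are hypothetical, not MLLM's API.
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

namespace ref {

// RMSNorm over one vector: y[i] = x[i] / sqrt(mean(x^2) + eps) * weight[i].
// Assumption: eps is added inside the square root, as in common implementations.
std::vector<float> rms_norm(const std::vector<float>& x,
                            const std::vector<float>& weight,
                            float eps = 1e-6f) {
  assert(!x.empty() && x.size() == weight.size());
  float sum_sq = 0.0f;
  for (float v : x) sum_sq += v * v;
  const float inv_rms =
      1.0f / std::sqrt(sum_sq / static_cast<float>(x.size()) + eps);
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] * inv_rms * weight[i];
  return y;
}

// SiLU: x * sigmoid(x), written as x / (1 + exp(-x)).
inline float silu(float x) { return x / (1.0f + std::exp(-x)); }

// QuickGELU: x * sigmoid(1.702 * x), the commonly used sigmoid approximation
// of GELU. Assumption: the operator in the table uses the same constant.
inline float quick_gelu(float x) { return x / (1.0f + std::exp(-1.702f * x)); }

// Softmax over one row; the row maximum is subtracted for numerical stability.
std::vector<float> softmax(const std::vector<float>& x) {
  assert(!x.empty());
  const float max_v = *std::max_element(x.begin(), x.end());
  std::vector<float> y(x.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = std::exp(x[i] - max_v);
    sum += y[i];
  }
  for (float& v : y) v /= sum;
  return y;
}

// Causal mask on a row-major [seq_len x seq_len] score matrix: every position
// (i, j) with j > i is set to -infinity so softmax gives it zero weight.
void apply_causal_mask(std::vector<float>& scores, std::size_t seq_len) {
  assert(scores.size() == seq_len * seq_len);
  const float neg_inf = -std::numeric_limits<float>::infinity();
  for (std::size_t i = 0; i < seq_len; ++i)
    for (std::size_t j = i + 1; j < seq_len; ++j)
      scores[i * seq_len + j] = neg_inf;
}

}  // namespace ref
```

In an attention layer these pieces would typically be chained: mask the score matrix with ``apply_causal_mask``, then apply ``softmax`` to each row. Because the diagonal entry of every row stays unmasked, each row keeps at least one finite value and the softmax remains well defined.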