Neural Network Layers API

The layers directory implements the neural network layers that serve as building blocks for models in MLLM.

#include "mllm/nn/Nn.hpp"

Linear Layer

class Linear

A fully connected linear layer that applies a linear transformation to the input data.

Linear::Linear()

Default constructor.

Linear::Linear(int32_t in_channels, int32_t out_channels, bool bias = true, aops::LinearImplTypes impl_type = aops::LinearImplTypes::kDefault)

Constructor with layer parameters.

Parameters:

in_channels – Number of input features
out_channels – Number of output features
bias – Whether to include a bias term (default: true)
impl_type – Implementation type (default: kDefault)

Linear::Linear(const aops::LinearOpOptions &options)

Constructor with options.

Parameters:

options – Linear operation options

Tensor Linear::weight() const

Get the weight tensor of the layer.

Returns: Weight tensor

Tensor Linear::bias() const

Get the bias tensor of the layer.

Returns: Bias tensor
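
To make the transformation concrete, here is a minimal standalone sketch of what a linear layer computes for one input vector. It does not use MLLM. The function name linear_forward and the row-major [out_channels, in_channels] weight layout are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Standalone sketch (not MLLM code): y = W x + b for a single input vector.
// weight is assumed row-major with shape [out_channels, in_channels].
// Pass an empty bias vector to mimic bias = false.
std::vector<float> linear_forward(const std::vector<float>& x,
                                  const std::vector<float>& weight,
                                  const std::vector<float>& bias,
                                  std::size_t in_channels,
                                  std::size_t out_channels) {
  std::vector<float> y(out_channels, 0.0f);
  for (std::size_t o = 0; o < out_channels; ++o) {
    float acc = bias.empty() ? 0.0f : bias[o];
    for (std::size_t i = 0; i < in_channels; ++i) {
      acc += weight[o * in_channels + i] * x[i];
    }
    y[o] = acc;
  }
  return y;
}
```

With in_channels = 2 and out_channels = 3, the weight vector holds six values and the result has three entries, one per output feature.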

RMSNorm Layer

class RMSNorm

Root Mean Square Layer Normalization.

RMSNorm::RMSNorm()

Default constructor with epsilon=1e-5 and add_unit_offset=false.

RMSNorm::RMSNorm(float epsilon, bool add_unit_offset = false)

Constructor with normalization parameters.

Parameters:

epsilon – Small value added to the denominator for numerical stability (no default in this overload; the default constructor uses 1e-5)
add_unit_offset – Whether to add a unit offset (default: false)

RMSNorm::RMSNorm(const aops::RMSNormOpOptions &options)

Constructor with options.

Parameters:

options – RMSNorm operation options

Tensor RMSNorm::weight() const

Get the weight tensor of the layer.

Returns: Weight tensor
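
The following standalone sketch shows the usual RMS normalization formula and where epsilon enters it. It does not use MLLM. The function name rms_norm is hypothetical, and the interpretation of add_unit_offset as scaling by (1 + weight) instead of weight is an assumption based on how some models define RMSNorm, not something this reference states.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone sketch (not MLLM code): y_i = x_i / sqrt(mean(x^2) + epsilon) * scale_i.
// ASSUMPTION: add_unit_offset means scale_i = 1 + weight_i; otherwise scale_i = weight_i.
std::vector<float> rms_norm(const std::vector<float>& x,
                            const std::vector<float>& weight,
                            float epsilon = 1e-5f,
                            bool add_unit_offset = false) {
  float sum_sq = 0.0f;
  for (float v : x) sum_sq += v * v;
  const float inv_rms =
      1.0f / std::sqrt(sum_sq / static_cast<float>(x.size()) + epsilon);

  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const float scale = add_unit_offset ? 1.0f + weight[i] : weight[i];
    y[i] = x[i] * inv_rms * scale;
  }
  return y;
}
```

Under that assumption, a weight of all zeros with add_unit_offset = true gives the same output as a weight of all ones with add_unit_offset = false.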

SiLU Layer

class SiLU

Sigmoid Linear Unit activation function (also known as Swish).

SiLU::SiLU()

Default constructor.

SiLU::SiLU(const aops::SiLUOpOptions &options)

Constructor with options.

Parameters:

options – SiLU operation options
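
For reference, SiLU is defined as x multiplied by the logistic sigmoid of x. The scalar helper below is a standalone sketch of that definition and is not taken from MLLM; the name silu is chosen for illustration.

```cpp
#include <cmath>

// Standalone sketch (not MLLM code): SiLU(x) = x * sigmoid(x) = x / (1 + exp(-x)).
float silu(float x) {
  return x / (1.0f + std::exp(-x));
}
```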

Embedding Layer

class Embedding

Embedding layer that maps indices to dense vectors.

Embedding::Embedding()

Default constructor.

Embedding::Embedding(const aops::EmbeddingOpOptions &options)

Constructor with options.

Parameters:

options – Embedding operation options

Embedding::Embedding(int32_t vocab_size, int32_t hidden_size)

Constructor with vocabulary and hidden size.

Parameters:

vocab_size – Size of the vocabulary
hidden_size – Dimension of each embedding vector

Tensor Embedding::weight() const

Get the embedding weight matrix.

Returns: Weight tensor of shape [vocab_size, hidden_size]
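
An embedding lookup amounts to copying one row of the [vocab_size, hidden_size] weight matrix per input index. The sketch below illustrates this on flat vectors. It is standalone code rather than the MLLM implementation, and embedding_lookup is a hypothetical name; a row-major weight layout is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone sketch (not MLLM code): gather rows of a row-major
// [vocab_size, hidden_size] matrix. Output is row-major [num_tokens, hidden_size].
// Token ids are assumed to lie in [0, vocab_size).
std::vector<float> embedding_lookup(const std::vector<float>& weight,
                                    std::size_t hidden_size,
                                    const std::vector<std::int32_t>& token_ids) {
  std::vector<float> out;
  out.reserve(token_ids.size() * hidden_size);
  for (std::int32_t id : token_ids) {
    const std::size_t row_start = static_cast<std::size_t>(id) * hidden_size;
    for (std::size_t h = 0; h < hidden_size; ++h) {
      out.push_back(weight[row_start + h]);
    }
  }
  return out;
}
```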

GELU Layer

class GELU

Gaussian Error Linear Unit activation function.

GELU::GELU()

Default constructor.

GELU::GELU(const aops::GELUOpOptions &options)

Constructor with options.

Parameters:

options – GELU operation options
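
GELU is commonly defined as x times the standard normal CDF of x. The sketch below uses the exact erf form. This reference does not say whether MLLM uses the erf form or the tanh approximation, so treat the choice here as an assumption; the name gelu is illustrative.

```cpp
#include <cmath>

// Standalone sketch (not MLLM code): exact GELU,
// GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))).
float gelu(float x) {
  return 0.5f * x * (1.0f + std::erf(x / std::sqrt(2.0f)));
}
```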

QuickGELU Layer

class QuickGELU

An approximation of GELU that is faster to compute.

QuickGELU::QuickGELU()

Default constructor.

QuickGELU::QuickGELU(const aops::QuickGELUOpOptions &options)

Constructor with options.

Parameters:

options – QuickGELU operation options
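
The approximation usually meant by "QuickGELU" replaces the normal CDF with a scaled sigmoid, x * sigmoid(1.702 * x). The sketch below shows that form. It is standalone code, the constant 1.702 is the commonly used value and is assumed rather than confirmed for MLLM, and quick_gelu is a hypothetical name.

```cpp
#include <cmath>

// Standalone sketch (not MLLM code): QuickGELU(x) = x * sigmoid(1.702 * x).
// ASSUMPTION: 1.702 is the commonly used scaling constant.
float quick_gelu(float x) {
  return x / (1.0f + std::exp(-1.702f * x));
}
```

The speed-up comes from evaluating one exponential instead of an error function.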

ReLU Layer

class ReLU

Rectified Linear Unit activation function.

ReLU::ReLU()

Default constructor.

ReLU::ReLU(const aops::ReLUOpOptions &options)

Constructor with options.

Parameters:

options – ReLU operation options

LayerNorm Layer

class LayerNorm

Layer Normalization.

LayerNorm::LayerNorm()

Default constructor.

LayerNorm::LayerNorm(const aops::LayerNormOpOptions &options)

Constructor with options.

Parameters:

options – LayerNorm operation options

LayerNorm::LayerNorm(const std::vector<int32_t> &normalized_shape, bool elementwise_affine = true, bool bias = true, float eps = 1e-6)

Constructor with normalization parameters.

Parameters:

normalized_shape – Shape of the normalized dimensions
elementwise_affine – Whether to use learnable affine parameters (default: true)
bias – Whether to include a bias term (default: true)
eps – Small value added to the denominator for numerical stability (default: 1e-6)
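
The parameters above map onto the standard layer normalization formula as sketched below for a single 1-D slice. This is standalone code, not the MLLM kernel. The name layer_norm is hypothetical, and passing an empty gamma or beta is this sketch's own way of modelling elementwise_affine = false or bias = false.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone sketch (not MLLM code):
// y_i = (x_i - mean) / sqrt(var + eps) * gamma_i + beta_i,
// using the biased variance over the normalized slice.
// Empty gamma means no scale; empty beta means no shift.
std::vector<float> layer_norm(const std::vector<float>& x,
                              const std::vector<float>& gamma,
                              const std::vector<float>& beta,
                              float eps = 1e-6f) {
  const float n = static_cast<float>(x.size());
  float mean = 0.0f;
  for (float v : x) mean += v;
  mean /= n;

  float var = 0.0f;
  for (float v : x) var += (v - mean) * (v - mean);
  var /= n;

  const float inv_std = 1.0f / std::sqrt(var + eps);
  std::vector<float> y(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    float v = (x[i] - mean) * inv_std;
    if (!gamma.empty()) v *= gamma[i];
    if (!beta.empty()) v += beta[i];
    y[i] = v;
  }
  return y;
}
```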

Softmax Layer

class Softmax

Softmax activation function.

Softmax::Softmax()

Default constructor.

Softmax::Softmax(const aops::SoftmaxOpOptions &options)

Constructor with options.

Parameters:

options – Softmax operation options

Softmax::Softmax(int32_t dim)

Constructor with dimension parameter.

Parameters:

dim – Dimension along which to apply softmax
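
As an illustration of what is computed along the chosen dimension, the sketch below applies a numerically stable softmax to one 1-D slice. It is standalone code and does not reflect how MLLM iterates over dim; the name softmax is used only for this example, and the input is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Standalone sketch (not MLLM code): softmax over a single non-empty 1-D slice.
// Subtracting the maximum first keeps exp() from overflowing.
std::vector<float> softmax(const std::vector<float>& x) {
  const float max_v = *std::max_element(x.begin(), x.end());
  std::vector<float> y(x.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = std::exp(x[i] - max_v);
    sum += y[i];
  }
  for (float& v : y) v /= sum;
  return y;
}
```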

VisionRoPE Layer

class VisionRoPE

Rotary Positional Encoding for vision tasks.

VisionRoPE::VisionRoPE()

Default constructor.

VisionRoPE::VisionRoPE(const aops::VisionRoPEOpOptions &Options)

Constructor with options.

Parameters:

Options – VisionRoPE operation options

VisionRoPE::VisionRoPE(const aops::VisionRoPEOpOptionsType type, const aops::Qwen2VLRoPEOpOptions &Options)

Constructor with type and Qwen2VL options.

Parameters:

type – Type of VisionRoPE operation
Options – Qwen2VL RoPE operation options

Conv3D Layer

class Conv3D

3D Convolutional layer.

Conv3D::Conv3D()

Default constructor.

Conv3D::Conv3D(int32_t in_channels, int32_t out_channels, const std::vector<int32_t> &kernel_size, const std::vector<int32_t> &stride_size, bool bias = true, aops::Conv3DOpImplType impl_type = aops::Conv3DOpImplType::kDefault)

Constructor with convolution parameters.

Parameters:

in_channels – Number of input channels
out_channels – Number of output channels
kernel_size – Size of the convolution kernel
stride_size – Stride of the convolution
bias – Whether to include a bias term (default: true)
impl_type – Implementation type (default: kDefault)

Conv3D::Conv3D(const aops::Conv3DOpOptions &options)

Constructor with options.

Parameters:

options – Conv3D operation options

Tensor Conv3D::weight() const

Get the weight tensor of the layer.

Returns: Weight tensor

Tensor Conv3D::bias() const

Get the bias tensor of the layer.

Returns: Bias tensor
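
To show how kernel_size and stride_size determine the output extent, the sketch below computes the output size per spatial dimension. Because the constructor above exposes no padding or dilation arguments, the sketch assumes neither is applied; that is an assumption, not a documented property of MLLM. The name conv3d_output_size is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone sketch (not MLLM code): output extent per dimension for a
// convolution with no padding and no dilation (ASSUMED):
// out = (in - kernel) / stride + 1, using integer division.
std::vector<std::int32_t> conv3d_output_size(
    const std::vector<std::int32_t>& input_size,
    const std::vector<std::int32_t>& kernel_size,
    const std::vector<std::int32_t>& stride_size) {
  std::vector<std::int32_t> out(input_size.size());
  for (std::size_t i = 0; i < input_size.size(); ++i) {
    out[i] = (input_size[i] - kernel_size[i]) / stride_size[i] + 1;
  }
  return out;
}
```

For example, a kernel and stride of {2, 14, 14} applied to an input of extent {2, 28, 28} yields {1, 2, 2}, which is the non-overlapping patch pattern often used for vision patch embedding.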

CausalMask Layer

class CausalMask

Causal (autoregressive) attention mask.

CausalMask::CausalMask()

Default constructor.

CausalMask::CausalMask(const aops::CausalMaskOpOptions &options)

Constructor with options.

Parameters:

options – CausalMask operation options

CausalMask::CausalMask(bool sliding_window, int32_t window_size)

Constructor with sliding window parameters.

Parameters:

sliding_window – Whether to use sliding window attention
window_size – Size of the sliding window
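
The effect of the two parameters can be sketched as a boolean visibility matrix. The code below is standalone and not taken from MLLM. The name causal_mask is hypothetical, and the window convention (a query sees itself plus the window_size - 1 positions before it) is one common choice that is assumed here.

```cpp
#include <cstdint>
#include <vector>

// Standalone sketch (not MLLM code): row-major [seq_len, seq_len] mask where
// true means query position i may attend to key position j.
// ASSUMPTION: with sliding_window, position i sees keys j with i - j < window_size.
std::vector<bool> causal_mask(std::int32_t seq_len,
                              bool sliding_window,
                              std::int32_t window_size) {
  std::vector<bool> mask(static_cast<std::size_t>(seq_len) * seq_len, false);
  for (std::int32_t i = 0; i < seq_len; ++i) {
    for (std::int32_t j = 0; j <= i; ++j) {  // never look at future positions
      const bool in_window = !sliding_window || (i - j) < window_size;
      mask[static_cast<std::size_t>(i) * seq_len + j] = in_window;
    }
  }
  return mask;
}
```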

MultimodalRoPE Layer

class MultimodalRoPE

Rotary Positional Encoding for multimodal tasks.

MultimodalRoPE::MultimodalRoPE()

Default constructor.

MultimodalRoPE::MultimodalRoPE(const aops::MultimodalRoPEOpOptions &options)

Constructor with options.

Parameters:

options – MultimodalRoPE operation options

MultimodalRoPE::MultimodalRoPE(const aops::Qwen2VLMultimodalRoPEOpOptions &options)

Constructor with Qwen2VL multimodal options.

Parameters:

options – Qwen2VL MultimodalRoPE operation options

Param Layer

class Param

Parameter layer that holds trainable parameters.

Param::Param()

Default constructor.

Param::Param(const aops::ParamOpOptions &options)

Constructor with options.

Parameters:

options – Param operation options

Param::Param(const std::string &name, const Tensor::shape_t &shape = {})

Constructor with name and shape.

Parameters:

name – Name of the parameter
shape – Shape of the parameter tensor (default: empty)

Tensor Param::weight() const

Get the parameter tensor.

Returns: Weight tensor

KVCache Layer

class KVCache

Key-Value cache for autoregressive generation.

KVCache::KVCache()

Default constructor.

KVCache::KVCache(const aops::KVCacheOpOptions &options)

Constructor with options.

Parameters:

options – KVCache operation options

KVCache::KVCache(int32_t layer_idx, int32_t q_head, int32_t kv_head, int32_t head_dim, bool use_fa2 = true)

Constructor with cache parameters.

Parameters:

layer_idx – Layer index
q_head – Number of query heads
kv_head – Number of key/value heads
head_dim – Dimension of each head
use_fa2 – Whether to use FlashAttention-2 (default: true)

void KVCache::setLayerIndex(int32_t layer_idx)

Set the layer index.

Parameters:

layer_idx – Layer index
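
Separate q_head and kv_head counts suggest grouped-query attention, where several query heads share one cached key/value head. The sketch below shows the usual index mapping and the cache growth per token under that reading. It is standalone code, both function names are hypothetical, and the grouping scheme is an assumption rather than a documented detail of MLLM.

```cpp
#include <cstdint>

// Standalone sketch (not MLLM code): which cached K/V head a query head reads.
// ASSUMPTION: consecutive query heads are grouped, and q_head % kv_head == 0.
std::int32_t kv_head_for_query_head(std::int32_t query_head_idx,
                                    std::int32_t q_head,
                                    std::int32_t kv_head) {
  const std::int32_t group_size = q_head / kv_head;
  return query_head_idx / group_size;
}

// Number of scalar values one layer stores per generated token:
// one key vector and one value vector of length head_dim for each K/V head.
std::int64_t kv_elements_per_token(std::int32_t kv_head, std::int32_t head_dim) {
  return 2LL * kv_head * head_dim;
}
```

With q_head = 8 and kv_head = 2, query heads 0 to 3 map to cached head 0 and query heads 4 to 7 map to cached head 1.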

STFT Layer

class STFT

Short-Time Fourier Transform layer for signal processing.

STFT::STFT()

Default constructor.

STFT::STFT(const aops::STFTOpOptions &options)

Constructor with options.

Parameters:

options – STFT operation options

STFT::STFT(int n_fft, int hop_length, int win_length, bool onesided = true, bool center = false, const std::string &pad_mode = "constant", bool return_complex = false)

Constructor with STFT parameters.

Parameters:

n_fft – Size of Fourier transform
hop_length – Distance between neighboring sliding window frames
win_length – Size of window frame
onesided – Whether to return only non-negative frequency bins (default: true)
center – Whether to pad input on both sides (default: false)
pad_mode – Padding mode (default: "constant")
return_complex – Whether to return complex tensor (default: false)
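
To relate these parameters to the output size, the sketch below computes the number of frequency bins and frames. It is standalone code. The struct and function names are hypothetical, and the formulas follow the convention used by PyTorch-style STFT implementations, which MLLM is assumed, not confirmed, to share.

```cpp
// Standalone sketch (not MLLM code): output shape of an STFT.
// ASSUMPTION: center = true pads n_fft / 2 samples on each side.
struct StftShape {
  int n_freq;    // number of frequency bins
  int n_frames;  // number of time frames
};

StftShape stft_output_shape(int signal_length, int n_fft, int hop_length,
                            bool onesided = true, bool center = false) {
  const int padded_length =
      center ? signal_length + 2 * (n_fft / 2) : signal_length;
  StftShape shape;
  shape.n_freq = onesided ? n_fft / 2 + 1 : n_fft;
  shape.n_frames = 1 + (padded_length - n_fft) / hop_length;
  return shape;
}
```

Note that win_length does not appear in the shape computation: under this convention it controls only the window applied inside each n_fft-sized frame.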