giagrad.nn#

Containers#

Module

Base class for all neural network modules.

Sequential

A sequential container.

Linear Layers#

Linear

Densely-connected Neural Network layer: \(y = xA^T + b\).
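As a minimal sketch of the shapes involved, assuming Linear modules are callable on a Tensor just like the activation classes shown in the Activations section below:

import giagrad.nn as nn
from giagrad import Tensor

layer = nn.Linear(128, 40)               # y = xA^T + b, mapping 128 input features to 40 outputs
x = Tensor.empty(2, 128).uniform(-1, 1)  # a batch of 2 samples with 128 features each
y = layer(x)                             # assumed forward call; expected shape (2, 40)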

Dropout Layers#

Dropout

Randomly sets elements of the input tensor to zero during training, each with probability p, by sampling from a Bernoulli distribution.

DropoutND

Randomly zeroes a specific dimension of the input tensor with probability p.
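A hedged sketch of Dropout in use, assuming the module is callable on a Tensor and only zeroes elements while training:

import giagrad.nn as nn
from giagrad import Tensor

drop = nn.Dropout(0.4)                   # each element is zeroed with probability p = 0.4
t = Tensor.empty(2, 3).uniform(-10, 10)
out = drop(t)                            # during training, roughly 40% of the entries become zero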

Convolution Layers#

Conv1D

1D convolution layer.

Conv2D

2D convolution layer.

Conv3D

3D convolution layer.

Normalization Layers#

BatchNormND

Applies Batch Normalization as described in Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

LayerNorm

Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization.

Loss Functions#

CrossEntropyLoss

Computes the cross entropy loss between input logits and target.

Activations#

Besides the activation functions already implemented as methods of giagrad.Tensor, there are also classes that, on their own, behave exactly like their homologous giagrad.Tensor methods.

They are useful when creating modules such as Sequential:

import giagrad.nn as nn
model = nn.Sequential(
    nn.Linear(128, 40),
    nn.LeakyReLU(neg_slope=3),
    nn.Dropout(0.4),
    nn.Linear(40, 10)
)

They behave like callable classes:

from giagrad import Tensor
activation = nn.SiLU(alpha=0.5)
t = Tensor.empty(2, 3).uniform(-10, 10)
activation(t)
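Assuming Sequential containers are callable in the same way, a forward pass through the model defined above can be sketched as follows (the batch size and the expected output shape are illustrative):

x = Tensor.empty(8, 128).uniform(-1, 1)  # a batch of 8 samples with 128 features
logits = model(x)                        # assumed to run every layer in order; expected shape (8, 10)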

ReLU

Applies the Rectified Linear Unit (ReLU) function element-wise.

ReLU6

Applies a modified version of ReLU clamped to a maximum value of 6.

Hardswish

Applies the hardswish function, element-wise.

Sigmoid

Applies the element-wise function \(\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}\).

ELU

Applies the Exponential Linear Unit (ELU) function element-wise.

SiLU

Applies the Sigmoid Linear Unit (SiLU) function, element-wise.

Tanh

Applies the hyperbolic tangent function element-wise.

LeakyReLU

Applies element-wise \(\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} \times \min(0, x)\).
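For instance, with \(\text{negative\_slope} = 3\) (as in the Sequential example above), an input of \(-2\) maps to \(\max(0, -2) + 3 \times \min(0, -2) = -6\), while positive inputs pass through unchanged.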

SoftPlus

Applies element-wise \(\text{SoftPlus}(x) = \frac{1}{\text{beta}} \cdot \log(1 + \exp(\text{beta} \times x))\).

Mish

Applies the Mish activation function element-wise.

GELU

Applies element-wise the function \(\text{GELU}(x) = x \times \Phi(x)\).

QuickGELU

Applies the GELU activation function, \(\text{GELU}(x) = x \times \Phi(x)\), approximated using SiLU.

Softmax

Applies the softmax function along 1-D slices of the dimension specified by axis.

LogSoftmax

Applies the log-softmax function along 1-D slices of the dimension specified by axis.
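As a final hedged sketch, assuming both classes take the axis as a keyword argument and are callable like the other activations:

import giagrad.nn as nn
from giagrad import Tensor

t = Tensor.empty(2, 3).uniform(-10, 10)
probs = nn.Softmax(axis=-1)(t)           # each 1-D slice along the last axis is assumed to sum to 1
log_probs = nn.LogSoftmax(axis=-1)(t)    # the logarithm of the same quantity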