giagrad#

giagrad.Tensor and giagrad.tensor.Function constitute the base of giagrad.

giagrad.Tensor can be initialized with any array_like object; in fact, you can create a tensor out of anything the numpy.array constructor accepts. If the input is already a numpy.array, the .data attribute will point to that same array.

>>> Tensor(range(10))
tensor: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
>>> Tensor([[1, 2, 1], [3, 4, 3]])
tensor: [[1. 2. 1.]
         [3. 4. 3.]]
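A minimal sketch of the no-copy behavior, assuming the dtypes already match (a dtype conversion would force a copy):

>>> import numpy as np
>>> arr = np.arange(4, dtype=np.float32)
>>> t = Tensor(arr)
>>> t.data is arr
True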

By default every tensor's data is float32, but the data type can be changed.

>>> Tensor(range(10), dtype=np.int8)
tensor: [0 1 2 3 4 5 6 7 8 9]

For specific initializations such as xavier_normal(), create an empty tensor and apply the in-place initializer you want; see empty() and Initializers.

>>> Tensor.empty(2, 2, 4).xavier_normal()
tensor: [[[-0.21414495  0.38195378 -1.3415855  -1.0419445 ]
          [ 0.2715997   0.428172    0.42736086  0.14651838]]

         [[ 0.87417895 -0.56151503  0.4281528  -0.65314466]
          [ 0.69647044  0.25468382 -0.08594387 -0.8892542 ]]]

Function#

class giagrad.tensor.Function#

Abstract class for all Tensor operations.

Operations extend the Tensor class to provide additional functionality. The behavior of each Function is accessed through the comm() method. To maintain modularity, the operators are implemented in separate files.

For developer use.

Variables:

parents (list of Tensor) – Tensor/s needed by the child class that inherits from Function. parents must contain only Tensor instances; if other attributes are needed they should be instance variables, e.g. the \(\text{neg_slope}\) variable for Leaky ReLU.

giagrad.tensor.Function.forward

Computes the forward pass.

giagrad.tensor.Function.backward

Backpropagate from child tensor created with comm().
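As an illustration, a hypothetical element-wise negation operator could look roughly like the sketch below. The constructor, the method signatures, and gradient accumulation via a .grad attribute are assumptions made for the sketch, not the library's actual internals.

import numpy as np
from giagrad import Tensor
from giagrad.tensor import Function

class Neg(Function):
    def forward(self, t: Tensor) -> np.ndarray:
        # remember the parent so backward can reach it
        self.parents = [t]
        return -t.data

    def backward(self, partial: np.ndarray):
        # d(-x)/dx = -1, so the incoming gradient is negated
        t = self.parents[0]
        if t.requires_grad:
            t.grad += -partial

A child tensor would then be created through comm(), e.g. out = Tensor.comm(Neg(), x); the exact comm() signature is likewise an assumption.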

Tensor class reference#

class giagrad.Tensor#

Autodifferentiable multi-dimensional array and the core of giagrad.

Tensor extends the functionality of numpy.array, implicitly creating an autodifferentiable computational graph with the help of giagrad.tensor.Function. An instance is differentiable iff it has a Function and requires_grad. The name is optional, used only by giagrad.display.

Variables:
  • data (array_like) – Weights of the tensor.

  • requires_grad (bool, default: False) – If True, makes the tensor autodifferentiable.

  • name (str, optional) – Optional name of the tensor. For display purposes.

  • dtype (np.dtype, default: np.float32) – Data type of .data.

Attributes#

Tensor.T

Returns a transposed view of a 2-dimensional Tensor.

Tensor.shape

Tuple of tensor dimensions.

Tensor.dtype

Data-type of the tensor.

Tensor.size

Total number of elements in the tensor.

Tensor.ndim

Number of dimensions.

Gradient#

Tensor.backward

Computes the gradient of all preceding tensors.
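A minimal usage sketch; the .grad attribute and its exact repr are assumptions (presumably a numpy.array):

>>> x = Tensor([2.0, 3.0], requires_grad=True)
>>> y = (x * x).sum()
>>> y.backward()
>>> x.grad  # dy/dx_i = 2 * x_i
array([4., 6.], dtype=float32)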

Tensor.no_grad

Makes the tensor not autodifferentiable.

Tensor.requires_grad_

Makes the tensor autodifferentiable.

Class Methods#

Tensor.comm

Returns a new instance of an autodifferentiable tensor given a giagrad.tensor.Function.

Tensor.empty

Creates a tensor filled with uninitialized data.

Initializers#

giagrad.calculate_gain(nonlinearity, neg_slope=None)#

Returns the recommended gain value for a specific nonlinear function.

Some initializers, such as Kaiming uniform or Kaiming normal, are derived from specific nonlinear functions through the PReLU definition and have an associated recommended gain.

The values are as follows:

nonlinearity        gain
------------------  ------------------------------------------------
Linear / Identity   \(1\)
Conv{1,2,3}D        \(1\)
Sigmoid             \(1\)
Tanh                \(\frac{5}{3}\)
ReLU                \(\sqrt{2}\)
Leaky ReLU          \(\sqrt{\frac{2}{1 + \text{negative_slope}^2}}\)
SELU                \(\frac{3}{4}\)

Warning

In order to implement Self-Normalizing Neural Networks, you should use nonlinearity='linear' instead of nonlinearity='selu'. This gives the initial weights a variance of 1 / N, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for SELU sacrifices the normalization effect for more stable gradient flow in rectangular layers.

Parameters:
  • nonlinearity (str) – the non-linear method name

  • neg_slope (Scalar) – optional negative slope constant for Leaky ReLU

Examples

>>> giagrad.calculate_gain('leaky_relu', 2)  # leaky_relu with neg_slope=2
0.6324555320336759

Tensor.zeros

Fills tensor data with zeros.

Tensor.ones

Fills tensor data with ones.

Tensor.constant

Fills tensor data with a constant value.

Tensor.normal

Fills tensor data with values drawn from the normal distribution \(\mathcal{N}(\text{mu}, \text{std}^2)\).

Tensor.uniform

Fills Tensor data with values drawn from the uniform distribution \(\mathcal{U}(a, b)\).

Tensor.dirac

Fills the {3, 4, 5}-dimensional Tensor data with the Dirac delta function.

Tensor.xavier_uniform

Fills Tensor data with Xavier uniform initialization, also known as Glorot uniform.

Tensor.xavier_normal

Fills Tensor data with Xavier normal initialization, also known as Glorot normal.

Tensor.kaiming_uniform

Fills Tensor data with Kaiming uniform initialization, also known as He uniform.

Tensor.kaiming_normal

Fills Tensor data with Kaiming normal initialization, also known as He normal.

Tensor.sparse

Fills the 2D Tensor data as a sparse matrix.

Tensor.orthogonal

Fills Tensor data with a (semi) orthogonal matrix.
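Like xavier_normal() in the first example, these initializers modify the tensor's data in place and return the tensor itself, so calls can be chained after empty(). The parameter names below are assumptions based on the formulas above (random outputs omitted):

>>> Tensor.empty(2, 3).uniform(a=-0.1, b=0.1)    # values drawn from U(-0.1, 0.1)
>>> Tensor.empty(4, 4).normal(mu=0.0, std=0.01)  # values drawn from N(0, 0.01^2)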

Math Ops#

Tensor also supports basic arithmetic operations, including their reverse and in-place variants. Here’s an example that showcases several of the supported operations:

>>> from giagrad import Tensor
>>> a = Tensor([-4.0, 9.0])
>>> b = Tensor([[2.0], [-3.0]])
>>> c = (a + b) / (a * b) + b**3
>>> d = c * (2 + b + 1) / a
>>> c
tensor: [[  8.25       8.611111]
         [-27.583334 -27.222221]] grad_fn: Sum
>>> d
tensor: [[-10.3125      4.7839503]
         [  0.         -0.       ]] grad_fn: Div
>>> c @ d
tensor: [[ -85.078125   39.46759 ]
         [ 284.45312  -131.9573  ]] grad_fn: Matmul

Note

In-place operations (+=, -=, …) only modify data in place; they do not create a new instance of Tensor. Logical operators (==, >=, …) return a non-differentiable Tensor, i.e. they break the computational graph, as sketched below.
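A minimal sketch of both behaviors; the exact repr of a boolean Tensor is an assumption:

>>> a = Tensor([1.0, 2.0])
>>> before = a
>>> a += 1           # modifies a.data in place
>>> a is before      # still the very same Tensor instance
True
>>> a >= 3           # logical op: result is not differentiable
tensor: [False  True]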

Tensor.sqrt

Returns a new tensor with the square-root of the elements of data.

Tensor.square

Returns a new tensor with the square of the elements of data.

Tensor.exp

Returns a new tensor with the exponential of the elements of data.

Tensor.log

Returns a new tensor with the natural logarithm of the elements of data.

Tensor.reciprocal

Returns a new tensor with the reciprocal of the elements of data.

Tensor.abs

Returns a new tensor with the absolute value of the elements of data.

Tensor.add

Returns a new tensor with the sum of data and other.

Tensor.sub

Returns a new tensor with the subtraction of other from data.

Tensor.mul

Returns a new tensor with the multiplication of data by other.

Tensor.pow

Returns a new tensor with data raised to the power of other.

Tensor.matmul

Returns a new tensor with the matrix multiplication of data and other.

Tensor.div

Returns a new tensor with the division of data by other.
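A short sketch exercising a few of these ops; expected values are shown as comments rather than guessing the exact repr:

>>> t = Tensor([1.0, 4.0, 9.0])
>>> r = t.sqrt()          # [1. 2. 3.]
>>> s = r.square()        # back to [1. 4. 9.]
>>> v = t.reciprocal()    # [1. 0.25 0.111...]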

Activation Functions#

Tensor.relu

Applies the Rectified Linear Unit (ReLU) function element-wise.

Tensor.sigmoid

Returns a new Tensor with element-wise sigmoid function.

Tensor.elu

Creates a new Tensor applying Exponential Linear Unit (ELU) function to data.

Tensor.silu

Returns a new Tensor with element-wise Sigmoid-Weighted Linear Unit (SiLU) function, also called Swish.

Tensor.tanh

Applies the Tanh function element-wise.

Tensor.leakyrelu

Creates a new Tensor applying Leaky Rectified Linear Unit (Leaky ReLU) function to data.

Tensor.softplus

Applies the Softplus function element-wise.

Tensor.quick_gelu

Returns a new Tensor with element-wise Quick GELU.

Tensor.gelu

Creates a new Tensor applying the Gaussian Error Linear Unit (GELU) function to data.

Tensor.relu6

Applies a modified version of ReLU whose output is capped at a maximum value of 6.

Tensor.mish

Returns a new Tensor with element-wise Mish function.

Tensor.hardswish

Creates a new Tensor applying Hard Swish function to data.

Tensor.softmax

Applies Softmax function to every 1-D slice defined by axis.

Tensor.log_softmax

Applies LogSoftmax function to every 1-D slice defined by axis.
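For instance, applying softmax along the last axis of a 2-D tensor; the axis keyword name is an assumption based on the descriptions above:

>>> x = Tensor([[1.0, 2.0, 3.0]])
>>> p = x.softmax(axis=-1)   # each 1-D slice sums to 1, roughly [[0.09 0.245 0.665]]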

Reduction Ops#

Tensor.mean

Returns the mean value of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.sum

Returns the sum of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.max

Returns the maximum value of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.min

Returns the minimum value of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.var

Calculates the variance over the axis specified by axis.

Tensor.std

Calculates the standard deviation over the axis specified by axis.
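For example; the axis keyword name is an assumption:

>>> t = Tensor([[1.0, 2.0], [3.0, 4.0]])
>>> s = t.sum(axis=0)        # column-wise sums: [4. 6.]
>>> m = t.mean(axis=[0, 1])  # reduce over both axes: 2.5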

Indexing, Slicing, Reshaping Ops#

Tensor.reshape

Returns a new tensor whose shape equals newshape.

Tensor.permute

Returns a view of the original tensor with its axes permuted.

Tensor.swapaxes

Permutes two specific axes.

Tensor.pad

Pads tensor.

Tensor.squeeze

Removes axes of length one.

Tensor.unsqueeze

Returns a new tensor with its shape expanded.
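For example, assuming these methods mirror their numpy counterparts (the newshape argument form is an assumption):

>>> t = Tensor.empty(2, 1, 3)
>>> t.squeeze().shape        # the length-1 axis is removed
(2, 3)
>>> t.reshape((3, 2)).shape
(3, 2)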

Other Operations#

Tensor.einsum

Computes the Einstein summation convention on self and the input operands.
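A sketch, assuming a numpy.einsum-style subscript string with self as the first operand:

>>> a = Tensor([[1.0, 2.0], [3.0, 4.0]])
>>> b = a.einsum('ij->ji')        # transpose
>>> c = a.einsum('ij,jk->ik', a)  # matrix product with a second operand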