giagrad#

giagrad.Tensor and giagrad.tensor.Function constitute the base of giagrad.

giagrad.Tensor can be initialized with any array_like object; in fact, you can create a tensor out of anything the numpy.array constructor accepts. If the input is already a numpy.array, the .data attribute will point to that same array.

>>> Tensor(range(10))
tensor: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
>>> Tensor([[1, 2, 1], [3, 4, 3]])
tensor: [[1. 2. 1.]
         [3. 4. 3.]]
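A minimal sketch of the no-copy behavior, assuming the dtypes already match (a dtype conversion would force a copy):

>>> import numpy as np
>>> arr = np.arange(4, dtype=np.float32)
>>> t = Tensor(arr)
>>> t.data is arr
True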

By default every tensor's data is float32, but the data type can be changed.

>>> Tensor(range(10), dtype=np.int8)
tensor: [0 1 2 3 4 5 6 7 8 9]

For specific initializations such as xavier_normal(), create an empty tensor and apply the in-place initializer you want; see empty() and Initializers.

>>> Tensor.empty(2, 2, 4).xavier_normal()
tensor: [[[-0.21414495  0.38195378 -1.3415855  -1.0419445 ]
          [ 0.2715997   0.428172    0.42736086  0.14651838]]

         [[ 0.87417895 -0.56151503  0.4281528  -0.65314466]
          [ 0.69647044  0.25468382 -0.08594387 -0.8892542 ]]]

Function#

class giagrad.tensor.Function#

Abstract class for all Tensor operations.

Operations extend the Tensor class to provide additional functionality. The behavior of each Function is accessed through the comm() method. To maintain modularity, the operators are implemented in separate files.

For developer use.

Variables:

parents (list of Tensor) – Tensor/s needed by the child class that inherits from Function. parents must contain only Tensor instances; if other attributes are needed they should be instance variables, e.g. the \(\text{neg_slope}\) variable for Leaky ReLU.

giagrad.tensor.Function.forward

Computes the forward pass.

giagrad.tensor.Function.backward

Backpropagate from child tensor created with comm().
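As an illustration, a hypothetical element-wise negation operator could look roughly like the sketch below. The constructor, the method signatures, and gradient accumulation via a .grad attribute are assumptions made for the sketch, not the library's actual internals.

import numpy as np
from giagrad import Tensor
from giagrad.tensor import Function

class Neg(Function):
    def forward(self, t: Tensor) -> np.ndarray:
        # remember the parent so backward can reach it
        self.parents = [t]
        return -t.data

    def backward(self, partial: np.ndarray):
        # d(-x)/dx = -1, so the incoming gradient is negated
        t = self.parents[0]
        if t.requires_grad:
            t.grad += -partial

A child tensor would then be created through comm(), e.g. out = Tensor.comm(Neg(), x); the exact comm() signature is likewise an assumption.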

Tensor class reference#

class giagrad.Tensor#

Autodifferentiable multi-dimensional array and the core of giagrad.

Tensor extends the functionality of numpy.array, implicitly creating an autodifferentiable computational graph with the help of giagrad.tensor.Function. An instance is differentiable iff it has a Function and requires_grad. The name is optional, used only by giagrad.display.

Variables:
  • data (array_like) – Weights of the tensor.

  • requires_grad (bool, default: False) – If True, makes the tensor autodifferentiable.

  • name (str, optional) – Optional name of the tensor. For display purposes.

  • dtype (np.dtype, default: np.float32) – Data type of .data.

Attributes#

Tensor.T

Returns a transposed view of a 2-dimensional Tensor.

Tensor.shape

Tuple of tensor dimensions.

Tensor.dtype

Data-type of the tensor.

Tensor.size

Total number of elements in the tensor.

Tensor.ndim

Number of dimensions.

Gradient#

Tensor.backward

Computes the gradient of all preceding tensors.
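A minimal usage sketch; the .grad attribute and its exact repr are assumptions (presumably a numpy.array):

>>> x = Tensor([2.0, 3.0], requires_grad=True)
>>> y = (x * x).sum()
>>> y.backward()
>>> x.grad  # dy/dx_i = 2 * x_i
array([4., 6.], dtype=float32)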

Tensor.no_grad

Makes the tensor not autodifferentiable.

Tensor.requires_grad_

Makes the tensor autodifferentiable.

Class Methods#

Tensor.comm

Returns a new instance of an autodifferentiable tensor given a giagrad.tensor.Function.

Tensor.empty

Creates a tensor filled with uninitialized data.

Initializers#

giagrad.calculate_gain(nonlinearity, neg_slope=None)#

Returns the recommended gain value for a specific nonlinear function.

Some initializers, such as Kaiming uniform or Kaiming normal, are derived from specific nonlinear functions through the PReLU definition and have an associated recommended gain.

The values are as follows:

nonlinearity        gain
------------------  ------------------------------------------------
Linear / Identity   \(1\)
Conv{1,2,3}D        \(1\)
Sigmoid             \(1\)
Tanh                \(\frac{5}{3}\)
ReLU                \(\sqrt{2}\)
Leaky ReLU          \(\sqrt{\frac{2}{1 + \text{negative_slope}^2}}\)
SELU                \(\frac{3}{4}\)

Warning

In order to implement Self-Normalizing Neural Networks, you should use nonlinearity='linear' instead of nonlinearity='selu'. This gives the initial weights a variance of 1 / N, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for SELU sacrifices the normalization effect for more stable gradient flow in rectangular layers.

Parameters:
  • nonlinearity (str) – the non-linear method name

  • neg_slope (Scalar) – optional negative slope constant for Leaky ReLU

Examples

>>> giagrad.calculate_gain('leaky_relu', 2)  # leaky_relu with neg_slope=2
0.6324555320336759

Tensor.zeros

Fills tensor data with zeros.

Tensor.ones

Fills tensor data with ones.

Tensor.constant

Fills tensor data with a constant value.

Tensor.normal

Fills tensor data with values drawn from the normal distribution \(\mathcal{N}(\text{mu}, \text{std}^2)\).

Tensor.uniform

Fills Tensor data with values drawn from the uniform distribution \(\mathcal{U}(a, b)\).

Tensor.dirac

Fills the {3, 4, 5}-dimensional Tensor data with the Dirac delta function.

Tensor.xavier_uniform

Fills Tensor data with Xavier uniform initialization, also known as Glorot uniform.

Tensor.xavier_normal

Fills Tensor data with Xavier normal initialization, also known as Glorot normal.

Tensor.kaiming_uniform

Fills Tensor data with Kaiming uniform initialization, also known as He uniform.

Tensor.kaiming_normal

Fills Tensor data with Kaiming normal initialization, also known as He normal.

Tensor.sparse

Fills the 2D Tensor data as a sparse matrix.

Tensor.orthogonal

Fills Tensor data with a (semi) orthogonal matrix.
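Like xavier_normal() in the first example, these initializers modify the tensor's data in place and return the tensor itself, so calls can be chained after empty(). The parameter names below are assumptions based on the formulas above (random outputs omitted):

>>> Tensor.empty(2, 3).uniform(a=-0.1, b=0.1)    # values drawn from U(-0.1, 0.1)
>>> Tensor.empty(4, 4).normal(mu=0.0, std=0.01)  # values drawn from N(0, 0.01^2)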

Math Ops#

Tensor also supports basic arithmetic operations, including their reverse and in-place variants. Here’s an example that showcases several of the supported operations:

>>> from giagrad import Tensor
>>> a = Tensor([-4.0, 9.0])
>>> b = Tensor([[2.0], [-3.0]])
>>> c = (a + b) / (a * b) + b**3
>>> d = c * (2 + b + 1) / a
>>> c
tensor: [[  8.25       8.611111]
         [-27.583334 -27.222221]] grad_fn: Sum
>>> d
tensor: [[-10.3125      4.7839503]
         [  0.         -0.       ]] grad_fn: Div
>>> c @ d
tensor: [[ -85.078125   39.46759 ]
         [ 284.45312  -131.9573  ]] grad_fn: Matmul

Note

In-place operations (+=, -=, …) only modify data in place; they do not create a new instance of Tensor. Logical operators (==, >=, …) return a non-differentiable Tensor, i.e. they break the computational graph, as sketched below.
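A minimal sketch of both behaviors; the exact repr of a boolean Tensor is an assumption:

>>> a = Tensor([1.0, 2.0])
>>> before = a
>>> a += 1           # modifies a.data in place
>>> a is before      # still the very same Tensor instance
True
>>> a >= 3           # logical op: result is not differentiable
tensor: [False  True]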

Tensor.sqrt

Returns a new tensor with the square-root of the elements of data.

Tensor.square

Returns a new tensor with the square of the elements of data.

Tensor.exp

Returns a new tensor with the exponential of the elements of data.

Tensor.log

Returns a new tensor with the natural logarithm of the elements of data.

Tensor.reciprocal

Returns a new tensor with the reciprocal of the elements of data.

Tensor.abs

Returns a new tensor with the absolute value of the elements of data.

Tensor.add

Returns a new tensor with the sum of data and other.

Tensor.sub

Returns a new tensor with the subtraction of other from data.

Tensor.mul

Returns a new tensor with the multiplication of data by other.

Tensor.pow

Returns a new tensor with data raised to the power of other.

Tensor.matmul

Returns a new tensor with the matrix multiplication of data and other.

Tensor.div

Returns a new tensor with the division of data by other.
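A short sketch exercising a few of these ops; expected values are shown as comments rather than guessing the exact repr:

>>> t = Tensor([1.0, 4.0, 9.0])
>>> r = t.sqrt()          # [1. 2. 3.]
>>> s = r.square()        # back to [1. 4. 9.]
>>> v = t.reciprocal()    # [1. 0.25 0.111...]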

Activation Functions#

Tensor.relu

Applies the Rectified Linear Unit (ReLU) function element-wise.

Tensor.sigmoid

Returns a new Tensor with element-wise sigmoid function.

Tensor.elu

Creates a new Tensor applying Exponential Linear Unit (ELU) function to data.

Tensor.silu

Returns a new Tensor with element-wise Sigmoid-Weighted Linear Unit (SiLU) function, also called Swish.

Tensor.tanh

Applies the Tanh function element-wise.

Tensor.leakyrelu

Creates a new Tensor applying Leaky Rectified Linear Unit (Leaky ReLU) function to data.

Tensor.softplus

Applies the Softplus function element-wise.

Tensor.quick_gelu

Returns a new Tensor with element-wise Quick GELU.

Tensor.gelu

Creates a new Tensor applying the Gaussian Error Linear Unit (GELU) function to data.

Tensor.relu6

Applies a modified version of ReLU whose output is capped at a maximum value of 6.

Tensor.mish

Returns a new Tensor with element-wise Mish function.

Tensor.hardswish

Creates a new Tensor applying Hard Swish function to data.

Tensor.softmax

Applies Softmax function to every 1-D slice defined by axis.

Tensor.log_softmax

Applies LogSoftmax function to every 1-D slice defined by axis.
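For instance, applying softmax along the last axis of a 2-D tensor; the axis keyword name is an assumption based on the descriptions above:

>>> x = Tensor([[1.0, 2.0, 3.0]])
>>> p = x.softmax(axis=-1)   # each 1-D slice sums to 1, roughly [[0.09 0.245 0.665]]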

Reduction Ops#

Tensor.mean

Returns the mean value of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.sum

Returns the sum of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.max

Returns the maximum value of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.min

Returns the minimum value of each 1-D slice of the tensor along the given axis; if axis is a list of dimensions, reduces over all of them.

Tensor.var

Calculates the variance over the axis specified by axis.

Tensor.std

Calculates the standard deviation over the axis specified by axis.
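For example; the axis keyword name is an assumption:

>>> t = Tensor([[1.0, 2.0], [3.0, 4.0]])
>>> s = t.sum(axis=0)        # column-wise sums: [4. 6.]
>>> m = t.mean(axis=[0, 1])  # reduce over both axes: 2.5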

Indexing, Slicing, Reshaping Ops#

Tensor.reshape

Returns a new tensor whose shape equals newshape.

Tensor.permute

Returns a view of the original tensor with its axes permuted.

Tensor.swapaxes

Permutes two specific axes.

Tensor.pad

Pads tensor.

Tensor.squeeze

Removes axes of length one.

Tensor.unsqueeze

Returns a new tensor with its shape expanded.
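For example, assuming these methods mirror their numpy counterparts (the newshape argument form is an assumption):

>>> t = Tensor.empty(2, 1, 3)
>>> t.squeeze().shape        # the length-1 axis is removed
(2, 3)
>>> t.reshape((3, 2)).shape
(3, 2)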

Other Operations#

Tensor.einsum

Computes the Einstein summation convention on self and the input operands.
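A sketch, assuming a numpy.einsum-style subscript string with self as the first operand:

>>> a = Tensor([[1.0, 2.0], [3.0, 4.0]])
>>> b = a.einsum('ij->ji')        # transpose
>>> c = a.einsum('ij,jk->ik', a)  # matrix product with a second operand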