giagrad#
giagrad.Tensor and giagrad.tensor.Function constitute the base of giagrad.
giagrad.Tensor can be initialized with an array_like object; in fact, you can create a tensor out of anything the numpy.array constructor accepts. If the input is already a numpy.array, the .data attribute will point to that array.
>>> Tensor(range(10))
tensor: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
>>> Tensor([[1, 2, 1], [3, 4, 3]])
tensor: [[1. 2. 1.]
[3. 4. 3.]]
By default every tensor's data is float32, but this can be changed through the dtype argument
>>> Tensor(range(10), dtype=np.int8)
tensor: [0 1 2 3 4 5 6 7 8 9]
For some specific initializations such as xavier_normal(), you should create an empty tensor and apply the in-place initialization you want; see empty() and Initializers
>>> Tensor.empty(2, 2, 4).xavier_normal()
tensor: [[[-0.21414495 0.38195378 -1.3415855 -1.0419445 ]
[ 0.2715997 0.428172 0.42736086 0.14651838]]
[[ 0.87417895 -0.56151503 0.4281528 -0.65314466]
[ 0.69647044 0.25468382 -0.08594387 -0.8892542 ]]]
Function#
- class giagrad.tensor.Function#
Abstract class for all Tensor operations.
Operations extend the Tensor class to provide additional functionality. The Function class behavior is accessed through the comm() [1] method. To maintain modularity, the operators are implemented in separate files. For developer use.
- Variables:
parents (list of Tensor) – Tensor/s needed for the child class that inherits Function. parents must not contain types other than Tensor; if other attributes are needed, they should be instance variables, e.g. the \(\text{neg_slope}\) variable for Leaky ReLU.
- Makes forward pass.
- Backpropagates from child tensor created with …
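To make the forward/backward roles above concrete, here is a minimal sketch of the pattern in plain numpy. The class name, method signatures, and the way parents is stored are assumptions for exposition only, not giagrad's actual API.
import numpy as np

class ToyReLU:
    """Illustrative only: mimics the Function pattern (parents + forward/backward).
    This is NOT giagrad's API, just the idea behind it."""

    def __init__(self):
        self.parents = []  # tensors (here plain arrays) needed for backpropagation

    def forward(self, x: np.ndarray) -> np.ndarray:
        self.parents = [x]           # keep the input for the backward pass
        return np.maximum(x, 0.0)    # ReLU forward pass

    def backward(self, grad_output: np.ndarray) -> np.ndarray:
        (x,) = self.parents
        return grad_output * (x > 0)  # dReLU/dx is 1 where x > 0, else 0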
Tensor class reference#
- class giagrad.Tensor#
Autodifferentiable multi-dimensional array and the core of giagrad.
Tensor extends the functionality of a numpy.array, implicitly creating an autodifferentiable computational graph with the help of giagrad.tensor.Function. An instance is differentiable iff it has a Function and requires_grad [1] is True. The name is optional and is only used by giagrad.display.
- Variables:
data (array_like) – Weights of the tensor.
requires_grad (bool, default: False) – If True, makes the tensor autodifferentiable.
name (str, optional) – Optional name of the tensor, for display purposes.
dtype (np.float32) – Data type of .data.
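A small usage sketch based on the variable list above; the assumption here is that requires_grad and name can be passed to the constructor, mirroring the attributes they set.
>>> from giagrad import Tensor
>>> w = Tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True, name='w')  # assumed constructor keywords
>>> w.requires_grad
True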
Attributes#
- Returns a transposed view of a 2-dimensional Tensor.
- Tuple of tensor dimensions.
- Data-type of the tensor.
- Size of the tensor.
- Number of dimensions.
Gradient#
- Computes the gradient of all preceding tensors.
- Makes tensor not autodifferentiable.
- Makes tensor autodifferentiable.
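As a reminder of what "computes the gradient of all preceding tensors" means, the numpy sketch below backpropagates through a tiny graph y = sum(w * x) by hand; it is purely conceptual and uses no giagrad calls.
import numpy as np

w = np.array([1.0, -2.0, 3.0], dtype=np.float32)
x = np.array([0.5, 0.5, 0.5], dtype=np.float32)

y = (w * x).sum()    # forward pass: elementwise product, then sum

grad_y = 1.0         # seed gradient at the output
grad_w = grad_y * x  # dy/dw_i = x_i  (chain rule through the product)
grad_x = grad_y * w  # dy/dx_i = w_i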
Class Methods#
- Returns a new instance of an autodifferentiable tensor given a …
- Creates a tensor filled with uninitialized data.
Initializers#
- giagrad.calculate_gain(nonlinearity, neg_slope=None)[source]#
Returns the recommended gain value for a specific nonlinear function.
Some initializers, such as Kaiming uniform or Kaiming normal, are derived from specific nonlinear functions through the PReLU definition and have a recommended gain associated with them.
The values are as follows:
| nonlinearity | gain |
| --- | --- |
| Linear / Identity | \(1\) |
| Conv{1,2,3}D | \(1\) |
| Sigmoid | \(1\) |
| Tanh | \(\frac{5}{3}\) |
| ReLU | \(\sqrt{2}\) |
| Leaky ReLU | \(\sqrt{\frac{2}{1 + \text{negative_slope}^2}}\) |
| SELU | \(\frac{3}{4}\) |
Warning
In order to implement Self-Normalizing Neural Networks, you should use nonlinearity='linear' instead of nonlinearity='selu'. This gives the initial weights a variance of 1 / N, which is necessary to induce a stable fixed point in the forward pass. In contrast, the default gain for SELU sacrifices the normalisation effect for more stable gradient flow in rectangular layers.
- Parameters:
nonlinearity (str) – Name of the nonlinear function.
neg_slope (float, optional) – Negative slope of Leaky ReLU; only used when nonlinearity='leaky_relu'.
Examples
>>> giagrad.calculate_gain('leaky_relu', 2)  # leaky_relu with negative_slope=2
0.6324555320336759
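The value above follows the Leaky ReLU row of the table, \(\sqrt{\frac{2}{1 + \text{negative_slope}^2}}\) with negative_slope=2, as the plain-Python check below shows.
>>> import math
>>> math.sqrt(2 / (1 + 2 ** 2))
0.6324555320336759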
- Fills tensor data with zeros.
- Fills tensor data with ones.
- Fills tensor data with a constant value.
- Fills tensor data with values drawn from the normal distribution \(\mathcal{N}(\text{mu}, \text{std}^2)\).
- Fills Tensor data with values drawn from the uniform distribution \(\mathcal{U}(a, b)\).
- Fills the {3, 4, 5}-dimensional Tensor data with the Dirac delta function.
- Fills Tensor data with Xavier uniform initialization, also known as Glorot uniform.
- Fills Tensor data with Xavier normal initialization, also known as Glorot normal.
- Fills Tensor data with Kaiming uniform initialization, also known as He uniform.
- Fills Tensor data with Kaiming normal initialization, also known as He normal.
- Fills the 2D Tensor data as a sparse matrix.
- Fills Tensor data with a (semi) orthogonal matrix.
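For intuition on the Glorot/Xavier family, the uniform variant samples from \(\mathcal{U}(-a, a)\) with \(a = \text{gain} \cdot \sqrt{6 / (\text{fan_in} + \text{fan_out})}\). The numpy sketch below illustrates that formula only; it is not giagrad's implementation, and the fan computation for tensors with more than two dimensions may differ.
import numpy as np

def xavier_uniform_like(fan_in, fan_out, gain=1.0, rng=None):
    # Standard Glorot/Xavier uniform bound: a = gain * sqrt(6 / (fan_in + fan_out))
    rng = rng or np.random.default_rng()
    a = gain * np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_out, fan_in)).astype(np.float32)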
Math Ops#
Tensor also supports basic arithmetic operations, including reversed and in-place variants. Here's an example showcasing several supported operations:
>>> from giagrad import Tensor
>>> a = Tensor([-4.0, 9.0])
>>> b = Tensor([[2.0], [-3.0]])
>>> c = (a + b) / (a * b) + b**3
>>> d = c * (2 + b + 1) / a
>>> c
tensor: [[ 8.25 8.611111]
[-27.583334 -27.222221]] grad_fn: Sum
>>> d
tensor: [[-10.3125 4.7839503]
[ 0. -0. ]] grad_fn: Div
>>> c @ d
tensor: [[ -85.078125 39.46759 ]
[ 284.45312 -131.9573 ]] grad_fn: Matmul
Note
In-place operations (+=, -=, …) only modify data in place; they do not create a new instance of Tensor. Logical operators (==, >=, …) return a non-differentiable Tensor, i.e. they break the computational graph.
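A hedged illustration of the note: the identity check below should hold if += mutates the existing tensor's data rather than rebinding a to a new Tensor.
>>> from giagrad import Tensor
>>> a = Tensor([1.0, 2.0, 3.0])
>>> before = id(a)
>>> a += 1
>>> id(a) == before
True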
- Returns a new tensor with the square root of the elements of data.
- Returns a new tensor with the square of the elements of data.
- Returns a new tensor with the exponential of the elements of data.
- Returns a new tensor with the natural logarithm of the elements of data.
- Returns a new tensor with the reciprocal of the elements of data.
- Returns a new tensor with the absolute value of the elements of data.
- Returns a new tensor with the sum of data and …
- Returns a new tensor with the subtraction of …
- Returns a new tensor with the multiplication of data by …
- Returns a new tensor with data raised to the power of …
- Returns a new tensor with the matrix multiplication of data and …
- Returns a new tensor with the division of data by …
Activation Functions#
- Applies the Rectified Linear Unit (ReLU) function element-wise.
- Returns a new Tensor with the element-wise sigmoid function.
- Creates a new Tensor applying the Exponential Linear Unit (ELU) function to data.
- Returns a new Tensor with the element-wise Sigmoid-Weighted Linear Unit (SiLU) function, also called Swish.
- Applies the Tanh function element-wise.
- Creates a new Tensor applying the Leaky Rectified Linear Unit (Leaky ReLU) function to data.
- Applies the Softplus function element-wise.
- Returns a new Tensor with element-wise Quick GELU.
- Creates a new Tensor applying the Gaussian Error Linear Unit (GELU) function to data.
- Applies a modified version of ReLU capped at a maximum value of 6.
- Returns a new Tensor with the element-wise Mish function.
- Creates a new Tensor applying the Hard Swish function to data.
- Applies the Softmax function to every 1-D slice defined by …
- Applies the LogSoftmax function to every 1-D slice defined by …
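As a reference for the "every 1-D slice" wording used by Softmax and LogSoftmax, here is the standard definition written in plain numpy; it mirrors the math, not giagrad's code.
import numpy as np

def softmax(x, axis=-1):
    # Shift each slice by its max for numerical stability, then normalize along `axis`.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def log_softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))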
Reduction Ops#
- Returns the mean value of each 1-D slice of the tensor in the given …
- Returns the sum of each 1-D slice of the tensor in the given …
- Returns the maximum value of each 1-D slice of the tensor in the given …
- Returns the minimum value of each 1-D slice of the tensor in the given …
- Calculates the variance over the axis specified by …
- Calculates the standard deviation over the axis specified by …
Indexing, Slicing, Reshaping Ops#
- Returns a new tensor with shape equal to …
- Returns a view of the original tensor with its …
- Permutes two specific axes.
- Pads tensor.
- Removes axes of length one.
- Returns a new tensor with its shape expanded.
Other Operations#
Computes the Einstein summation convention on self and the input operands.
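For readers new to the convention, the numpy equivalent below shows what an Einstein-summation subscript string expresses; it uses numpy.einsum directly, not giagrad's method.
>>> import numpy as np
>>> A = np.arange(6).reshape(2, 3)
>>> np.einsum('ij,kj->ik', A, A)   # pairwise dot products of the rows of A
array([[ 5, 14],
       [14, 50]])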