giagrad.nn.CrossEntropyLoss

class giagrad.nn.CrossEntropyLoss(*args, **kwargs)

Computes the cross entropy loss between input logits and target.

It is useful when training a classification problem with C classes. pred is expected to contain the unnormalized logits for each class (which do not need to be positive or sum to 1, in general) and has to be a Tensor of size \((C)\) or \((N, C)\).

The target that this criterion expects should contain class indices in the range \([0, C)\), where \(C\) is the number of classes, and has size \(()\) or \((N)\) accordingly. reduction can either be 'mean' (default) or 'sum':

\[\begin{split}\ell(x, y) = \begin{cases} \sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n} \cdot \mathbb{1}\{y_n \not= \text{ignore_index}\}} l_n, & \text{if reduction} = \text{`mean';}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{`sum'.} \end{cases}\end{split}\]

where \(l_n\) is:

\[l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})} \cdot \mathbb{1}\{y_n \not= \text{ignore_index}\}\]

and \(x\) is the input, \(y\) is the target, \(w\) is the weight, \(C\) is the number of classes, and \(N\) spans the minibatch dimension.
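
As a concrete reference for the formulas above, the following is a minimal NumPy sketch of the unweighted batched case (all \(w_{y_n} = 1\), no ignore_index) with the default 'mean' reduction. It only illustrates the math and is not giagrad's implementation:

>>> import numpy as np
>>> x = np.array([[1.0, 2.0, 0.5],
...               [0.1, 0.0, 3.0]])               # logits, shape (N, C) = (2, 3)
>>> y = np.array([1, 2])                          # target class indices, shape (N,)
>>> log_softmax = x - np.log(np.exp(x).sum(axis=1, keepdims=True))
>>> l = -log_softmax[np.arange(len(y)), y]        # per-sample losses l_n
>>> loss = l.mean()                               # 'mean' reduction; l.sum() for 'sum'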

Variables:

reduction (str, default: 'mean') – Specifies the reduction applied to the output: 'mean' | 'sum'.

Parameters:
  • pred (Tensor) – Unnormalized logits.

  • target (Tensor or array_like) – True labels.

Examples

>>> import numpy as np
>>> from giagrad import Tensor
>>> import giagrad.nn as nn
>>> loss = nn.CrossEntropyLoss()
>>> input = Tensor.empty(3, 5, requires_grad=True).uniform()
>>> target = Tensor.empty(3, dtype=np.int8).uniform(b=5)
>>> output = loss(input, target)
>>> output.backward()
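
The total (unaveraged) loss can be obtained by switching reduction to 'sum'. Assuming the constructor accepts it as a keyword argument (it is documented as an attribute under Variables above), that would look like:

>>> loss_sum = nn.CrossEntropyLoss(reduction='sum')  # assumes reduction is accepted as a keyword argument
>>> output_sum = loss_sum(input, target)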