giagrad.nn.CrossEntropyLoss
- class giagrad.nn.CrossEntropyLoss(*args, **kwargs)
Computes the cross entropy loss between input logits and target.
It is useful when training a classification problem with C classes. The input is expected to contain the unnormalized logits for each class (which do not need to be positive or sum to 1, in general). input has to be a Tensor of size \((C)\) for a single sample or \((N, C)\) for a minibatch of \(N\) samples.
The target that this criterion expects should contain indices in the range \([0, C)\) where \(C\) is the number of classes.
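For concreteness, the shape and dtype contract can be sketched with plain NumPy arrays; this is only an illustration, not giagrad's API, and the batch size of 3 and the 5 classes are arbitrary assumptions:

>>> import numpy as np
>>> N, C = 3, 5                                # batch size and number of classes
>>> logits = np.random.randn(N, C)             # input: unnormalized scores, shape (N, C)
>>> targets = np.random.randint(0, C, size=N)  # target: integer class indices in [0, C), shape (N,)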
reduction can either be 'mean' (default) or 'sum':

\[\begin{split}\ell(x, y) = \begin{cases} \sum_{n=1}^N \frac{1}{\sum_{n=1}^N w_{y_n} \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}} l_n, & \text{if reduction} = \text{`mean';}\\ \sum_{n=1}^N l_n, & \text{if reduction} = \text{`sum'.} \end{cases}\end{split}\]

where \(l_n\) is:
\[l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})} \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}\]

and \(x\) is the input, \(y\) is the target, \(w\) is the weight, \(C\) is the number of classes, and \(N\) spans the minibatch dimension.
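As a cross-check of the formula, the per-sample losses and both reductions can be computed directly with NumPy. This is only a sketch of the math, not giagrad's implementation; the weights \(w\) are assumed to be all ones and the ignore_index term is dropped, so the 'mean' denominator reduces to \(N\):

>>> import numpy as np
>>> x = np.array([[2.0, 0.5, -1.0], [0.1, 1.2, 0.3]])  # logits, shape (N, C) = (2, 3)
>>> y = np.array([0, 2])                                # targets: class indices in [0, C)
>>> shifted = x - x.max(axis=1, keepdims=True)          # shift logits for numerical stability
>>> log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
>>> l = -log_softmax[np.arange(len(y)), y]              # per-sample losses l_n (with w = 1)
>>> mean_loss = l.mean()                                # reduction = 'mean'
>>> sum_loss = l.sum()                                  # reduction = 'sum'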
- Variables:
reduction (str, default: 'mean') – Specifies the reduction applied to the output: 'mean' | 'sum'.
Examples
>>> loss = nn.CrossEntropyLoss()
>>> input = Tensor.empty(3, 5, requires_grad=True).uniform()
>>> target = Tensor.empty(3, dtype=np.int8).uniform(b=5)
>>> output = loss(input, target)
>>> output.backward()