Cross-Entropy Loss

Reference

Cross-Entropy loss

The Cross-Entropy Loss is actually the only loss we are discussing here. The other loss names written in the title are simply other names for it, or variations of it. The CE Loss is defined as:

CE = -\sum_{i}^{C} t_{i} \log(s_{i})

Where t_i and s_i are the ground truth and the CNN score for each class i in C. As an activation function (Sigmoid / Softmax) is usually applied to the scores before the CE Loss computation, we write f(s_i) to refer to the activations.
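
As a concrete illustration (a minimal sketch with made-up numbers, not taken from the original text), the loss for a single sample is computed on the Softmax-activated scores:

import numpy as np

# Hypothetical single sample with 3 classes; class 0 is the true class.
t = np.array([1.0, 0.0, 0.0])        # one-hot ground truth t_i
s = np.array([2.0, 1.0, 0.1])        # raw CNN scores s_i (made-up values)

f_s = np.exp(s) / np.sum(np.exp(s))  # Softmax activation f(s_i)
ce = -np.sum(t * np.log(f_s))        # CE = -sum_i t_i * log(f(s_i))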

In a binary classification problem, where C' = 2, the Cross-Entropy Loss can also be defined as [discussion]:

CE = -\sum_{i=1}^{C'=2} t_{i} \log(s_{i}) = -t_{1} \log(s_{1}) - (1 - t_{1}) \log(1 - s_{1})

Where it’s assumed that there are two classes: C_1 and C_2. t_1 ∈ [0,1] and s_1 are the ground truth and the score for C_1, and t_2 = 1 - t_1 and s_2 = 1 - s_1 are the ground truth and the score for C_2. That is the case when we split a Multi-Label classification problem into C binary classification problems. See the next Binary Cross-Entropy Loss section for more details.
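
A minimal sketch of this binary formulation, assuming a made-up raw score and a Sigmoid activation for s_1:

import numpy as np

def binary_cross_entropy(t1, s1):
    # CE = -t1 * log(s1) - (1 - t1) * log(1 - s1), with s1 the activated score for C1
    return -t1 * np.log(s1) - (1 - t1) * np.log(1 - s1)

raw_score = 0.8                        # made-up raw CNN score for C1
s1 = 1 / (1 + np.exp(-raw_score))      # Sigmoid activation
binary_cross_entropy(t1=1.0, s1=s1)    # ground truth: the sample belongs to C1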

Logistic Loss and Multinomial Logistic Loss are other names for Cross-Entropy loss. [Discussion]

import numpy as np
from sklearn.metrics import log_loss  # used below to cross-check the result


def softmax(X):
    # Softmax over a single vector of raw scores
    exps = np.exp(X)
    return exps / np.sum(exps)


def cross_entropy(predictions, targets):
    # Mean categorical cross-entropy over a batch of N samples:
    # CE = -sum_i t_i * log(s_i), averaged over N
    N = predictions.shape[0]
    ce = -np.sum(targets * np.log(predictions)) / N
    return ce


predictions = np.array([[0.25, 0.25, 0.25, 0.25], [0.01, 0.01, 0.01, 0.97]]) # (N, num_classes)
targets = np.array([[1, 0, 0, 0], [0, 0, 0, 1]]) # (N, num_classes)

cross_entropy(predictions, targets)
# 0.7083767843022996

log_loss(targets, predictions)
# 0.7083767843022996

log_loss(targets, predictions) == cross_entropy(predictions, targets)
# True
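
The softmax helper above is not exercised by the example; one possible way to feed it into cross_entropy is to apply it row by row to raw scores (a sketch with made-up logits, not part of the original snippet):

logits = np.array([[2.0, 0.5, 0.3, 0.1],
                   [0.1, 0.2, 0.3, 4.0]])        # (N, num_classes), made-up raw scores
probs = np.apply_along_axis(softmax, 1, logits)  # Softmax applied row by row
cross_entropy(probs, targets)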

The layers of Caffe, Pytorch and Tensorflow that use a Cross-Entropy loss without an embedded activation function are:
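
As one illustration (Pytorch only, and an assumption rather than necessarily the exact layers the original list gives), torch.nn.functional.nll_loss expects log-probabilities, so the Softmax / log activation has to be applied explicitly before the loss:

import torch
import torch.nn.functional as F

scores = torch.tensor([[2.0, 0.5, 0.3, 0.1]])  # raw scores (logits), made-up values
target = torch.tensor([0])                     # ground-truth class index

log_probs = F.log_softmax(scores, dim=1)       # activation applied outside the loss
loss = F.nll_loss(log_probs, target)           # nll_loss itself embeds no activation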
