# Neural Networks note

**1. Is KL divergence the same as cross entropy for image classification?**

__Yes__

In image classification, we use one-hot encoding for our labels, and KL divergence differs from cross entropy only by the entropy of the label distribution: D_KL(y ‖ p) = H(y, p) − H(y). For a one-hot label, that entropy H(y) is zero: for the actual class, y_i = 1 → log(1) = 0, so the term cancels; for every other class, y_i = 0, so the term is also cancelled out.

Therefore, KL divergence = Cross Entropy in image classification tasks
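
A minimal NumPy sketch to make this concrete; the label `y` and prediction `p` below are made-up values purely for illustration:

```python
import numpy as np

# One-hot label (the true class is index 2) and a model's predicted distribution.
y = np.array([0.0, 0.0, 1.0, 0.0])
p = np.array([0.1, 0.2, 0.6, 0.1])

eps = 1e-12  # avoid log(0)

# Cross entropy: H(y, p) = -sum_i y_i * log(p_i)
cross_entropy = -np.sum(y * np.log(p + eps))

# KL divergence: D_KL(y || p) = sum_i y_i * log(y_i / p_i);
# terms with y_i = 0 contribute 0 by the convention 0 * log(0) = 0.
mask = y > 0
kl = np.sum(y[mask] * np.log(y[mask] / p[mask]))

print(cross_entropy, kl)  # both ≈ 0.51, since the one-hot label has zero entropy
```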

**2. Why does cross entropy go hand in hand with a softmax layer?**

__Why do we need to apply a softmax function before the cross-entropy loss?__

Because the cross-entropy loss takes the *logarithm of a probability*. For that logarithm to be well defined, the network's outputs must form *a probability distribution that sums up to 1*, with every value in (0, 1]; softmax converts the raw logits into exactly such a distribution.
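
A small sketch of this pipeline, with made-up logits chosen only for illustration:

```python
import numpy as np

def softmax(logits):
    # Shift by the max for numerical stability; the output is unchanged.
    z = logits - np.max(logits)
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)

# Raw network outputs (logits) can be negative and do not sum to 1.
logits = np.array([2.0, -1.0, 0.5])
probs = softmax(logits)      # a valid distribution: values in (0, 1), summing to 1
true_class = 0

# With a one-hot label, cross entropy reduces to -log of the true class probability.
loss = -np.log(probs[true_class])
print(probs, probs.sum(), loss)
```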

**3. Contrastive loss**

- Contrastive loss is a **distance-based** loss function (as opposed to **prediction-error-based** losses like cross entropy) used to learn discriminative features for images.
- Like any distance-based loss, it tries to ensure that semantically similar examples are embedded close together.
- It is calculated on **pairs**: the loss measures the *similarity* between two inputs.
- Each sample is composed of two images (a *positive pair* or a *negative pair*). Our goal is to __minimize__ the distance between *positive pairs* and maximize the distance between *negative pairs*.
- Concretely, we want a small distance for positive pairs (because they are similar images/inputs), and a distance greater than some margin m for negative pairs; see the sketch after this list.
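
A minimal sketch of the standard pairwise contrastive loss, assuming Euclidean distance between embeddings and a margin m = 1.0; the function name and toy embeddings are illustrative, not from any particular library:

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, is_positive_pair, margin=1.0):
    """Contrastive loss for a single pair of embeddings.

    Positive pairs are pulled together (loss = d^2); negative pairs are
    pushed apart until their distance exceeds the margin m.
    """
    d = np.linalg.norm(emb_a - emb_b)   # Euclidean distance between the embeddings
    if is_positive_pair:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Toy embeddings for illustration.
a = np.array([0.10, 0.90])
b = np.array([0.15, 0.85])   # close to a  -> small loss as a positive pair
c = np.array([0.90, 0.10])   # far from a  -> zero loss as a negative pair (beyond the margin)

print(contrastive_loss(a, b, is_positive_pair=True))
print(contrastive_loss(a, c, is_positive_pair=False, margin=1.0))
```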
