# Why does cross entropy go hand in hand with a softmax layer?

__Why do we need to apply the softmax function before the cross-entropy loss?__

Because the cross-entropy loss takes the *logarithm of the predicted probability* of the true class. For that logarithm to be well-defined, the network's outputs must form *a valid probability distribution*: every value in (0, 1] and the whole vector summing to 1. The softmax function turns raw logits into exactly such a distribution.
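A minimal NumPy sketch of this pairing (function names here are illustrative, not from any particular library): softmax normalizes the logits into a distribution, and cross entropy then takes the negative log of the probability assigned to the correct class.

```python
import numpy as np

def softmax(logits):
    # Subtract the max logit for numerical stability (avoids overflow in exp)
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

def cross_entropy(probs, target_index):
    # Negative log of the probability the model assigned to the true class
    return -np.log(probs[target_index])

logits = np.array([2.0, 1.0, 0.1])  # example raw network outputs
probs = softmax(logits)

print(probs.sum())                   # sums to 1, as cross entropy requires
print(cross_entropy(probs, 0))       # low loss when the true class gets high probability
```

Note that feeding raw logits straight into `-np.log(...)` would fail: logits can be negative or greater than 1, so the log would be undefined or the "loss" meaningless, which is why the two operations always appear together.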
