

Cross-entropy is commonly used in machine learning as a loss function.

Cross-entropy is a measure from the field of information theory, building upon entropy and generally calculating the difference between two probability distributions. It is closely related to, but different from, KL divergence: KL divergence calculates the relative entropy between two probability distributions, whereas cross-entropy can be thought of as calculating the total entropy between the distributions.

Cross-entropy is also related to, and often confused with, logistic loss, called log loss. Although the two measures are derived from different sources, when used as loss functions for classification models, both measures calculate the same quantity and can be used interchangeably.
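To make these relationships concrete, here is a minimal sketch in Python (not taken from the tutorial itself; the distributions p and q are made-up values) that computes entropy, cross-entropy, and KL divergence for two small discrete distributions and checks that the cross-entropy H(P, Q) equals the entropy H(P) plus the KL divergence KL(P || Q), and that it reduces to the entropy when the two distributions are identical.

```python
# Minimal sketch: entropy, cross-entropy, and KL divergence for discrete distributions.
# The distributions p and q below are made-up values chosen only for illustration.
from math import log2

def entropy(p):
    # H(P) = -sum_x P(x) * log2(P(x))
    return -sum(px * log2(px) for px in p)

def cross_entropy(p, q):
    # H(P, Q) = -sum_x P(x) * log2(Q(x))
    return -sum(px * log2(qx) for px, qx in zip(p, q))

def kl_divergence(p, q):
    # KL(P || Q) = sum_x P(x) * log2(P(x) / Q(x))
    return sum(px * log2(px / qx) for px, qx in zip(p, q))

# two discrete distributions over the same three events
p = [0.10, 0.40, 0.50]
q = [0.80, 0.15, 0.05]

print('H(P)            = %.3f bits' % entropy(p))
print('H(P, Q)         = %.3f bits' % cross_entropy(p, q))
print('KL(P || Q)      = %.3f bits' % kl_divergence(p, q))

# cross-entropy is the entropy of P plus the extra bits incurred by using Q instead of P
print('H(P) + KL(P||Q) = %.3f bits' % (entropy(p) + kl_divergence(p, q)))

# when the distributions are identical, cross-entropy equals entropy and KL divergence is zero
print('H(P, P)         = %.3f bits' % cross_entropy(p, p))
```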
In short, cross-entropy can be used as a loss function when optimizing classification models like logistic regression and artificial neural networks. It is different from KL divergence but can be calculated using KL divergence, and it is different from log loss but calculates the same quantity when used as a loss function.
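To illustrate the last point, here is a minimal sketch (again not from the tutorial; the one-hot labels and predicted probabilities are made up) showing that the average cross-entropy between the true class distributions and the predicted class distributions is the quantity reported as log loss, here computed with the natural log (nats), which is the convention most libraries use.

```python
# Minimal sketch: for classification, average cross-entropy equals log loss.
# The labels and predicted probabilities below are made-up values for illustration.
from math import log

def cross_entropy(p, q):
    # H(P, Q) = -sum_x P(x) * log(Q(x)), using the natural log (nats)
    return -sum(px * log(qx) for px, qx in zip(p, q))

# one-hot encoded true class labels and predicted class probabilities for three examples
y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

# the mean per-example cross-entropy is the log loss (negative log-likelihood)
avg_ce = sum(cross_entropy(t, p) for t, p in zip(y_true, y_pred)) / len(y_true)
print('average cross-entropy (log loss) = %.3f nats' % avg_ce)

# the same value should be returned by a library routine such as
# sklearn.metrics.log_loss([0, 1, 0], y_pred)
```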
Kick-start your project with my new book Probability for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.

Update Oct/2019: Gave an example of cross-entropy for identical distributions and updated the description for this case (thanks Ron U). Added an example of calculating the entropy of the known class labels.
Update Nov/2019: Improved structure and added more explanation of entropy. Added intuition for predicted class probabilities.
Update Dec/2020: Tweaked the introduction to information and entropy to be clearer.

You might recall that information quantifies the number of bits required to encode and transmit an event.
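As a quick refresher on that idea, here is a minimal sketch (the probabilities are made-up values) computing the information of an event as h(x) = -log2(P(x)), which shows that rarer events require more bits.

```python
# Minimal sketch: information, in bits, carried by an event with probability P(x),
# using h(x) = -log2(P(x)); the probabilities below are made-up values for illustration.
from math import log2

def information(prob):
    # rarer events (smaller probability) carry more bits of information
    return -log2(prob)

for prob in [0.5, 0.1, 0.01]:
    print('P(x) = %.2f -> h(x) = %.3f bits' % (prob, information(prob)))
```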
