Contrastive loss is another loss function, used specifically when we want to learn a feature space where similar data points are closer together and dissimilar data points are further apart. It's often used for metric learning tasks such as face verification, image retrieval, and self-supervised learning.
It involves:
- Input Pairs: Two inputs (images, text, or embeddings) and a label indicating whether they are similar (a positive pair) or dissimilar (a negative pair).
- Distance Metric: A measure of how far apart the two feature representations are, typically the Euclidean distance or cosine similarity (see the sketch after this list).
- Label: A binary label $y$, where 0 represents a dissimilar pair and 1 represents a similar pair.
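As a quick sketch of these components (assuming PyTorch, which the original does not specify), both distance metrics can be computed over a batch of paired embeddings. The tensor shapes and the names `emb_a` and `emb_b` are illustrative:

```python
import torch
import torch.nn.functional as F

# Hypothetical batch of paired embeddings: 4 pairs, 128-dimensional
emb_a = torch.randn(4, 128)
emb_b = torch.randn(4, 128)

# Euclidean (L2) distance between corresponding rows -> shape (4,)
euclidean_dist = F.pairwise_distance(emb_a, emb_b)

# Cosine similarity between corresponding rows -> shape (4,)
cosine_sim = F.cosine_similarity(emb_a, emb_b)

# Binary labels: 1 = similar (positive) pair, 0 = dissimilar (negative) pair
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
```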
The formula for contrastive loss is:

$$L = \frac{1}{2N} \sum_{i=1}^{N} \left[ y_i \, d_i^2 + (1 - y_i) \, \max(m - d_i,\ 0)^2 \right]$$

where:
- $d_i$ is the distance between the feature representations of the $i$-th pair
- $m$ is a margin that defines the minimum distance for dissimilar pairs
- $y_i$ is the binary label indicating similarity
- $N$ is the total number of pairs
For similar pairs ($y_i = 1$), the goal is to minimize the distance $d_i$ so that the features of similar points are closer. For dissimilar pairs ($y_i = 0$), the goal is to ensure that the distance is at least the margin $m$. If $d_i$ is already greater than $m$, the pair contributes no loss.
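As an illustration, here is a minimal sketch of this loss in PyTorch (an assumption; the original names no framework). The function name `contrastive_loss` and the default margin of 1.0 are illustrative choices:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(emb_a, emb_b, labels, margin=1.0):
    """Contrastive loss over a batch of embedding pairs.

    labels: 1.0 for similar pairs, 0.0 for dissimilar pairs.
    """
    d = F.pairwise_distance(emb_a, emb_b)  # Euclidean distance d_i per pair
    pos = labels * d.pow(2)                # similar pairs: penalize d^2
    # Dissimilar pairs: penalize only when d is inside the margin
    neg = (1 - labels) * torch.clamp(margin - d, min=0).pow(2)
    return (pos + neg).mean() / 2          # 1/(2N) * sum over pairs

# Example usage with random embeddings
emb_a = torch.randn(4, 128)
emb_b = torch.randn(4, 128)
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
loss = contrastive_loss(emb_a, emb_b, labels)
```

Here `torch.clamp(margin - d, min=0)` implements the $\max(m - d_i, 0)$ term, so pairs already separated by more than the margin contribute zero loss and zero gradient.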
Contrastive loss essentially helps enforce a structured embedding space by pulling similar points closer together and pushing dissimilar points further apart. The margin $m$ helps ensure that the model doesn't waste effort on pairs that are already sufficiently far apart.