Computer vision is a field of AI that focuses on machines interpreting and understanding visual information. This enables computers to gain insights from images, videos, or real-time camera feeds.
This is primarily achieved with convolutional neural networks (CNNs) or Transformers. Notable computer vision models include AlexNet, VGGNet, and Contrastive Language-Image Pretraining (CLIP).
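
To make the convolutional idea concrete, the following is a minimal sketch of the sliding-window operation at the core of a CNN layer, written in plain Python. The function name `conv2d`, the toy image, and the hand-picked edge kernel are all illustrative assumptions; real models such as AlexNet or VGGNet learn many kernels from data and rely on optimized library implementations.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Toy 4x4 grayscale image (assumed data): dark left half, bright right half
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The nonzero middle column of the output marks where the dark and bright halves meet. A CNN stacks many such filters, with learned rather than hand-picked weights, to build up progressively more abstract visual features.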