Computer Vision
DenseNet: How Connections Revolutionized Deep Learning
·4380 words·21 mins
This series explores DenseNet’s dense connectivity pattern, which mitigated vanishing gradients and improved feature reuse, examines its mathematical foundations and practical implementation, and discusses how its limitations helped pave the way for Vision Transformers. We trace the evolution from convolutional networks to hybrid architectures, showing how each innovation built on previous breakthroughs while addressing their shortcomings.
ResNet Overview and Implementation
·2612 words·13 mins
This post covers the ResNet model and the seminal paper behind it, Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, which won the Best Paper award at CVPR 2016. It is one of the most influential and fundamental papers in the history of deep learning for computer vision.
VGGNet Overview
·1820 words·9 mins
VGGNet is a famous deep learning model used in computer vision—essentially, teaching computers to understand images. It was created by researchers at the Visual Geometry Group (VGG) at the University of Oxford. Since its debut in 2014, VGGNet has become one of the key models that helped advance how machines see and recognize objects in photos. At its core, VGGNet is designed to look at images and decide what is in them.
Gradient-Based Learning Applied to Document Recognition
·860 words·5 mins
LeNet-5 is an early and very influential type of convolutional neural network (CNN) developed by Yann LeCun and his colleagues in 1998, designed mainly to recognize handwritten digits like those in the MNIST dataset. What makes LeNet-5 special is how it combines several clever ideas that allow it to efficiently and accurately understand images despite their complexity—ideas that were crucial stepping stones for today’s deep learning revolution.
Salient Object Detection
·1168 words·6 mins
Salient object detection (SOD) is a crucial task in computer vision that focuses on identifying and segmenting the most visually distinctive objects or regions within an image. The primary aim of SOD is to mimic human visual attention, allowing algorithms to highlight areas that are likely to attract a viewer’s focus.
DeepFake Detection Methods
·1162 words·6 mins
In this blog post, we explore image generators and the techniques used to detect their output. I’ll discuss several methods for detecting generated and manipulated images. These include analyzing the visual content of an image, examining its metadata, and using machine learning algorithms to identify patterns in the data.
ImageNet: Computer Vision Backbone
·1065 words·5 mins
ImageNet is more than just a dataset. The sheer scale of ImageNet, combined with its detailed labeling, made it essentially the backbone of Computer Vision.