Convolutional Neural Networks

Published: May 13, 2023

Table of contents

Main questions
- Halftone images
- How does a convolution kernel get trained?
Papers
- Blogs

A long time ago, I was thinking about systems that perceive only video input. But the more I read about image processing, the more I get the sense that video alone is not enough.

The system must be able to change its view position according to its internal directive. It also must move in the world and have sensor feedback about its movements. This will add so much more context that no manual object segmentation will be required.

For more information on why more sensor data helps in understanding the world, refer to Embodied Cognition.

However, if you want to focus on a visual processing system, then I would recommend looking at YOLO v4.

It’s written in C. This model made a breakthrough in real-time object detection in 2020.

Do you know what I like about this model? It compiles easily on Windows and is highly optimized for different GPUs. But above all I like it because it implements many biological principles that one can find in Hubel’s book “Eye, Brain, and Vision”.

Main questions

Halftone images

Human eyes do not perceive the world exactly as a matrix of RGB pixels. There are several types of retinal ganglion cells. What if we convert images using a halftone technique (python 1, python 2, c opencv)? See also: Rod and Cone Connections With Bipolar Cells in the Rabbit Retina.
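To make the halftone idea concrete, here is a minimal sketch of ordered dithering, one of several digital halftone techniques. It is not taken from the linked examples. The function name `halftone`, the 4x4 Bayer matrix, and the gradient test image are my own assumptions for illustration, and it is written in plain Python so it runs without OpenCV or NumPy.

```python
# Ordered dithering: each pixel is compared against a position-dependent
# threshold, so gray levels become patterns of black and white dots.

# Standard 4x4 Bayer threshold matrix (values 0..15).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def halftone(gray):
    """Convert a grayscale image (rows of ints in 0..255) to 0/1 pixels."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            # Scale the Bayer entry to a threshold in the 0..255 range.
            threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) * 256 / 16
            out_row.append(1 if value > threshold else 0)
        out.append(out_row)
    return out


# Hypothetical test image: a horizontal gradient, 8 rows by 16 columns.
gradient = [[x * 255 // 15 for x in range(16)] for _ in range(8)]
for row in halftone(gradient):
    print("".join("#" if p else "." for p in row))
```

The printed pattern gets denser from left to right: brightness is encoded by dot density rather than by per-pixel intensity, and a representation like this could be fed to a network in place of raw RGB values.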

How does a convolution kernel get trained?

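2D convolution can be written as a matrix multiplication: unroll the image patches into the rows of one matrix and the kernel into a vector (or several kernels into the columns of a second matrix). The kernel weights are then trained by backpropagation, just like the weights of a fully connected layer. See here with pictures and formulas.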
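The matrix view of convolution can be sketched in a few lines. This is an illustration under stated assumptions, not the method from the linked page: stride 1, no padding, a single channel, and one kernel, so the right-hand operand is a vector (stacking several kernels as columns would turn it into a matrix-matrix product). The helper names `im2col`, `conv2d`, and `kernel_gradient`, the toy image, and the learning rate are all hypothetical. Plain Python lists are used so the sketch has no dependencies; in practice one would use NumPy or a deep learning framework.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch into one row (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    return [
        [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
        for i in range(h - kh + 1)
        for j in range(w - kw + 1)
    ]


def conv2d(image, kernel):
    """CNN-style convolution (cross-correlation) as patches @ flat kernel.

    Returns the outputs as a flat list, one value per patch.
    """
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, len(kernel), len(kernel[0]))
    return [sum(p * k for p, k in zip(patch, flat_kernel)) for patch in patches]


def kernel_gradient(image, kh, kw, grad_out):
    """Gradient of the loss w.r.t. the kernel: patches^T @ grad_out."""
    patches = im2col(image, kh, kw)
    flat = [
        sum(patch[n] * g for patch, g in zip(patches, grad_out))
        for n in range(kh * kw)
    ]
    return [flat[r * kw:(r + 1) * kw] for r in range(kh)]


# Toy data: outputs of a known kernel serve as the training target.
image = [
    [1.0, 2.0, 0.0, 1.0],
    [0.0, 1.0, 3.0, 2.0],
    [2.0, 0.0, 1.0, 1.0],
    [1.0, 3.0, 2.0, 0.0],
]
target_kernel = [[1.0, 0.0], [0.0, -1.0]]
target = conv2d(image, target_kernel)

# Gradient descent on the mean squared error, starting from a zero kernel.
kernel = [[0.0, 0.0], [0.0, 0.0]]
learning_rate = 0.05
for _ in range(1000):
    out = conv2d(image, kernel)
    # dLoss/dOut for the loss mean((out - target)^2).
    grad_out = [2 * (o - t) / len(out) for o, t in zip(out, target)]
    grad_k = kernel_gradient(image, 2, 2, grad_out)
    kernel = [
        [k - learning_rate * g for k, g in zip(k_row, g_row)]
        for k_row, g_row in zip(kernel, grad_k)
    ]

print(kernel)
```

Because the forward pass is a matrix product, the gradient with respect to the kernel is the transposed patch matrix times the output gradient, the same rule as for a dense layer. On this toy problem the learned kernel approaches `target_kernel`.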

Papers

A good list is compiled in this post on Towards Data Science:

Network in Network Link
A guide to convolution arithmetic for deep learning Link
Deconvolution and Checkerboard Artifacts Link
Multi-Scale Context Aggregation by Dilated Convolutions Link
ResNeXt: Aggregated Residual Transformations for Deep Neural Networks Link
Going deeper with convolutions Link
Flattened convolutional neural networks for feedforward acceleration Link
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs Link
Xception: Deep Learning with Depthwise Separable Convolutions Link
Rethinking the Inception Architecture for Computer Vision Link
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications Link
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices Link