Convolutional Neural Networks


A long time ago, I was thinking about systems that perceive only video input. But the more I read about image processing, the more I suspect that video alone is not enough.

The system must be able to change its view position according to its internal directive. It also must move in the world and have sensor feedback about its movements. This will add so much more context that no manual object segmentation will be required.

For more information on why more sensor data helps to understand the world better, refer to Embodied Cognition.

However, if you want to focus on a visual processing system, then I would recommend looking at YOLO v4.

It’s written in C. YOLO v4 was released in 2020 and was a notable step forward in real-time object detection.

Do you know what I like about this model? It compiles easily on Windows and is highly optimized for different GPUs. But mostly I like it because it implements many of the biological principles described in Hubel’s book “Eye, Brain, and Vision”.

Main questions

Halftone images

Human eyes do not exactly perceive the world as a matrix of RGB pixels. There are several types of retinal ganglion cells. What if we convert images using a halftone technique (python 1, python 2, c opencv)? Also: Rod and Cone Connections With Bipolar Cells in the Rabbit Retina
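To make the halftone idea concrete, here is a minimal sketch using ordered (Bayer) dithering, which is only one of several halftoning techniques and not necessarily the one used in the linked implementations. The function name `ordered_dither` and the choice of a 4x4 Bayer matrix are my own assumptions for illustration; the input is assumed to be a grayscale array with values in [0, 1].

```python
import numpy as np

# Standard 4x4 Bayer index matrix (values 0..15).
BAYER_4X4 = np.array([[0, 8, 2, 10],
                      [12, 4, 14, 6],
                      [3, 11, 1, 9],
                      [15, 7, 13, 5]])


def ordered_dither(gray):
    """Binarize a grayscale image in [0, 1] using tiled Bayer thresholds.

    Hypothetical helper for illustration: each pixel is compared with a
    position-dependent threshold, so local dot density tracks brightness.
    """
    h, w = gray.shape
    thresholds = (BAYER_4X4 + 0.5) / 16.0
    # Repeat the threshold tile until it covers the image, then crop.
    tiled = np.tile(thresholds, (h // 4 + 1, w // 4 + 1))[:h, :w]
    return (gray > tiled).astype(np.uint8)


# A horizontal gradient from black to white as a toy input.
gradient = np.tile(np.linspace(0.0, 1.0, 64), (16, 1))
halftone = ordered_dither(gradient)
```

The result contains only 0s and 1s, with the density of 1s growing from left to right. Whether such an input actually helps a network is exactly the open question above; the sketch only shows how the conversion could be done.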

How does a convolution kernel get trained?

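A 2D convolution can be written as a matrix-matrix multiplication, so the kernel weights are trained with gradient descent like the weights of any other linear layer. See here with pictures and formulas.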
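The matrix view can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions (single channel, stride 1, no padding, no kernel flip, i.e. the cross-correlation that CNN layers usually compute), not how any particular framework implements it. The helper names `im2col`, `conv2d_matmul`, `conv2d_direct` and `fit_kernel` are hypothetical, as are the learning rate and step count.

```python
import numpy as np


def im2col(image, kh, kw):
    """Stack every kh x kw patch of a 2D image as one row (stride 1, no padding)."""
    h, w = image.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((out_h * out_w, kh * kw), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            cols[i * out_w + j] = image[i:i + kh, j:j + kw].ravel()
    return cols, (out_h, out_w)


def conv2d_matmul(image, kernel):
    """Cross-correlation expressed as a single matrix product."""
    cols, out_shape = im2col(image, *kernel.shape)
    return (cols @ kernel.ravel()).reshape(out_shape)


def conv2d_direct(image, kernel):
    """Reference sliding-window implementation for comparison."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def fit_kernel(image, target, kshape, lr=0.1, steps=5000):
    """Plain gradient descent on mean squared error.

    Because the forward pass is cols @ k, the gradient with respect to the
    kernel is also a matrix product: 2 * cols.T @ error / N.
    """
    cols, _ = im2col(image, *kshape)
    k = np.zeros(kshape[0] * kshape[1])
    t = target.ravel()
    for _ in range(steps):
        err = cols @ k - t
        k -= lr * 2.0 * (cols.T @ err) / len(t)
    return k.reshape(kshape)


rng = np.random.default_rng(0)
image = rng.random((16, 16))
# A Sobel-like kernel plays the role of the "unknown" filter to recover.
true_kernel = np.array([[1.0, 0.0, -1.0],
                        [2.0, 0.0, -2.0],
                        [1.0, 0.0, -1.0]])
target = conv2d_matmul(image, true_kernel)
learned = fit_kernel(image, target, true_kernel.shape)
```

Here `learned` starts at zero and moves toward `true_kernel` using nothing but the output error. In a real network the error comes from the layers above via backpropagation, but the update to the kernel has the same form.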

Papers

A good list compiled in this post on Towards Data Science:

  • Network in Network Link
  • A guide to convolution arithmetic for deep learning Link
  • Deconvolution and Checkerboard Artifacts Link
  • Multi-Scale Context Aggregation by Dilated Convolutions Link
  • ResNeXt: Aggregated Residual Transformations for Deep Neural Networks Link
  • Going deeper with convolutions Link
  • Flattened convolutional neural networks for feedforward acceleration Link
  • Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs Link
  • Xception: Deep Learning with Depthwise Separable Convolutions Link
  • Rethinking the Inception Architecture for Computer Vision Link
  • MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications Link
  • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices Link

Blogs
