
Quick question- Who’s the father of ‘deep learning’?

Geoffrey Hinton. That’s right!

Deep neural networks, of which Convolutional Neural Networks (ConvNets) are the best-known family, loosely mimic how the human brain processes information. Hinton co-developed the wake-sleep algorithm for training such networks with Peter Dayan, Brendan Frey, and Radford Neal in 1995. More than two decades later, after several setbacks and breakthroughs, Hinton came up with the Capsule Network theory in October 2017.

It’s no news that the 2018 technology trends revolve around IoT, AI, cloud SaaS, and IoE. The tech is getting smarter every day, learning to behave as a human would, albeit at a snail’s pace.

Then, the question arises, how can Capsule Networks change things as we know them?

Traditional Neural Networks Are Flawed

Assume that you feed a ConvNet, or CNN, images of ships so it learns what a ship looks like. True to its granular style of understanding an image, the CNN picks the image apart layer by layer.

For instance, the first layer would recognise edges and curves. The second layer would focus on straight lines and smaller shapes. As the layers build up, they begin recognising more complex structures, like the hull of a ship, and eventually the entire ship at once.

Between these layers sits pooling, a step that shrinks each layer’s output so the network stays computationally manageable and tolerant of small shifts. But, and here’s the flaw we talked about, pooling throws away precise positional information.
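A minimal sketch of that flaw, in plain Python (a hypothetical 2×2 max-pooling step over a 4×4 activation grid, not any real library's implementation): two inputs with the same feature in different positions pool to the identical output.

```python
def max_pool_2x2(grid):
    # Slide a non-overlapping 2x2 window over the grid and keep only the max
    # in each window, discarding WHERE inside the window the activation was.
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]

# The same feature (activation 9) at two different spots inside one window:
feature_top_right = [[0, 9, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0],
                     [0, 0, 0, 0]]
feature_top_left = [[9, 0, 0, 0],
                    [0, 0, 0, 0],
                    [0, 0, 0, 0],
                    [0, 0, 0, 0]]

print(max_pool_2x2(feature_top_right) == max_pool_2x2(feature_top_left))  # True
```

After pooling, both grids become `[[9, 0], [0, 0]]`: the network knows the feature fired, but not exactly where.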

So, a CNN will recognise a ship on the water. Turn the image upside down, swap the positions of the bow and stern, or shrink its size, and the CNN may still declare this distorted image a proper match for the original in its training data set.

The point- ConvNets are a neat piece of tech. But they don’t care much about small changes in viewpoint.

That’s Where Hinton Enters the Scene with the Capsule Theory

Hinton suggests that human brains have tiny modules, known as ‘capsules’, which handle all sorts of visual stimuli. They encode and understand the position, orientation, and size of the objects in front of us. They also register changes in velocity, deformation, hue, texture, and the light reflected from the scene.

Hinton presents a solution based on this idea. In a nutshell, he has proposed an architecture that resembles the human visual system.

This new deep network will be able to understand hierarchical relationships. It’ll learn to perceive an object as a nested hierarchy of parts, each with its own pose, thus differentiating between images with even a slight change in viewpoint.
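The key mechanical difference is that a capsule outputs a vector rather than a single number: the vector’s length signals how likely the object is present, and its direction encodes the object’s pose. A sketch of the "squash" nonlinearity from Hinton’s 2017 capsule paper, in plain Python (the two test vectors are illustrative values, not from the paper):

```python
import math

def squash(s):
    # Capsule "squash" nonlinearity: scales the vector's length into [0, 1)
    # so it can act like a probability, while preserving its direction
    # (the pose the capsule encodes): v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    if norm == 0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq)
    return [scale * x / norm for x in s]

weak = squash([0.1, 0.0])    # weak evidence  -> length stays near 0
strong = squash([10.0, 0.0]) # strong evidence -> length approaches 1
```

A long input vector keeps nearly all its length (here `100/101 ≈ 0.99`), a short one is crushed toward zero, and in both cases the direction, i.e. the pose, is untouched.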

With the right implementation of Capsule Theory, a network will recognise a ship. It’ll also notice if the ship is upside down, a different size, or has its hull where its bow should be.
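To make the ship example concrete, here is a toy comparison in plain Python (the `presence`/`pose_angle` fields and the tolerance are hypothetical illustrations, not the paper’s actual representation): a scalar activation, which is all pooling leaves a CNN with, cannot tell an upright ship from a flipped one, while a pose-carrying capsule output can.

```python
# A scalar activation only says "ship-feature present"; a capsule's vector
# output also carries pose, modelled here as a single toy orientation angle.
upright_ship = {"presence": 0.95, "pose_angle": 0.0}    # degrees
flipped_ship = {"presence": 0.95, "pose_angle": 180.0}

def scalar_match(a, b):
    # CNN-style comparison: identical activation strength -> "same object".
    return a["presence"] == b["presence"]

def capsule_match(a, b, angle_tolerance=15.0):
    # Capsule-style comparison: presence AND pose must both agree.
    return (a["presence"] == b["presence"]
            and abs(a["pose_angle"] - b["pose_angle"]) <= angle_tolerance)

print(scalar_match(upright_ship, flipped_ship))   # True  (fooled by the flip)
print(capsule_match(upright_ship, flipped_ship))  # False (flip detected)
```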

Future Prospects- Pretty Decent

Hinton’s Capsule Networks have been shown to deliver 99.75% accuracy on the standard MNIST dataset. They also cut the error rate by 45% on the smallNORB dataset compared with the previous state of the art.

However, Capsule Networks are still in their infancy. Their future could go in any direction from here. What we’re yet to discover is how they will perform on real-time and more complex data.


Written By Vijith Sivadasan

An enterprising visionary and a serial entrepreneur, Vijith is driven by instinct in his pursuit of creative excellence. Passionate about transformational marketing strategies, he stresses the critical need for analytic skills to maximize business potential. To learn more about how he can add value to your business, drop him a line at vijith@codelattice.com