Deep Learning: UNIT-II: CNN: 7. CNN vs Fully Connected
7. CNN vs Fully Connected
- The basic difference between the two types of layers is the density of the connections. FC layers are densely connected, meaning that every output neuron is connected to every input neuron. In a Conv layer, on the other hand, the neurons are not densely connected: each output neuron is connected only to the neighbouring inputs that fall within the extent (width) of the convolutional kernel.
- A second main difference between them is weight sharing. In an FC layer, every output neuron is connected to every input neuron through its own weight. In a Conv layer, however, the same kernel weights are shared among all the output neurons. It is this weight sharing that keeps a Conv layer practical even when the number of neurons is large; a minimal sketch contrasting the two follows below.
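To make this concrete, here is a minimal NumPy sketch (the toy 8x8 image, 3x3 kernel, and random values are illustrative assumptions, not from the text). The conv layer owns a single 3x3 kernel that is reused at every spatial position and looks only at a local patch; an FC layer with the same number of outputs needs a separate weight for every input-output pair.

```python
import numpy as np

H, W = 8, 8                       # toy single-channel image (assumption)
image = np.random.rand(H, W)

# --- Conv layer: one shared 3x3 kernel -> 9 weights in total ---
kernel = np.random.rand(3, 3)
conv_out = np.zeros((H - 2, W - 2))
for i in range(H - 2):
    for j in range(W - 2):
        # the SAME kernel at every (i, j); the input patch is local
        conv_out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

# --- FC layer: every output neuron has its own weight per input pixel ---
n_out = (H - 2) * (W - 2)                    # match the conv output size
fc_weights = np.random.rand(n_out, H * W)    # 36 x 64 = 2304 weights
fc_out = fc_weights @ image.reshape(-1)

print("conv weights:", kernel.size)          # 9
print("fc   weights:", fc_weights.size)      # 2304
```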
What are the problems with using an MLP for image data?
a. An MLP reacts differently to an image and its shifted version
b. An MLP does not consider spatial relations
c. An MLP includes too many parameters
a. An MLP reacts differently to an image and its shifted version
- Since an MLP flattens the image, it is not position (translation) invariant.
- When we feed an image to an ANN, we convert it into a single feature vector,
- hence the neighbouring pixels are not taken into account, and
- most importantly, neither are the image channels (R-G-B).
Let's take an example (a small sketch follows it):
- Suppose we have two images of the same dog at two different positions: one in the upper left and one in the middle right.
- Since the MLP flattens the matrix, the neurons that are highly active for the first image will be dormant for the second one, making the MLP think the two images contain completely different objects.
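Below is a hedged NumPy sketch of this argument (the random 5x5 "dog" patch, the 20x20 canvas, and the two positions are assumptions). After flattening, the two images activate disjoint sets of input neurons, while a convolution with one shared kernel responds with the same strength wherever the pattern appears.

```python
import numpy as np

rng = np.random.default_rng(0)
patch = rng.random((5, 5))                       # stand-in for the "dog"

img_a = np.zeros((20, 20)); img_a[1:6, 1:6] = patch        # upper left
img_b = np.zeros((20, 20)); img_b[10:15, 13:18] = patch    # middle right

# MLP view: flatten both images; the non-zero entries land on completely
# different indices, so completely different input neurons are active.
flat_a, flat_b = img_a.ravel(), img_b.ravel()
print("overlapping active inputs:", np.sum((flat_a > 0) & (flat_b > 0)))  # 0

# Conv view: slide one shared kernel over each image; the peak response
# has the same value in both cases, only its location moves.
def correlate(img, k):
    H, W = img.shape; kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

print("peak response A:", correlate(img_a, patch).max())
print("peak response B:", correlate(img_b, patch).max())   # same peak value
```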
b. An MLP does not consider spatial relations
- Spatial information (such as whether a person is standing to the right of the car, or the red car is to the left of the blue bike) is lost when the image is flattened.
- Flattening also loses the internal 2D structure of the image: pixels that are vertical neighbours end up far apart in the flattened vector, as the index sketch below shows.
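A tiny illustration of that last point (the width of 1280 matches the example in the next subsection; the pixel coordinates are arbitrary assumptions):

```python
W = 1280                                  # image width in pixels

def flat_index(row, col, width=W):
    """Index of pixel (row, col) after row-major flattening."""
    return row * width + col

r, c = 100, 500
print(flat_index(r, c + 1) - flat_index(r, c))   # horizontal neighbour: 1 apart
print(flat_index(r + 1, c) - flat_index(r, c))   # vertical neighbour: 1280 apart
```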
c. An MLP includes too many parameters
- Since an MLP is a fully connected model, it needs a weight for every input pixel of the image.
- Now let's take an example with a (single-channel) image of size 1280 x 720.
- For an image of this size, the flattened input vector becomes (921600 x 1). If a dense layer of 128 units is used, the number of weights equals 921600 * 128 ≈ 118 million (worked out in the snippet below).
- This makes MLPs infeasible for large images, and the huge parameter count also makes them prone to overfitting.
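The arithmetic above, reproduced in plain Python; the 1280 x 720 single-channel input and the 128-unit dense layer come from the text, while the 3x3 kernel and 128 output channels used for the conv comparison are illustrative assumptions.

```python
H, W = 720, 1280
n_inputs = H * W                  # 921,600 input pixels
n_hidden = 128

fc_params = n_inputs * n_hidden + n_hidden        # weights + biases
conv_params = 3 * 3 * 1 * 128 + 128               # 3x3 kernel, 1 in-channel, 128 filters

print(f"FC layer   : {fc_params:,} parameters")   # 117,964,928
print(f"Conv layer : {conv_params:,} parameters") # 1,280
```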
Do we even require global connectivity?
- The global connectivity caused by densely connected neurons leads to many redundant parameters, which makes the MLP overfit.
From all of the above discussion, we need:
- to make the system translation (position) invariant
- to leverage the spatial correlation between pixels
- to focus only on local connectivity
What should be the SPECIAL features of a CNN?
From the above discussion, and taking inspiration from our visual cortex, there are 3 essential properties of image data:
1. LOCALITY: correlation between neighbouring pixels in an image
2. STATIONARITY: similar patterns appearing multiple times in an image
3. COMPOSITIONALITY: extracting higher-level features by pooling lower-level features
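A hedged PyTorch sketch of how these three properties map onto a CNN; the layer sizes (16/32 channels, 3x3 kernels, 64x64 input) are illustrative assumptions, not a prescribed architecture.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    # LOCALITY + STATIONARITY: 3x3 kernels look only at local neighbourhoods,
    # and the same kernel weights are reused at every position.
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                # COMPOSITIONALITY: pool low-level features...
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                # ...into higher-level ones
)

x = torch.randn(1, 3, 64, 64)                        # toy RGB image
print(model(x).shape)                                # torch.Size([1, 32, 16, 16])
print(sum(p.numel() for p in model.parameters()))    # 5,088 parameters
```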