Convolutional Neural Network 2
Q1. Sparse Connection
What does sparsity of connections mean as a benefit of using convolutional layers?
Choose the correct answer from below:
A. Each filter is connected to every channel in the previous layer
B. Each layer in a convolutional network is connected only to two other layers
C. Each activation in the next layer depends on only a small number of activations from the previous layer
D. Regularization causes gradient descent to set many of the parameters to zero
Ans: C
Correct answer: Each activation in the next layer depends on only a small number of activations from the previous layer.
Reason:
- In neural network usage, "dense" connections connect every input to every output unit.
- By contrast, a CNN is "sparse" because each unit is connected only to a local "patch" of pixels, instead of to all pixels in the input.
- This sparsity greatly reduces the number of parameters, which makes CNNs more efficient than traditional fully connected networks on images (see the parameter-count sketch below).
- Due to this nature of the CNN, each activation in the next layer depends on only a small number of activations from the previous layer.
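A quick way to see this sparsity is to compare parameter counts. Below is a minimal sketch (assuming TensorFlow/Keras; the 32x32x3 input and the layer sizes are arbitrary illustration choices):

from tensorflow import keras

# Dense layer: each of the 64 outputs connects to all 32*32*3 = 3,072 inputs.
dense = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(64),
])

# Conv2D layer: each output activation sees only a local 3x3 patch.
conv = keras.Sequential([
    keras.layers.Conv2D(64, 3, input_shape=(32, 32, 3)),
])

dense.summary()  # 3,072*64 + 64 = 196,672 parameters
conv.summary()   # 3*3*3*64 + 64 =   1,792 parameters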
Q2. Data size
As you train your model, you realize that you do not have enough data. Which of the following data augmentation techniques can be used to overcome the shortage of data?
Choose the correct answer from below; please note that this question may have multiple correct answers:
A. Adding Noise
B. Rotation
C. Translation
D. Color Augmentation
Ans: A, B, C, D
The correct answers are: Adding Noise, Rotation, Translation, and Color Augmentation.
Reason:
- Image augmentation is the process of creating new training examples from the existing ones.
- Data augmentation techniques that can be used to overcome a shortage of data include adding noise, rotation, translation, and color augmentation.
- Adding noise to the data aims to improve generalization performance.
- Random rotation augmentation randomly rotates the images by anywhere from 0 to 360 degrees.
- Translation moves the image along the X or Y direction (or both).
- Color augmentation alters the intensities of the RGB channels along the natural variations of the images.
- A sketch combining all four techniques is shown below.
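A minimal sketch of all four techniques (assuming TensorFlow; the shift sizes and noise level are arbitrary choices), applied to a float32 image with values in [0, 1]:

import tensorflow as tf

def augment(image):
    # Adding noise: perturb pixel values with small Gaussian noise.
    image = image + tf.random.normal(tf.shape(image), stddev=0.05)
    # Rotation: a random multiple of 90 degrees, counter-clockwise.
    image = tf.image.rot90(image, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    # Translation: shift the image a few pixels along height and width.
    image = tf.roll(image, shift=[4, -4], axis=[0, 1])
    # Color augmentation: jitter the brightness and saturation.
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
    return tf.clip_by_value(image, 0.0, 1.0)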
Q3. Accuracy After DA
Is it possible for the training accuracy to be lower than the testing accuracy after the use of data augmentation?
Choose the correct answer from below:
A. True
B. False
Ans: A
Correct answer: True
Reason:
- The training accuracy can be lowered because augmentation makes it artificially harder for the network to give the right answers; the varied augmented examples make the task tougher while making the model more robust.
- During testing, because of this robustness, the model can achieve a higher accuracy than on the training data.
Q4. fruit augment
We are making a CNN model that classifies 5 different fruits. The distribution of the number of images is as follows:
Banana: 20 images
Apple: 30 images
Mango: 200 images
Watermelon: 400 images
Peaches: 400 images
Which of the given fruits should undergo augmentation in order to avoid class imbalance in the dataset?
Choose the correct answer from below:
A. Banana, Apple
B. Banana, Apple, Mango
C. Watermelon, Peaches
D. All the Fruits
Ans: B
Correct answer: Banana, Apple, Mango
Reason:
- Image augmentation is the process of creating new training examples from the existing ones.
- Imbalanced classification is the problem of classification when there is an unequal distribution of classes in the training dataset.
- In the given question, the number of images is low for Banana, Apple, and Mango compared to Watermelon and Peaches. Hence, we have to apply augmentation to them, as the balancing sketch below illustrates.
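A minimal sketch of the balancing logic (plain Python; the augmentation step itself would use techniques from Q2):

# Number of images per class from the question.
class_counts = {"Banana": 20, "Apple": 30, "Mango": 200,
                "Watermelon": 400, "Peaches": 400}
target = max(class_counts.values())  # match the largest class: 400

for fruit, count in class_counts.items():
    extra = target - count
    if extra > 0:
        # Banana needs 380, Apple 370, Mango 200 augmented images;
        # Watermelon and Peaches need none.
        print(f"{fruit}: generate {extra} augmented images ({count} -> {target})")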
Q5. CNN Select Again
Which among the following is false?
Choose the correct answer from below:
A. Dilated convolution increases the receptive field size when compared to the standard convolution operator
B. Dropout is a regularization technique
C. Batch normalization ensures that the weight of each of the hidden layers of a deep network is normalized
D. Convolutional neural networks are translation invariant
Ans: C
Correct answer: Batch normalization ensures that the weight of each of the hidden layers of a deep network is normalized
Reason:
- Compared to the standard convolution operator, dilated convolution can capture a larger context by expanding the receptive field of the convolution kernel without increasing the parameter count of the model.
- Dropout regularization is a technique to prevent neural networks from overfitting.
- Batch normalization is a technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch; it normalizes activations, NOT the weights of each layer.
- CNNs are translation invariant: the position of an object in the image does not need to be fixed for the CNN to detect it.
- A small model using both dropout and batch normalization is sketched below.
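A minimal sketch (assuming TensorFlow/Keras; the architecture is an arbitrary illustration) showing where dropout and batch normalization typically sit in a small CNN:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.BatchNormalization(),  # standardizes the layer's inputs per mini-batch
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),          # regularization: randomly drops units during training
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.summary()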
Q6. Reducing Parameters two methods
Which of the following are methods for tackling overfitting?
Choose the correct answer from below; please note that this question may have multiple correct answers:
A. Improving network configuration to increase parameters
B. Augmenting the dataset to decrease the number of samples
C. Augmenting the dataset to increase the number of samples
D. Improving network configuration to optimise parameters
Ans: C, D
Correct Answers:
- Augmenting the dataset to increase the number of samples
- Improving network configuration to optimise parameters
Explanation:
- Over-parameterization of the network makes it prone to overfitting. One way to tackle this would be to remove some layers from the network.
- Augmenting the dataset leads to a greater diversity of data samples being seen by the network, hence decreasing the likelihood of overfitting the model on the training dataset.
Q7. Underfitting vs Overfitting
The chart below shows the training accuracy vs. validation accuracy for a CNN model on a 5-class classification task.
What is the problem with the model, if any, and how can it be solved?
Choose the correct answer from below:
A. Overfitting, adding more Conv2D layers
B. Underfitting, more epochs
C. Overfitting, regularization
D. No problem
Ans: B
Correct Answer: Underfitting, more epochs
Explanation:
- Looking at the plot, the model was still improving when we stopped training, which points to underfitting.
- An underfit model has not fully learned the patterns in the dataset; in such cases, we see a low score on both the training set and the test/validation set.
- If we add more epochs, the model will keep learning and could resolve the underfitting problem.
Q8. Data augmentation effectiveness
Suppose you wish to train a neural network to locate lions anywhere in images, and you use a training dataset that has images similar to the ones shown above. In this case, if we apply the data augmentation techniques, it will be ______ as there is _______ in the training data.
Choose the correct answer from below:
A. effective, position bias
B. ineffective, angle bias
C. ineffective, position bias
D. effective, size bias
Ans: A
The correct answer is: effective, position bias.
Reason:
- In this dataset, we can see position bias in the images, as the lions are positioned at the center of every image.
- Hence, every image is similar, and in this case, applying data augmentation techniques like width shift and height shift may improve the performance of the network; a sketch follows below.
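A minimal sketch (assuming Keras' ImageDataGenerator; the 20% shift range is an arbitrary choice) of the width and height shifts mentioned above:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    width_shift_range=0.2,   # random horizontal shift counters the position bias
    height_shift_range=0.2,  # random vertical shift
    fill_mode="nearest",     # fill in pixels exposed by the shift
)
# datagen.flow(X_train, y_train, batch_size=32) would then yield
# shifted copies of the lion images during training.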
Q9. EarlyStopping
Which of the following statements best describes early stopping?
Choose the correct answer from below:
A. Train the network until a local minimum in the error function is reached
B. Simulate the network on a validation dataset after every epoch of training; stop the training when the generalization error starts to increase
C. Add a momentum term to the weight update in the Generalized Delta Rule
D. A faster version of backpropagation
Ans: B
Correct Answer: Simulate the network on a validation dataset after every epoch of training; stop the training when the generalization error starts to increase.
Explanation:
- During training, the model is evaluated on a holdout validation dataset after each epoch.
- If the performance of the model on the validation dataset starts to degrade (for example, loss begins to increase or accuracy begins to decrease), the training process is stopped.
Q10. EarlyStopping code
Fill in the code for setting up early stopping in TensorFlow to monitor the validation accuracy (val_accuracy) and stop the training when there is no improvement after 2 epochs:
custom_early_stopping = EarlyStopping(
___________,
____________
)
Choose the correct answer from below:
A. monitoring='val_accuracy', min_delta=2
B. mode='val_accuracy', min_delta=2
C. monitor='val_accuracy', patience=2
D. monitoring='val_accuracy', patience=2
Ans: C
Correct Answer: monitor='val_accuracy', patience=2
Explanation:
- monitor='val_accuracy' tells the callback to track the validation accuracy after every epoch.
- patience=2 means the training is terminated as soon as 2 epochs pass with no improvement in the validation accuracy.
- min_delta refers to the minimum change in the validation accuracy that qualifies as an improvement, i.e. an absolute change of less than min_delta counts as no improvement.
- A full usage sketch follows below.
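A minimal usage sketch (the model and data names are placeholders; only the EarlyStopping configuration comes from the question):

from tensorflow.keras.callbacks import EarlyStopping

custom_early_stopping = EarlyStopping(
    monitor='val_accuracy',     # watch validation accuracy after every epoch
    patience=2,                 # stop after 2 epochs with no improvement
    restore_best_weights=True,  # optional: roll back to the best-performing epoch
)

# model.fit(X_train, y_train,
#           validation_data=(X_val, y_val),
#           epochs=50,
#           callbacks=[custom_early_stopping])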
Q11. tf dot image
How will you apply data augmentation to rotate an image 270° counter-clockwise using tf.image?
Choose the correct answer from below:
A. tf.image.rot(image)
B. tf.image.rot270(image)
C. tf.image.rot90(image, k=3)
D. tf.image.rot(image, k=3)
Ans: C
Correct Answer: tf.image.rot90(image, k=3)
Explanation:
- For rotating an image or a batch of images counter-clockwise by multiples of 90 degrees, use tf.image.rot90(image, k=3).
- k denotes the number of 90-degree rotations to apply, so k=3 gives 270 degrees.
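A minimal sketch (the random image is a placeholder): k=3 applies three 90-degree counter-clockwise turns, i.e. a 270-degree counter-clockwise rotation:

import tensorflow as tf

image = tf.random.uniform((64, 64, 3))  # placeholder image
rotated = tf.image.rot90(image, k=3)    # 3 * 90 = 270 degrees counter-clockwise
print(rotated.shape)                    # (64, 64, 3)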
Confusion Matrix
- A much better way to evaluate the performance of a classifier is to look at the confusion matrix.
- The general idea is to count the number of times instances of class A are classified as class B.
- For example, to know the number of times the classifier confused images of 5s with 3s, you would look in the fifth row and third column of the confusion matrix.
- To compute the confusion matrix, you first need to have a set of predictions so that they can be compared to the actual targets.
- You could make predictions on the test set, but remember that you want to use the test set only at the very end of your project, once you have a classifier that you are ready to launch.
- Instead, you can use the cross_val_predict() function:
from sklearn.model_selection import cross_val_predict
# sgd_clf is a fitted SGDClassifier; X_train and y_train_5 are the MNIST
# training images and the "is this digit a 5?" boolean labels.
y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)
- Just like the cross_val_score() function, cross_val_predict() performs K-fold cross-validation, but instead of returning the evaluation scores, it returns the predictions made on each test fold.
- This means that you get a clean prediction for each instance in the training set, "clean" meaning that the prediction is made by a model that never saw the data during training.
- Now you are ready to get the confusion matrix using the confusion_matrix() function.
- Just pass it the target classes (y_train_5) and the predicted classes (y_train_pred):
from sklearn.metrics import confusion_matrix
confusion_matrix(y_train_5, y_train_pred)
array([[53057,  1522],
       [ 1325,  4096]])
- Each row in a confusion matrix represents an actual class, while each column represents a predicted class.
- The first row of this matrix considers non-5 images (the negative class):
  - 53,057 of them were correctly classified as non-5s (they are called true negatives),
  - while the remaining 1,522 were wrongly classified as 5s (false positives).
- The second row considers the images of 5s (the positive class):
  - 1,325 were wrongly classified as non-5s (false negatives),
  - while the remaining 4,096 were correctly classified as 5s (true positives).
- A perfect classifier would have only true positives and true negatives, so its confusion matrix would have nonzero values only on its main diagonal (top left to bottom right):
y_train_perfect_predictions = y_train_5  # pretend we reached perfection
confusion_matrix(y_train_5, y_train_perfect_predictions)
array([[54579,     0],
       [    0,  5421]])
- The confusion matrix gives you a lot of information, but sometimes you may prefer a more concise metric.
- The accuracy of the positive predictions is called the precision of the classifier:
Precision = TP / (TP + FP)
  - TP is the number of true positives.
  - FP is the number of false positives.
- A trivial way to have perfect precision is to make one single positive prediction and ensure it is correct (precision = 1/1 = 100%).
- But this would not be very useful, since the classifier would ignore all but one positive instance.
- So precision is typically used along with another metric named recall, also called sensitivity or the true positive rate (TPR): this is the ratio of positive instances that are correctly detected by the classifier.
Recall = TP / (TP + FN)
  - FN is the number of false negatives.
- The confusion matrix is illustrated in Figure 2.
Figure 2. An illustrated confusion matrix shows examples of true negatives (top left), false positives (top right), false negatives (lower left), and true positives (lower right).
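Given the predictions from cross_val_predict() above, both metrics follow directly from scikit-learn; the approximate values in the comments are computed from the confusion matrix shown earlier:

from sklearn.metrics import precision_score, recall_score

precision_score(y_train_5, y_train_pred)  # 4096 / (4096 + 1522) ≈ 0.729
recall_score(y_train_5, y_train_pred)     # 4096 / (4096 + 1325) ≈ 0.756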
Performance Measures
- Evaluating a classifier is often significantly trickier than evaluating a regressor.
- There are many performance measures available; several of them are computed in the sketch after this list:
i. Confusion Matrix
ii. True Positive Rate
iii. True Negative Rate
iv. False Positive Rate
v. False Negative Rate
vi. Precision
vii. Recall
viii. Accuracy
ix. F1-Score
x. Specificity
xi. Receiver Operating Characteristic (ROC)
xii. Area Under Curve (AUC)
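A minimal sketch computing several of the measures above with scikit-learn, reusing the y_train_5 and y_train_pred arrays from the confusion-matrix example (ROC and AUC need decision scores rather than hard labels):

from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, roc_curve
from sklearn.model_selection import cross_val_predict

accuracy_score(y_train_5, y_train_pred)  # overall fraction of correct predictions
f1_score(y_train_5, y_train_pred)        # harmonic mean of precision and recall

# For ROC/AUC, get scores instead of class labels:
y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3,
                             method="decision_function")
fpr, tpr, thresholds = roc_curve(y_train_5, y_scores)  # ROC curve points
roc_auc_score(y_train_5, y_scores)                     # area under the ROC curve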
YouTube Link: https://youtu.be/jL39fMC_I28