Convolutional Neural Network 2
Q1. Sparse Connection
What does sparsity of connections mean as a benefit of using convolutional layers?
Choose the correct answer from below:
A. Each filter is connected to every channel in the previous layer
B. Each layer in a convolutional network is connected only to two other layers
C. Each activation in the next layer depends on only a small number of activations from the previous layer
D. Regularization causes gradient descent to set many of the parameters to zero
Ans: C
Correct answer: Each activation in the next layer depends on only a small number of activations from the previous layer.
Reason:
- In neural network usage, "dense" connections connect every input to every output unit.
- By contrast, a CNN is "sparse" because each unit is connected only to a local "patch" of pixels, instead of to all pixels in the input.
- This sparsity greatly reduces the number of parameters, which makes CNNs more efficient than traditional fully connected networks on images (see the parameter-count sketch below).
- Due to this nature of the CNN, each activation in the next layer depends on only a small number of activations from the previous layer.
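A quick way to see this sparsity is to compare parameter counts. Below is a minimal sketch (assuming TensorFlow/Keras; the 32x32x3 input and the layer sizes are arbitrary illustration choices):

from tensorflow import keras

# Dense layer: each of the 64 outputs connects to all 32*32*3 = 3,072 inputs.
dense = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(64),
])

# Conv2D layer: each output activation sees only a local 3x3 patch.
conv = keras.Sequential([
    keras.layers.Conv2D(64, 3, input_shape=(32, 32, 3)),
])

dense.summary()  # 3,072*64 + 64 = 196,672 parameters
conv.summary()   # 3*3*3*64 + 64 =   1,792 parameters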
Q2. Data size
As you train your model, you realize that you do not have enough data. Which of the following data augmentation techniques can be used to overcome the shortage of data?
Choose the correct answer from below; please note that this question may have multiple correct answers:
A. Adding Noise
B. Rotation
C. Translation
D. Color Augmentation
Ans: A, B, C, D
The correct answers are: Adding Noise, Rotation, Translation, and Color Augmentation.
Reason:
- Image augmentation is the process of creating new training examples from the existing ones.
- Data augmentation techniques that can be used to overcome a shortage of data include adding noise, rotation, translation, and color augmentation.
- Adding noise to the data aims to improve generalization performance.
- Random rotation augmentation randomly rotates the images by anywhere from 0 to 360 degrees.
- Translation moves the image along the X or Y direction (or both).
- Color augmentation alters the intensities of the RGB channels along the natural variations of the images.
- A sketch combining all four techniques is shown below.
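A minimal sketch of all four techniques (assuming TensorFlow; the shift sizes and noise level are arbitrary choices), applied to a float32 image with values in [0, 1]:

import tensorflow as tf

def augment(image):
    # Adding noise: perturb pixel values with small Gaussian noise.
    image = image + tf.random.normal(tf.shape(image), stddev=0.05)
    # Rotation: a random multiple of 90 degrees, counter-clockwise.
    image = tf.image.rot90(image, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    # Translation: shift the image a few pixels along height and width.
    image = tf.roll(image, shift=[4, -4], axis=[0, 1])
    # Color augmentation: jitter the brightness and saturation.
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
    return tf.clip_by_value(image, 0.0, 1.0)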
Q3. Accuracy After DA
Is it possible for the training accuracy to be lower than the testing accuracy after the use of data augmentation?
Choose the correct answer from below:
A. True
B. False
Ans: A
Correct answer: True
Reason:
- The training accuracy can be lowered because augmentation makes it artificially harder for the network to give the right answers; the varied augmented examples make the task tougher while making the model more robust.
- During testing, because of this robustness, the model can achieve a higher accuracy than on the training data.
Q4. fruit augment
We are making a CNN model that classifies 5 different fruits. The distribution of the number of images is as follows:
Banana: 20 images
Apple: 30 images
Mango: 200 images
Watermelon: 400 images
Peaches: 400 images
Which of the given fruits should undergo augmentation in order to avoid class imbalance in the dataset?
Choose the correct answer from below:
A. Banana, Apple
B. Banana, Apple, Mango
C. Watermelon, Peaches
D. All the Fruits
Ans: B
Correct answer: Banana, Apple, Mango
Reason:
- Image augmentation is the process of creating new training examples from the existing ones.
- Imbalanced classification is the problem of classification when there is an unequal distribution of classes in the training dataset.
- In the given question, the number of images is low for Banana, Apple, and Mango compared to Watermelon and Peaches. Hence, we have to apply augmentation to them, as the balancing sketch below illustrates.
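A minimal sketch of the balancing logic (plain Python; the augmentation step itself would use techniques from Q2):

# Number of images per class from the question.
class_counts = {"Banana": 20, "Apple": 30, "Mango": 200,
                "Watermelon": 400, "Peaches": 400}
target = max(class_counts.values())  # match the largest class: 400

for fruit, count in class_counts.items():
    extra = target - count
    if extra > 0:
        # Banana needs 380, Apple 370, Mango 200 augmented images;
        # Watermelon and Peaches need none.
        print(f"{fruit}: generate {extra} augmented images ({count} -> {target})")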
Q5. CNN Select Again
Which among the following is false?
Choose the correct answer from below:
A. Dilated convolution increases the receptive field size when compared to the standard convolution operator
B. Dropout is a regularization technique
C. Batch normalization ensures that the weight of each of the hidden layers of a deep network is normalized
D. Convolutional neural networks are translation invariant
Ans: C
Correct answer: Batch normalization ensures that the weight of each of the hidden layers of a deep network is normalized
Reason:
- Compared to the standard convolution operator, dilated convolution can capture a larger context by expanding the receptive field of the convolution kernel without increasing the parameter count of the model.
- Dropout regularization is a technique to prevent neural networks from overfitting.
- Batch normalization is a technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch; it normalizes activations, NOT the weights of each layer.
- CNNs are translation invariant: the position of an object in the image does not need to be fixed for the CNN to detect it.
- A small model using both dropout and batch normalization is sketched below.
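A minimal sketch (assuming TensorFlow/Keras; the architecture is an arbitrary illustration) showing where dropout and batch normalization typically sit in a small CNN:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.BatchNormalization(),  # standardizes the layer's inputs per mini-batch
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),          # regularization: randomly drops units during training
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.summary()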
Q6. Reducing Parameters two methods
Which of the following are methods for tackling overfitting?
Choose the correct answer from below; please note that this question may have multiple correct answers:
A. Improving network configuration to increase parameters
B. Augmenting the dataset to decrease the number of samples
C. Augmenting the dataset to increase the number of samples
D. Improving network configuration to optimise parameters
Ans: C, D
Correct Answers:
- Augmenting the dataset to increase the number of samples
- Improving network configuration to optimise parameters
Explanation:
- Over-parameterization of the network makes it prone to overfitting. One way to tackle this would be to remove some layers from the network.
- Augmenting the dataset leads to a greater diversity of data samples being seen by the network, hence decreasing the likelihood of overfitting the model on the training dataset.
Q7. Underfitting vs Overfitting
The chart below shows the training accuracy vs. validation accuracy for a CNN model on a 5-class classification task.
What is the problem with the model, if any, and how can it be solved?
Choose the correct answer from below:
A. Overfitting, adding more Conv2D layers
B. Underfitting, more epochs
C. Overfitting, regularization
D. No problem
Ans: B
Correct Answer: Underfitting, more epochs
Explanation:
- Looking at the plot, the model was still improving when we stopped training, which points to underfitting.
- An underfit model has not fully learned the patterns in the dataset; in such cases, we see a low score on both the training set and the test/validation set.
- If we add more epochs, the model will keep learning and could resolve the underfitting problem.
Q8. Data augmentation effectiveness
Suppose you wish to train a neural network to locate lions anywhere in images, and you use a training dataset that has images similar to the ones shown above. In this case, if we apply the data augmentation techniques, it will be ______ as there is _______ in the training data.
Choose the correct answer from below:
A. effective, position bias
B. ineffective, angle bias
C. ineffective, position bias
D. effective, size bias
Ans: A
The correct answer is: effective, position bias.
Reason:
- In this dataset, we can see position bias in the images, as the lions are positioned at the center of every image.
- Hence, every image is similar, and in this case, applying data augmentation techniques like width shift and height shift may improve the performance of the network; a sketch follows below.
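A minimal sketch (assuming Keras' ImageDataGenerator; the 20% shift range is an arbitrary choice) of the width and height shifts mentioned above:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    width_shift_range=0.2,   # random horizontal shift counters the position bias
    height_shift_range=0.2,  # random vertical shift
    fill_mode="nearest",     # fill in pixels exposed by the shift
)
# datagen.flow(X_train, y_train, batch_size=32) would then yield
# shifted copies of the lion images during training.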
Q9. EarlyStopping
Which of the following statements best describes early stopping?
Choose the correct answer from below:
A. Train the network until a local minimum in the error function is reached
B. Simulate the network on a validation dataset after every epoch of training; stop the training when the generalization error starts to increase
C. Add a momentum term to the weight update in the Generalized Delta Rule
D. A faster version of backpropagation
Ans: B
Correct Answer: Simulate the network on a validation dataset after every epoch of training; stop the training when the generalization error starts to increase.
Explanation:
- During training, the model is evaluated on a holdout validation dataset after each epoch.
- If the performance of the model on the validation dataset starts to degrade (for example, loss begins to increase or accuracy begins to decrease), the training process is stopped.
Q10. EarlyStopping code
Fill in the code for setting up early stopping in TensorFlow to monitor the validation accuracy (val_accuracy) and stop the training when there is no improvement after 2 epochs:
custom_early_stopping = EarlyStopping(
___________,
____________
)
Choose the correct answer from below:
A. monitoring='val_accuracy', min_delta=2
B. mode='val_accuracy', min_delta=2
C. monitor='val_accuracy', patience=2
D. monitoring='val_accuracy', patience=2
Ans: C
Correct Answer: monitor='val_accuracy', patience=2
Explanation:
- monitor='val_accuracy' tells the callback to track the validation accuracy after every epoch.
- patience=2 means the training is terminated as soon as 2 epochs pass with no improvement in the validation accuracy.
- min_delta refers to the minimum change in the validation accuracy that qualifies as an improvement, i.e. an absolute change of less than min_delta counts as no improvement.
- A full usage sketch follows below.
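A minimal usage sketch (the model and data names are placeholders; only the EarlyStopping configuration comes from the question):

from tensorflow.keras.callbacks import EarlyStopping

custom_early_stopping = EarlyStopping(
    monitor='val_accuracy',     # watch validation accuracy after every epoch
    patience=2,                 # stop after 2 epochs with no improvement
    restore_best_weights=True,  # optional: roll back to the best-performing epoch
)

# model.fit(X_train, y_train,
#           validation_data=(X_val, y_val),
#           epochs=50,
#           callbacks=[custom_early_stopping])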
Q11. tf dot image
How will you apply data augmentation to rotate an image 270° counter-clockwise using tf.image?
Choose the correct answer from below:
A. tf.image.rot(image)
B. tf.image.rot270(image)
C. tf.image.rot90(image, k=3)
D. tf.image.rot(image, k=3)
Ans: C
Correct Answer: tf.image.rot90(image, k=3)
Explanation:
- For rotating an image or a batch of images counter-clockwise by multiples of 90 degrees, use tf.image.rot90(image, k=3).
- k denotes the number of 90-degree rotations to apply, so k=3 gives 270 degrees.
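A minimal sketch (the random image is a placeholder): k=3 applies three 90-degree counter-clockwise turns, i.e. a 270-degree counter-clockwise rotation:

import tensorflow as tf

image = tf.random.uniform((64, 64, 3))  # placeholder image
rotated = tf.image.rot90(image, k=3)    # 3 * 90 = 270 degrees counter-clockwise
print(rotated.shape)                    # (64, 64, 3)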
Confusion Matrix
- A much better way to evaluate the performance of a classifier is to look at the confusion matrix.
- The general idea is to count the number of times instances of class A are classified as class B.
- For example, to know the number of times the classifier confused images of 5s with 3s, you would look in the fifth row and third column of the confusion matrix.
- To compute the confusion matrix, you first need to have a set of predictions so that they can be compared to the actual targets.
- You could make predictions on the test set, but remember that you want to use the test set only at the very end of your project, once you have a classifier that you are ready to launch.
- Instead, you can use the cross_val_predict() function:
from sklearn.model_selection import cross_val_predict
# sgd_clf is a fitted SGDClassifier; X_train and y_train_5 are the MNIST
# training images and the "is this digit a 5?" boolean labels.
y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)
- Just like the cross_val_score() function, cross_val_predict() performs K-fold cross-validation, but instead of returning the evaluation scores, it returns the predictions made on each test fold.
- This means that you get a clean prediction for each instance in the training set, "clean" meaning that the prediction is made by a model that never saw the data during training.
- Now you are ready to get the confusion matrix using the confusion_matrix() function.
- Just pass it the target classes (y_train_5) and the predicted classes (y_train_pred):
from sklearn.metrics import confusion_matrix
confusion_matrix(y_train_5, y_train_pred)
array([[53057,  1522],
       [ 1325,  4096]])
- Each row in a confusion matrix represents an actual class, while each column represents a predicted class.
- The first row of this matrix considers non-5 images (the negative class):
  - 53,057 of them were correctly classified as non-5s (they are called true negatives),
  - while the remaining 1,522 were wrongly classified as 5s (false positives).
- The second row considers the images of 5s (the positive class):
  - 1,325 were wrongly classified as non-5s (false negatives),
  - while the remaining 4,096 were correctly classified as 5s (true positives).
- A perfect classifier would have only true positives and true negatives, so its confusion matrix would have nonzero values only on its main diagonal (top left to bottom right):
y_train_perfect_predictions = y_train_5  # pretend we reached perfection
confusion_matrix(y_train_5, y_train_perfect_predictions)
array([[54579,     0],
       [    0,  5421]])
- The confusion matrix gives you a lot of information, but sometimes you may prefer a more concise metric.
- The accuracy of the positive predictions is called the precision of the classifier:
Precision = TP / (TP + FP)
  - TP is the number of true positives.
  - FP is the number of false positives.
- A trivial way to have perfect precision is to make one single positive prediction and ensure it is correct (precision = 1/1 = 100%).
- But this would not be very useful, since the classifier would ignore all but one positive instance.
- So precision is typically used along with another metric named recall, also called sensitivity or the true positive rate (TPR): this is the ratio of positive instances that are correctly detected by the classifier.
Recall = TP / (TP + FN)
  - FN is the number of false negatives.
- The confusion matrix is illustrated in Figure 2.
Figure 2. An illustrated confusion matrix shows examples of true negatives (top left), false positives (top right), false negatives (lower left), and true positives (lower right).
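Given the predictions from cross_val_predict() above, both metrics follow directly from scikit-learn; the approximate values in the comments are computed from the confusion matrix shown earlier:

from sklearn.metrics import precision_score, recall_score

precision_score(y_train_5, y_train_pred)  # 4096 / (4096 + 1522) ≈ 0.729
recall_score(y_train_5, y_train_pred)     # 4096 / (4096 + 1325) ≈ 0.756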
Performance Measures
- Evaluating a classifier is often significantly trickier than evaluating a regressor.
- There are many performance measures available; several of them are computed in the sketch after this list:
i. Confusion Matrix
ii. True Positive Rate
iii. True Negative Rate
iv. False Positive Rate
v. False Negative Rate
vi. Precision
vii. Recall
viii. Accuracy
ix. F1-Score
x. Specificity
xi. Receiver Operating Characteristic (ROC)
xii. Area Under Curve (AUC)
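A minimal sketch computing several of the measures above with scikit-learn, reusing the y_train_5 and y_train_pred arrays from the confusion-matrix example (ROC and AUC need decision scores rather than hard labels):

from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, roc_curve
from sklearn.model_selection import cross_val_predict

accuracy_score(y_train_5, y_train_pred)  # overall fraction of correct predictions
f1_score(y_train_5, y_train_pred)        # harmonic mean of precision and recall

# For ROC/AUC, get scores instead of class labels:
y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3,
                             method="decision_function")
fpr, tpr, thresholds = roc_curve(y_train_5, y_scores)  # ROC curve points
roc_auc_score(y_train_5, y_scores)                     # area under the ROC curve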
YouTube Link: https://youtu.be/jL39fMC_I28