TensorFlow and Keras - 3

 Q1. Functional model

Complete the code snippet in order to get the following model summary.

from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model

def create_model_functional():
  inp = Input(shape=(28, ))
  h1 = Dense(64, activation="relu", name="hidden_1")(inp)
  h2 = Dense(_a_ , activation="relu", name="hidden_2")(h1)
  out = Dense(4, activation="softmax", name="output")(_b_)
  model = Model(inputs=inp, outputs=out, name="simple_nn")

  return model

model_functional = create_model_functional()
model_functional.summary()




Choose the correct answer from below:

A.     a - 512, b - h2

B.     a - 64, b - h2

C.     a - 10, b - h1

D.     a - 512, b - inp

Ans: A

Correct Option: a - 512, b - h2

Explanation:

  • To get the model summary as shown in the question, the value of a should be 512 and the value of b should be h2. This will create a neural network model with 2 hidden layers, the first hidden layer with 64 neurons and the second hidden layer with 512 neurons.
  • Here's an explanation of the code:
    • The 'create_model_functional' function creates a functional neural network model using the Keras API from TensorFlow.
    • The model has an input layer with shape (28,), which means it expects input data with 28 features. The first hidden layer has 64 neurons and uses the ReLU activation function.
    • The second hidden layer has 'a' neurons and uses the ReLU activation function. In this case, we want 'a' to be 512, so that the second hidden layer has 512 neurons.
    • The output layer has 4 neurons and uses the softmax activation function, which is suitable for multiclass classification problems.
    • The 'b' placeholder is used to connect the output of the second hidden layer to the input of the output layer. In this case, we want to connect it to 'h2', which is the output of the second hidden layer. The completed snippet is shown below.
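For reference, a minimal completed version of the snippet with a = 512 and b = h2; its summary shows hidden_1 (64 units), hidden_2 (512 units), and output (4 units):

from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

def create_model_functional():
  inp = Input(shape=(28, ))
  h1 = Dense(64, activation="relu", name="hidden_1")(inp)
  h2 = Dense(512, activation="relu", name="hidden_2")(h1)   # a = 512
  out = Dense(4, activation="softmax", name="output")(h2)   # b = h2
  return Model(inputs=inp, outputs=out, name="simple_nn")

create_model_functional().summary()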

 

Q2. Customized loss function

For a certain sequential regression model predicting two outputs, we implemented a loss function that penalizes the prediction error for the second output (y2) more than the first one (y1), because y2 is more important and we want it to be really close to the target value.

import numpy as np
def custom_mse(y_true, y_pred):
  loss = np.square(y_pred - y_true)
  loss = loss * [0.5, 0.5]            #y
  loss = np.sum(loss, axis=0)          #x
  return loss
model.compile(loss=custom_mse, optimizer='adam')

Which of the following options is correct with respect to the above implementation of a custom-made loss function?

Note: The shape of y_pred is (batch_size, 2) in the implementation.

Choose the correct answer from below, please note that this question may have multiple correct answers

A.     Custom_mse function's output should have a shape (batch_size, 2)

B.     Custom_mse function's output should have a shape (batch_size, )

C.      The axis for the sum of loss in line x should be 1

D.     The multiplication of [0.5, 0.5] in line y won't be helpful for our requirement

Ans: B,C,D

 

Correct options :

  • Custom_mse function's output should have a shape (batch_size, )
  • The axis for the sum of loss in line x should be 1
  • The multiplication of [0.5, 0.5] in line y won't be helpful for our requirement

Explanation :

  • Custom_mse function's output should have a shape (batch_size, ): The first dimension of y_true and y_pred is always the batch size, and the loss function should return a vector of length batch_size (one loss value per observation).
  • The axis for the sum of loss in line x should be 1: We need the loss values for each observation's two outputs to be summed up, therefore axis=1 should be used; axis=0 would instead sum across the batch.
  • The multiplication of [0.5, 0.5] in line y won't be helpful for our requirement: Because we want to penalize the error for y2 more, equal weights change nothing; we need weights where y2's weight is larger, e.g. [0.3, 0.7]. A corrected sketch follows.
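A corrected sketch that actually meets the requirement, using TensorFlow ops (so it works on tensors during training) and illustrative weights [0.3, 0.7] that penalize y2 more; 'model' is the model from the question:

import tensorflow as tf

def weighted_mse(y_true, y_pred):
  # Per-output squared error, shape (batch_size, 2)
  loss = tf.square(y_pred - y_true)
  # Weight y2's error more heavily than y1's (weights are illustrative)
  loss = loss * [0.3, 0.7]
  # Sum over the two outputs -> shape (batch_size, )
  return tf.reduce_sum(loss, axis=1)

model.compile(loss=weighted_mse, optimizer='adam')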

 

TensorFlow and Keras - 2

 Q1. Sigmoid and softmax functions

Which of the following statements is true for a neural network having more than one output neuron?

Choose the correct answer from below:

A.     In a neural network where the output neurons have the sigmoid activation, the sum of all the outputs from the neurons is always 1.

B.     In a neural network where the output neurons have the sigmoid activation, the sum of all the outputs from the neurons is 1 if and only if we have just two output neurons.

C.     In a neural network where the output neurons have the softmax activation, the sum of all the outputs from the neurons is always 1.

D.     The softmax function is a special case of the sigmoid function

Ans: C

Correct option : In a neural network where the output neurons have the softmax activation, the sum of all the outputs from the neurons is always 1.

Explanation :

  • With sigmoid activation, when we have more than one output neuron, each output is computed independently, so the sum of the outputs can take any value.
  • The softmax classifier outputs a probability distribution over the classes, and the sum of the probabilities is always 1.
  • The sigmoid function is the special case of the softmax function where the number of classes is 2, not the other way around, so option D is false. A quick numeric check follows.
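The following sketch verifies both behaviours with arbitrary logits:

import numpy as np
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.5, -1.0]])

print(np.sum(tf.nn.softmax(logits).numpy()))   # 1.0  -> softmax always sums to 1
print(np.sum(tf.sigmoid(logits).numpy()))      # ~2.5 -> independent sigmoids, any sum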

Q2. Type of loss

We want to classify credit card transactions as fraudulent or normal, which loss type is appropriate for this use case?

 

Choose the correct answer from below, please note that this question may have multiple correct answers

A.     Categorical crossentropy

B.     Binary crossentropy

C.     Adam

D.     SGD

Ans: A, B

Correct Option:

  • Categorical crossentropy
  • Binary crossentropy

Explanation:
If the classification network ends in a single neuron, you need a sigmoid activation, and the loss is binary cross-entropy. If it ends in two neurons, you have to one-hot encode the target and use softmax with categorical cross-entropy (CCE). Adam and SGD are optimizers, not loss functions. Both set-ups are sketched below.
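A minimal sketch of both set-ups, assuming 30 input features purely for illustration:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# One output neuron + sigmoid -> binary crossentropy (labels are 0/1)
bce_model = Sequential([
    Dense(16, activation='relu', input_dim=30),
    Dense(1, activation='sigmoid')])
bce_model.compile(loss='binary_crossentropy', optimizer='adam')

# Two output neurons + softmax -> categorical crossentropy
# (labels must be one-hot encoded, e.g. [1, 0] / [0, 1])
cce_model = Sequential([
    Dense(16, activation='relu', input_dim=30),
    Dense(2, activation='softmax')])
cce_model.compile(loss='categorical_crossentropy', optimizer='adam')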

 

Q3. Callbacks in tensorflow

Which method gets called after each epoch in a TensorFlow callback?

Choose the correct answer from below:

A.     on_epoch_end

B.     on_epoch_finished

C.     on_end

D.     on_training_complete

Ans: A

Correct Option: on_epoch_end

Explanation:

  • The TensorFlow callback method on_epoch_end contains functionality that runs at the end of each epoch.
  • tf.keras.callbacks.Callback can be inherited by custom classes, in which methods like on_train_begin and on_epoch_begin can be overridden, as in the sketch below.
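A minimal sketch of such a custom callback; the keys available in logs depend on the metrics the model was compiled with:

import tensorflow as tf

class EpochLogger(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # Runs once at the end of every epoch; `logs` holds that
    # epoch's metric values.
    logs = logs or {}
    print(f"Epoch {epoch} done, val_loss = {logs.get('val_loss')}")

# Usage: model.fit(x_train, y_train, validation_data=(x_val, y_val),
#                  callbacks=[EpochLogger()])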

Q4. Avoid overfitting

Jack was asked to create a classifier for a two-class non-linearly separable dataset consisting of 100 observations. He did not know the complexity of the non-linearity of separation therefore he created a model with 500 nodes in the only hidden layer with ReLU activation and used sigmoid in the output layer.

Jack was aware that his model can overfit the data so he implemented a function that can stop the training as soon as the model starts overfitting.

from keras.callbacks import EarlyStopping
es = EarlyStopping(monitor = 'val_loss',
                   min_delta = 0,
                   patience = 3,
                   restore_best_weights = True)

Now Ryan also wanted to implement such a function and he made the observations given in the options. Which of Ryan's observation(s) are incorrect?

 

Choose the correct answer from below, please note that this question may have multiple correct answers

 

A.     The training process will be monitored according to the validation loss.

B.     The training process will stop as soon as the difference between validation loss of two consecutive epochs is greater than 0.

C.     The training process will be stopped if there are more than 3 epochs having val_loss smaller than the latest minimum val_loss value.

D.     The best model weights according to val_loss, will be saved after training.

 

Ans: B, C

Correct options :

  • The training process will stop as soon as the difference between validation loss of two consecutive epochs is greater than 0.
  • The training process will be stopped if there are more than 3 epochs having val_loss smaller than the latest minimum val_loss value.


Explanation :

  • Actually, min_delta is set to 0, implying that any decrease in val_loss greater than 0 counts as an improvement, so observation B is incorrect.
  • Actually, patience is set to 3, implying that training is stopped only after three consecutive epochs with no improvement according to min_delta (i.e. epochs in which val_loss did not decrease), so observation C is incorrect as well. An epoch counts as an improvement if the monitored value improves, i.e. here val_loss decreases by at least min_delta.
  • monitor: the 'monitor' argument takes the metric based on which the training is evaluated.
  • min_delta: the 'min_delta' argument is the minimum change in the monitored value between two consecutive epochs required to qualify as an improvement.
  • patience: the 'patience' argument is the maximum number of consecutive epochs with no improvement after which training is stopped.
  • restore_best_weights: set to True to restore, once training stops, the model weights from the epoch with the best monitored value. Usage is sketched below.
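Hooking Jack's callback into training would look roughly like this (model, X_train, and y_train are assumed to exist):

# Training stops once val_loss fails to improve for 3 consecutive
# epochs; the weights from the best epoch are then restored.
history = model.fit(X_train, y_train,
                    validation_split=0.2,
                    epochs=100,
                    callbacks=[es])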

Q5. Adding callbacks

We are trying to train a model on a training dataset for 20 epochs.

model.fit(x_train, y_train, epochs=20,callbacks = callback)

Add callbacks to the above model based on the conditions given below:

Cond1. If the validation accuracy at an epoch is less than the previous epoch's accuracy, we have to decrease the learning rate by 10%.

The options for Cond1 are:

a. reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9,
                              patience=1)
   callback=[reduce_lr]

b. reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9,
                              patience=0)
   callback=[reduce_lr]

Cond2. For every 3rd epoch, decay the learning rate by 5%.

The options for Cond2 are:

c.  def step_decay(epoch):
      initial_lrate = 0.1
      drop = 0.95
      epochs_drop = 3
      lrate = initial_lrate * math.pow(drop,math.floor((epoch)/epochs_drop))
      return lrate

   lrate = LearningRateScheduler(step_decay)
   callback = [lrate]

d. initial_learning_rate = 0.1

   lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
   initial_learning_rate,
   decay_steps = 3,
   decay_rate = 0.95,
   staircase = True)

   model.compile(optimizer = tf.keras.optimizers.SGD(learning_rate = lr_schedule),
              loss ='sparse_categorical_crossentropy',
              metrics = ['accuracy'])

Which of the above options will be correct for requirements in Cond1 and Cond2?

Choose the correct answer from below:

A.     a, c and d

B.     b, c and d

C.     a, b and c

D.     a, b and d

Ans: B

Correct option : b, c and d

Explanation :

  • If we set patience = 1, the model will wait one more epoch of non-improving accuracy before decreasing the learning rate. Setting patience = 0 decreases the learning rate as soon as the accuracy drops, which is exactly what Cond1 requires, so b is correct and a is not.
  • If patience = 1 had been applied to the metric `loss`, the model would have waited one more epoch for a loss lower than the minimum encountered so far before decreasing the learning rate.
  • Both c and d can be used for updating the learning rate, via a callback (LearningRateScheduler) and an optimizer schedule (ExponentialDecay) respectively. A quick numeric check of c follows.
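To sanity-check option c, we can print the learning rate the scheduler returns for the first few epochs; it drops by 5% after every 3rd epoch:

import math

def step_decay(epoch):
  initial_lrate = 0.1
  drop = 0.95
  epochs_drop = 3
  return initial_lrate * math.pow(drop, math.floor(epoch / epochs_drop))

for epoch in range(9):
  print(epoch, round(step_decay(epoch), 5))
# epochs 0-2 -> 0.1, epochs 3-5 -> 0.095, epochs 6-8 -> 0.09025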

 

TensorFlow and Keras - 1

 Q1. Binary classification

In order to perform binary classification on a dataset (classes 0 and 1) using a neural network, which of the options is correct regarding the outcomes of code snippets a and b? Here the labels of the observations are of the form [0, 0, 1, ...].

Common model:

import tensorflow
from keras.models import Sequential
from keras.layers import Dense
from tensorflow.keras.optimizers import SGD
model = Sequential()
model.add(Dense(50, input_dim=2, activation='relu', kernel_initializer='he_uniform'))
opt = SGD(learning_rate=0.01)

Code snippet a:

model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])

Code snippet b:

model.add(Dense(1, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

The term "Required results" in the options means that the accuracy of the model should be above 60%.

Note: 40% of the dataset is from class 0.

Choose the correct answer from below:

A.     Both a and b will give required results.

B.     Only b will give the required results.

C.     Only a will give the required results.

D.     Both a and b will fail to give required results.

Ans: C

Correct option: only a will give the required results.

Explanation :

  • The task requires that the output layer is configured with a single node and a 'sigmoid' activation function in order to predict the probability for the required class. To apply the softmax function to binary classification, the output layer should instead have 2 neurons, predicting the probability of each of the two classes individually.
  • In order to get the required results using the softmax function, we would need 2 neurons in the output layer and labels in one-hot encoded format, as sketched below.
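For completeness, a sketch of what snippet b would need to look like for the softmax route to work, reusing the common model and optimizer from the question:

from tensorflow.keras.utils import to_categorical

# Two output neurons: one probability per class, summing to 1
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=opt,
              metrics=['accuracy'])

# Labels like [0, 0, 1, ...] must be one-hot encoded first:
# y_onehot = to_categorical(y)   # [[1, 0], [1, 0], [0, 1], ...]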

 

Q2. Sequential classification model

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras.optimizers import SGD

model = Sequential()
model.add(Dense(64, activation = 'y', input_dim=50))
model.add(Dense(64, activation = 'y'))
model.add(Dense(x, activation = 'z'))

model.compile(loss='categorical_crossentropy',
              optimizer=SGD(learning_rate=0.01),
              metrics=['accuracy'])

model.fit(X_train, y_train,
 epochs=20)

Ram wants to create a model for the classification of types of malware into 10 different categories. He asked Shyam for help, and Shyam gave him the incomplete code shown in the snippet above. Help Ram complete the code for classification, given that the data used has 50 input features. Choose the best-suited option for filling out x, y, and z.

Choose the correct answer from below:

A.     x = len(np.unique(y_train)), y = softmax, z = softmax

B.     x = 2 * len(np.unique(y_train)), y = relu, z = relu

C.     x = len(np.unique(y_train)), y = relu, z = softmax

D.     x = 0.5 * len(np.unique(y_train)), y = relu, z = relu

Ans: C

Correct option :

  • x = len(np.unique(y_train))
  • y = relu
  • z = softmax

Explanation :

  • z : For multiclass classification, softmax activation is used in the output layer.
  • x : With softmax activation, the output layer has the same number of neurons as the number of distinct classes, i.e. len(np.unique(y_train)).
  • y : ReLU is a standard choice for the intermediate layers. It is not used in the output layer of a classifier, because its unbounded range makes it difficult to determine thresholds, though it can be used in the output layer of regression tasks where negative values don't make sense, such as predicting prices. The completed snippet is sketched below.
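The completed snippet with these choices filled in (y_train is assumed to hold the 10 malware class labels):

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=50))   # y = relu
model.add(Dense(64, activation='relu'))                 # y = relu
model.add(Dense(len(np.unique(y_train)),                # x = number of classes
                activation='softmax'))                  # z = softmax

model.compile(loss='categorical_crossentropy',
              optimizer=SGD(learning_rate=0.01),
              metrics=['accuracy'])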

Q3. Multi target output

For a multi-output regression model:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense

def get_model(n_inputs):
  model = keras.Sequential()
  model.add(Dense(20, input_dim = n_inputs, kernel_initializer='he_uniform', activation='relu'))
  model.add(______)
  model.compile(loss = 'mae', optimizer = 'adam')
  return model

We want to build a neural network for a multi-output regression problem. For each observation, we have 2 outputs. Complete the code snippet to get the desired output.

Choose the correct answer from below:

A.     Dense(2)

B.     Dense(3)

C.     activation('sigmoid')

D.     activation('relu')

Ans:  A

Correct option: Dense(2).

Explanation:
As we have 2 outputs, the output layer of the model should have 2 neurons, one per target. The completed function is sketched below.
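The completed function, with the blank filled by Dense(2); leaving the output layer without an activation (i.e. linear) is the usual choice for regression:

from tensorflow import keras
from tensorflow.keras.layers import Dense

def get_model(n_inputs):
  model = keras.Sequential()
  model.add(Dense(20, input_dim=n_inputs,
                  kernel_initializer='he_uniform', activation='relu'))
  model.add(Dense(2))   # one linear neuron per regression target
  model.compile(loss='mae', optimizer='adam')
  return model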

 

Q4. Number of parameters

Consider the following neural network model :

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

The number of parameters in this model is:

Choose the correct answer from below:

A.     120

B.     96

C.     108

D.     121

Ans: D

Correct option : 121

Explanation :

Number of nodes in the input layer(i) = 8
Number of nodes in the hidden layer(h) = 12
Number of nodes in the output layer(o) = 1
So,
Number of parameters = weights + biases = (8×12 + 12×1) + (12 + 1) = 108 + 13 = 121
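This can be verified directly by building the model and counting its parameters:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))   # 8*12 + 12 = 108
model.add(Dense(1, activation='sigmoid'))              # 12*1 + 1  = 13

print(model.count_params())   # 121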

 

Q5. Model summary

Complete the following code snippet in order to get a model with the attached model summary.

import tensorflow as tf
model = tf.keras.models.Sequential()

# Create model
model.add(tf.keras.layers.Input(shape=(_a_, )))
model.add(tf.keras.layers._b_( 512 , activation='relu'))
model.add(tf.keras.layers.Dense( _c_, activation='softmax'))

model.summary()




Choose the correct answer from below:

A.     a - 32, b - Dense, c - 10

B.     a - 12, b - Dense, c - 10

C.     a - 10, b - Dense, c - 5

D.     a - Dense(33), b - Dense, c - 50

Ans: A

Correct Option: a - 32, b - Dense, c - 10

Explanation:

  • The key for getting a: in the first layer, the number of parameters equals (number of input features × neurons in the first layer) + neurons in the first layer, i.e. a × 512 + 512 = 16896, which gives a = 32.
  • The summary reports the first layer as dense, which gives b = Dense. Similarly, for the second layer, the number of neurons c can be read off the output shape of dense_1, giving c = 10. The arithmetic is spelled out below.
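The arithmetic spelled out as a quick check (16896 is the parameter count of the first Dense layer in the summary):

# First Dense layer:  a * 512 + 512 = 16896  ->  a = 32
print((16896 - 512) // 512)   # 32
# Output Dense layer with c = 10 neurons: 512 * 10 + 10 parameters
print(512 * 10 + 10)          # 5130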

 

Q6. Logistic regression model

Which of these neural networks would be most appropriately representing a logistic regression model structure for binary classification?

a.

model = Sequential()
model.add(Dense(units=32, input_shape=(2,), activation='relu'))
model.add(Dense(units=64, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

b.

model = Sequential()
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

c.

model = Sequential()
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

d.

model = Sequential()
model.add(Dense(units=16))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=64,activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

 

Choose the correct answer from below:

A.     a

B.     b

C.     c

D.     d

Ans: B

Correct Option: b

Explanation:

  • Option b would most appropriately represent a logistic regression model structure for binary classification: it is a single Dense layer with one neuron and a sigmoid activation. The sigmoid maps the output to a probability between 0 and 1, which is exactly the form of logistic regression.

  • Option a has a hidden layer and a 64-unit sigmoid output layer, so it is a multi-layer neural network rather than logistic regression, and its output is not a single class probability.

  • Option c stacks two sigmoid layers, which makes it a deeper network for more complex problems, not a plain logistic regression.

  • Option d has three layers ending in a 64-unit output, so it is likewise a multi-layer network and not a logistic regression structure.

 

Q7. Model hyperparameters

Complete the following model to get the training output attached to the image.

model.compile(optimizer='sgd',
  loss='sparse_categorical_crossentropy',
  metrics=['_a_'])

# train model
model.fit(x=X_train,
          y=y_train,
          epochs = _b_ ,
          validation_data=(X_test, y_test))




Choose the correct answer from below:

A.     a - loss, b - 5

B.     a - accuracy, b - 100

C.     a - loss, b - 25

D.     a - val_acc, b - 100

Ans: B

Correct option:
a - accuracy, b - 100

Explanation:
As the image shows accuracy, the metric has to be 'accuracy'.
The image also shows the number of epochs as 100.

 

Q8. Model prediction

We want to use our trained binary classification model 'model' (trained with binary cross-entropy loss and a sigmoid activation function) in order to get the label for the first observation in our test dataset of shape (m x n).

Mark the correct option which has the code to meet our requirements.

Note: m represents the number of observations and n represents the number of independent variables.

Choose the correct answer from below:

A.     model.predict(test_data[0])

B.     1 if model.predict(test_data[0].reshape(1,-1)) < 0.5 else 0

C.     model.predict(test_data[0].reshape(1,-1))

D.     1 if model.predict(test_data[0].reshape(1,-1)) > 0.5 else 0

Ans: D

Correct Answer: 1 if model.predict(test_data[0].reshape(1,-1)) > 0.5 else 0

Explanation:

  • As the model is trained with a sigmoid activation function, it outputs a probability between 0 and 1, so we need the ternary expression to threshold it at 0.5 and obtain a class label.
  • We also need to reshape test_data[0], otherwise the API will throw an error asking us to reshape the data, since it contains a single sample. See the sketch below.
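A short sketch of why the reshape matters, assuming test_data is a NumPy array of shape (m, n):

# test_data[0] has shape (n,): a single flat sample, which
# model.predict() rejects; reshape(1, -1) makes it a batch of one.
sample = test_data[0].reshape(1, -1)   # shape (1, n)

prob = model.predict(sample)[0][0]     # sigmoid output in (0, 1)
label = 1 if prob > 0.5 else 0         # threshold at 0.5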

 

 
