Machine Learning - Deep Learning: Logistic Regression


Machine Learning Programs

👉Data Preprocessing in Machine Learning

👉Data Preprocessing in Machine learning (Handling Missing values )

👉Linear Regression - ML Program - Weight Prediction

👉Naïve Bayes Classifier - ML Program

👉LOGISTIC REGRESSION - PROGRAM

👉KNN Machine Learning Program

👉Support Vector Machine (SVM) - ML Program

👉Decision Tree Classifier on Iris Dataset

👉Classification of Iris flowers using Random Forest

👉DBSCAN

👉 Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .CSV file

👉For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.

👉Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.

👉Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

👉Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using a standard Heart Disease Data Set.

👉Write a program to implement k-Nearest Neighbors algorithm to classify the iris data set. Print both correct and wrong predictions.

👉Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select appropriate data set for your experiment and draw graphs.

👉Write a program to implement SVM algorithm to classify the iris data set. Print both correct and wrong predictions.

👉Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering.

👉 Write a program using scikit-learn to implement K-means Clustering

👉Program to calculate the entropy and the information gain

👉Program to implement perceptron.

In order to perform binary classification on a dataset (class 0 and 1) using a neural network, which of the options is correct regarding the outcomes of code snippets a and b? Here the labels of observation are in the form : [0, 0, 1...].

Common model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

model = Sequential()
model.add(Dense(50, input_dim=2, activation='relu', kernel_initializer='he_uniform'))
opt = SGD(learning_rate=0.01)

Code snippet a:

model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])

Code snippet b:

model.add(Dense(1, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

The term "Required results" in the options means that the accuracy of the model should be above 60%.

Note: 40% of the dataset is from class 0.

Choose the correct answer from below:

A. Both a and b will give required results.

B. Only b will give the required results.

C. Only a will give the required results.

D. Both a and b will fail to give required results.

Ans: C

Correct option: only a will give the required results.

Explanation :

The task requires an output layer with a single node and a 'sigmoid' activation function, which predicts the probability of class 1. Snippet b applies softmax to a single neuron. Softmax normalizes across the output neurons, so a lone neuron always outputs 1.0 and the model predicts class 1 for every observation. Since 40% of the dataset is from class 0, that can only reach about 60% accuracy, which is not above 60%.

To get the required results with the softmax function, the output layer needs 2 neurons (one per class) and the labels must be in one-hot encoded format.
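As a rough illustration of the point about softmax on a single output neuron, the sketch below compares the two activations on a few made-up logit values. The helper names and the logits are assumptions for illustration only; no Keras model is involved.

```python
import math

def sigmoid(z):
    # Maps any real-valued logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Normalizes a list of logits so the outputs sum to 1.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits produced by a single output neuron.
logits = [-3.0, 0.0, 2.5]

sigmoid_outputs = [sigmoid(z) for z in logits]
# Softmax over ONE neuron has nothing to normalize against,
# so its output is 1.0 regardless of the logit.
softmax_outputs = [softmax([z])[0] for z in logits]

print(sigmoid_outputs)
print(softmax_outputs)
```

The sigmoid outputs vary with the logit, while every single-neuron softmax output is 1.0, so a model built like snippet b cannot separate the two classes.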

Q2. Sequential classification model

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
from tensorflow.keras.optimizers import SGD

model = Sequential()
model.add(Dense(64, activation = 'y', input_dim=50))
model.add(Dense(64, activation = 'y'))
model.add(Dense(x, activation = 'z'))

model.compile(loss ='categorical_crossentropy',
              optimizer = SGD(learning_rate = 0.01),
              metrics = ['accuracy'])

model.fit(X_train, y_train,
          epochs=20)

Ram wants to create a model that classifies malware into 10 different categories. He asked Shyam for help, and Shyam gave him the incomplete code shown in the snippet above. Help Ram complete the code, given that the data has 50 input features. Choose the best-suited option for filling in x, y, and z.

Choose the correct answer from below:

A. x = len(np.unique(y_train)), y = softmax, z = softmax

B. x = 2 * len(np.unique(y_train)), y = relu, z = relu

C. x = len(np.unique(y_train)), y = relu, z = softmax

D. x = 0.5 * len(np.unique(y_train)), y = relu, z = relu

Ans: C

Correct option :

x = len(np.unique(y_train))
y = relu
z = softmax

Explanation :

z : For multiclass classification, softmax activation is used in the output layer.
x : With softmax activation, the output layer has as many neurons as there are classes.

y : ReLU is a standard choice for the intermediate (hidden) layers. ReLU is not used in the output layer of a classifier because its range is unbounded, which makes it difficult to interpret the outputs as probabilities or to set thresholds. ReLU can, however, be used in the output layer of regression tasks where negative values don't make sense, such as predicting prices.
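The choice of x can be sketched with NumPy alone. The label array below is made up for illustration, and the one-hot step is included only because 'categorical_crossentropy' expects one-hot targets; it is an assumption about how y_train would be prepared, not something the question shows.

```python
import numpy as np

# Hypothetical integer labels for 10 malware categories (0..9).
y_train = np.array([0, 3, 9, 1, 2, 4, 5, 6, 7, 8, 3, 9])

# x in the question: one output neuron per distinct class.
num_classes = len(np.unique(y_train))

# One-hot encode the labels: row i has a 1 in column y_train[i].
# This indexing assumes the labels are exactly 0..num_classes-1.
one_hot = np.eye(num_classes)[y_train]

print(num_classes)
print(one_hot.shape)
```

Each row of one_hot has the same length as the softmax output, which is why the output layer width must match the number of classes.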

Q3. Multi target output

For a multi-output regression model:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense

def get_model(n_inputs):
    model = keras.Sequential()
    model.add(Dense(20, input_dim = n_inputs, kernel_initializer='he_uniform', activation='relu'))
    model.add(______)
    model.compile(loss = 'mae', optimizer = 'adam')
    return model

We want to build a neural network for a multi-output regression problem. For each observation, we have 2 outputs. Complete the code snippet to get the desired output.

Choose the correct answer from below:

A. Dense(2)

B. Dense(3)

C. activation('sigmoid')

D. activation('relu')

Ans: A

Correct option: Dense(2).

Explanation:
Since each observation has 2 outputs, the output layer of the model should have 2 neurons. Dense(2) has no activation by default, so the layer is linear, which suits regression.
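To see why two output neurons give two predictions per observation, here is a minimal NumPy forward pass that mimics the shapes of the model above. The weights are random and untrained, and the batch size and feature count are arbitrary assumptions; only the shapes matter.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden, n_outputs = 10, 20, 2   # n_inputs is an arbitrary choice
X = rng.normal(size=(5, n_inputs))          # 5 hypothetical observations

# Hidden layer: 20 units with ReLU, like Dense(20, activation='relu').
W1 = rng.normal(size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
hidden = np.maximum(0.0, X @ W1 + b1)

# Output layer: 2 linear units, like Dense(2) with no activation.
W2 = rng.normal(size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)
y_pred = hidden @ W2 + b2

print(y_pred.shape)
```

The prediction array has one row per observation and one column per regression target.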

Q4. Number of parameters

Consider the following neural network model :

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

The number of parameters in this model is:

Choose the correct answer from below:

A. 120

B. 96

C. 108

D. 121

Ans: D

Correct option : 121

Explanation :

Number of nodes in the input layer(i) = 8
Number of nodes in the hidden layer(h) = 12
Number of nodes in the output layer(o) = 1
So,
Number of parameters = weights + biases = (8×12 + 12×1) + (12 + 1) = 121
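The same count can be reproduced with a small helper. The function name is hypothetical; it simply applies "inputs × units + units" to each pair of consecutive layer sizes in a fully connected network.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's unit count.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # weight matrix
        total += n_out         # one bias per unit
    return total

# 8 inputs -> Dense(12) -> Dense(1)
total_params = dense_param_count([8, 12, 1])
print(total_params)  # 121
```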

Q5. Model summary

Complete the following code snippet in order to get a model with the attached model summary.

import tensorflow as tf
model = tf.keras.models.Sequential()

# Create model
model.add(tf.keras.layers.Input(shape=(_a_, )))
model.add(tf.keras.layers._b_( 512 , activation='relu'))
model.add(tf.keras.layers.Dense( _c_, activation='softmax'))

model.summary()

Choose the correct answer from below:

A. a - 32, b - Dense, c - 10

B. a - 12, b - Dense, c - 10

C. a - 10, b - Dense, c - 5

D. a - Dense(33), b - Dense, c - 50

Ans: A

Correct option: a - 32, b - Dense, c - 10

Explanation:

The key to finding a is that the number of parameters in the first Dense layer equals (number of input features × neurons in the layer) + neurons in the layer, i.e. 32 × 512 + 512 = 16896.
The summary names the first layer dense, so b is Dense. Similarly, for the output layer (i.e. c), the number of neurons can be read from the output shape of dense_1 in the summary.
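The relation used for a can also be inverted to recover the input width from a parameter count. This is a small sketch with a hypothetical helper name. The 16896 figure comes from the explanation above, while the output-layer count is derived here from the chosen answer (c = 10) rather than read from the summary image.

```python
def input_features_from_params(param_count, units):
    # params = n_features * units + units  =>  n_features = (params - units) / units
    n_features, remainder = divmod(param_count - units, units)
    if remainder != 0:
        raise ValueError("param_count is not consistent with a Dense layer of this size")
    return n_features

# First Dense layer: 512 units and 16896 parameters.
n_features = input_features_from_params(16896, 512)

# Output layer under the chosen answer: 512 inputs feeding 10 units.
output_params = 512 * 10 + 10

print(n_features)
print(output_params)
```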

Q6. Logistic regression model

Which of these neural networks would be most appropriately representing a logistic regression model structure for binary classification?

Code snippet a:

model = Sequential()
model.add(Dense(units=32, input_shape=(2,), activation='relu'))
model.add(Dense(units=64, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Code snippet b:

model = Sequential()
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Code snippet c:

model = Sequential()
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))
model.add(Dense(units=1, input_shape=(2,), activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Code snippet d:

model = Sequential()
model.add(Dense(units=16))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=64, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Choose the correct answer from below:

A. a

B. b

C. c

D. d

Ans: B

Correct Option: b

Explanation:

Option b is the most appropriate representation of a logistic regression model for binary classification. It has a single Dense layer with one neuron and a sigmoid activation function applied directly to the inputs, i.e. sigmoid(w·x + b). The sigmoid function maps the output to a probability between 0 and 1, which is exactly what logistic regression predicts.
Option a adds a hidden layer of 32 ReLU units and ends in 64 sigmoid outputs, so it is neither a linear model of the inputs nor a single-probability output.
Option c stacks two sigmoid layers, which makes it a small neural network with a hidden unit rather than plain logistic regression.
Option d has three layers and also ends in 64 outputs, so it does not match the logistic regression structure either.
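The structure of option b, one neuron with a sigmoid over a weighted sum of the inputs, can be written out directly. The weights, bias, and input below are made-up values for illustration, not parameters of any trained model.

```python
import math

def logistic_regression_predict(x, weights, bias):
    # Linear combination of the inputs followed by a sigmoid,
    # which is what a single Dense(1, activation='sigmoid') unit computes.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters for a model with 2 input features.
weights = [1.0, -1.0]
bias = 0.0

p_boundary = logistic_regression_predict([2.0, 2.0], weights, bias)  # z = 0
p_positive = logistic_regression_predict([3.0, 1.0], weights, bias)  # z = 2

print(p_boundary)
print(p_positive)
```

A point with z = 0 sits on the decision boundary and gets probability 0.5, while larger z pushes the probability toward 1.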

Q7. Model hyperparameters

Complete the following model to get the training output shown in the attached image.

model.compile(optimizer='sgd',
              loss='sparse_categorical_crossentropy',
              metrics=['_a_'])

# train model
model.fit(x=X_train,
          y=y_train,
          epochs = _b_ ,
          validation_data=(X_test, y_test))

Choose the correct answer from below:

A. a - loss, b - 5

B. a - accuracy, b - 100

C. a - loss, b - 25

D. a - val_acc, b - 100

Ans: B

Correct option:
a - accuracy, b - 100

Explanation:
The image shows accuracy being reported, so the metric has to be accuracy.
The image also shows the number of epochs as 100.

Q8. Model prediction

We want to use our trained binary classification (trained with binary cross entropy and sigmoid activation function) model 'model', in order to get the label for the first observation in our test dataset of shape (m x n).

Mark the correct option which has the code to meet our requirements.

Note: m represents the number of observations and n represents the number of independent variables.

Choose the correct answer from below:

A. model.predict(test_data[0])

B. 1 if model.predict(test_data[0].reshape(1,-1)) < 0.5 else 0

C. model.predict(test_data[0].reshape(1,-1))

D. 1 if model.predict(test_data[0].reshape(1,-1)) > 0.5 else 0

Ans: D

Correct Answer: 1 if model.predict(test_data[0].reshape(1,-1)) > 0.5 else 0

Explanation:

Because the model is trained with a sigmoid activation function, it outputs a probability between 0 and 1. To turn that into a label, we threshold it at 0.5 with a conditional expression: 1 if the probability is above 0.5, else 0.
We also need to reshape test_data[0] from shape (n,) to (1, n). Otherwise the API will throw an error asking us to reshape the data when it contains a single sample.
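The reshape and threshold steps can be tried without a trained model. In the sketch below, fake_predict is a hypothetical stand-in for model.predict that returns an array of shape (1, 1), and test_data is made-up.

```python
import numpy as np

# Hypothetical test set with m = 4 observations and n = 3 features.
test_data = np.arange(12, dtype=float).reshape(4, 3)

first = test_data[0]           # shape (3,): a single sample, no batch axis
batch = first.reshape(1, -1)   # shape (1, 3): a batch containing one sample

def fake_predict(batch):
    # Stand-in for model.predict: a sigmoid over an arbitrary linear score.
    z = batch.sum(axis=1, keepdims=True) - 1.0
    return 1.0 / (1.0 + np.exp(-z))

prob = fake_predict(batch)             # shape (1, 1)
label = 1 if prob[0, 0] > 0.5 else 0   # threshold the probability at 0.5

print(first.shape, batch.shape, label)
```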

Machine Learning MCQs-3 (Logistic Regression, KNN, SVM, Decision Tree)


---------------------------------------------------------------------

1. A Support Vector Machine can be used for

Performing linear or nonlinear classification
Performing regression
For outlier detection
All of the above

Ans: 4

2. The decision boundary in a Support Vector Machine is fully determined (or “supported”) by the instances located on the edge of the street.

True
False

Ans: 1

3. Support Vector Machines are not sensitive to feature scaling

True
False

Ans: 2

4. If we strictly impose that all instances be off the street and on the right side, this is called

Soft margin classification
Hard margin classification
Strict margin classification
Loose margin classification

Ans: 2

5. The main issues with hard margin classification are

It only works if the data is linearly separable
It is quite sensitive to outliers
It is impossible to find a margin if the data is not linearly separable
All of the above

Ans: 4

6. The objectives of Soft Margin Classification are to find a good balance between

Keeping the street as large as possible
Limiting the margin violations
Both of the above

Ans: 3

7. The balance between keeping the street as large as possible and limiting margin violations is controlled by this hyperparameter

tol
loss
penalty
C

Ans: 4

8. A smaller C value leads to a wider street but more margin violations.

True
False

Ans: 1

9. If your SVM model is overfitting, you can try regularizing it by reducing the value of

tol
C hyperparameter
intercept_scaling
None of the above

Ans: 2

10. A similarity function like Gaussian Radial Basis Function is used to

Measure how many features are related to each other
Find the most important features
Find the relationship between different features
Measure how much each instance resembles a particular landmark

Ans: 4
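As a minimal sketch of the similarity measure in question 10, the Gaussian RBF between an instance x and a landmark is exp(-gamma * ||x - landmark||^2). The points and the gamma value below are arbitrary choices for illustration.

```python
import math

def gaussian_rbf(x, landmark, gamma):
    # Squared Euclidean distance between the instance and the landmark.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, landmark))
    # Similarity is 1 at the landmark and decays toward 0 with distance.
    return math.exp(-gamma * squared_distance)

at_landmark = gaussian_rbf([1.0, 2.0], [1.0, 2.0], gamma=0.3)
far_away = gaussian_rbf([1.0, 2.0], [4.0, 6.0], gamma=0.3)

print(at_landmark)
print(far_away)
```

An instance sitting exactly on the landmark gets similarity 1.0, and the value shrinks as the instance moves away. A larger gamma makes it shrink faster.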

11. When using SVMs we can apply an almost miraculous mathematical technique for adding polynomial features and similarity features called the

Kernel trick
Shell trick
Mapping and Reducing
None of the Above

Ans: 1

12. Which is right for the gamma parameter of SVC which acts as a regularization hyperparameter

If model is overfitting, increase it, if it is underfitting, reduce it
If model is overfitting, reduce it, if it is underfitting, increase it
If model is overfitting, keep it same
If it is underfitting, keep it same

Ans: 2

13. LinearSVC is much faster than SVC(kernel="linear")

True
False

Ans: 1

14. In SVM regression the model tries to

Fit the largest possible street between two classes while limiting margin violations
Fit as many instances as possible on the street while limiting margin violations

Ans: 2

15. Decision Trees can be used for

Classification Tasks
Regression Tasks
Multi-output tasks
All of the above

Ans: 4

16. The iris dataset has

5 features and 3 classes
4 features and 3 classes
2 features and 3 classes
4 features and 2 classes

Ans: 2

17. A node’s gini attribute measures

The number of training instances in the node
The ratio of training instances in the node
Its impurity

Ans: 3

18. If all the training instances of a node belong to the same class then the value of the node's Gini attribute will be

1
0
Any integer between 0 and 1
A negative value

Ans: 2

19. A Gini coefficient of 1 expresses maximal inequality among the training samples

True
False

Ans: 1

20. The Gini index for a node is found by subtracting the sum of the squared ratios of each class in the node from 1

True
False

Ans: 1
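The rule in question 20 translates directly into a few lines of code. The function name and the class counts are illustrative assumptions.

```python
def gini_impurity(class_counts):
    # class_counts holds the number of training instances of each class in a node.
    total = sum(class_counts)
    if total == 0:
        return 0.0
    # 1 minus the sum of squared class ratios.
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

pure_node = gini_impurity([50, 0, 0])   # all instances belong to one class
mixed_node = gini_impurity([5, 5])      # two classes in equal proportion

print(pure_node)   # 0.0
print(mixed_node)  # 0.5
```

A pure node has impurity 0, which matches the answer to question 18.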

21. A decision tree estimates the probability that an instance belongs to a particular class k by finding the corresponding leaf node for the instance and then returning the ratio of training instances of class k

True
False

Ans: 1

22. The Decision Tree classifier predicts the class which has the highest probability

True
False

Ans: 1

23. The CART algorithm splits the training set into two subsets

Using all the features and a threshold tk
Using a single feature k and a threshold tk
Using half of the features and a threshold k

Ans: 2

24. How does the CART algorithm choose the feature k and the threshold tk for splitting?

It randomly chooses a feature k
It chooses the mean of the values of the feature k as threshold
It chooses the feature k and threshold tk which produces the purest subsets
It chooses the feature k and threshold tk such that the gini index value of the subsets is 0

Ans: 3

25. The cost function for finding the value of feature k and threshold tk takes into consideration

The Gini index values of the subsets
The number of instances in the subsets
The total number of instances in the node that is being split
All of these

Ans: 4
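A sketch of the split cost that questions 24 and 25 describe, assuming the usual CART classification cost: the Gini impurity of each subset, weighted by the fraction of the node's instances it holds. The function names and class counts are hypothetical.

```python
def gini_impurity(class_counts):
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

def cart_split_cost(left_counts, right_counts):
    # Number of instances in each subset and in the node being split.
    m_left = sum(left_counts)
    m_right = sum(right_counts)
    m = m_left + m_right
    # Impurity of each subset, weighted by its share of the instances.
    return (m_left / m) * gini_impurity(left_counts) + (m_right / m) * gini_impurity(right_counts)

perfect_split = cart_split_cost([5, 0], [0, 5])  # both subsets are pure
poor_split = cart_split_cost([4, 1], [1, 4])     # both subsets stay mixed

print(perfect_split)
print(poor_split)
```

CART would prefer the split with the lower cost, which here is the one that produces the purest subsets.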

26. Once the CART algorithm has successfully split the training set in two

It stops splitting further
It splits the subsets using the same logic, then the sub-subsets and so on, recursively
It splits only the right subset
It splits only the left subset

Ans: 2

27. The CART algorithm stops recursion once it reaches the maximum depth (defined by the max_depth hyperparameter), or if it cannot find a split that will reduce impurity

True
False

Ans: 1

28. Which of the following are correct for the CART algorithm

It is a greedy algorithm
It greedily searches for an optimum split at each level
It does not check whether or not the split will lead to the lowest possible impurity several levels down
All of the above are correct

Ans: 4

29. While making a prediction in Decision Tree, each node only requires checking the value of one feature

True
False

Ans: 1

30. Gini impurity is slightly faster to compute in comparison to entropy

True
False

Ans: 1

31. Models like Decision Trees are often called nonparametric models because

They do not have any parameters
The number of parameters is not determined prior to training
They have fewer parameters compared to other models
They are easy to interpret and understand

Ans: 2
