Ensemble Learning MCQs
 
1. A model that consists of a group of predictors is called a(n)
 -  Group
 
 -  Entity
 
 -  Ensemble
 
 -  Set
 
Ans: C
2. A Random Forest is an ensemble of Decision Trees
 -  True
 
 -  False
 
Ans: A
3. The steps involved in deciding the output of a Random Forest are
 - Obtain the predictions of all individual trees
 
 - Predict the class that gets the most votes
 
 - Both of the above
 
 - None
 
Ans: C
4. A hard voting classifier takes into consideration
 -  The probabilities of output from each classifier
 
 -  The majority votes from the classifiers
 
 -  The mean of the output from each classifier
 
 -  The sum of the output from each classifier
 
Ans: B
5. If each classifier is a weak learner, the ensemble can still be a strong learner?
 -  True
 
 -  False
 
Ans: A
6. Ensemble methods work best when the predictors are
 -  Sufficiently diverse
 
 -  As independent from one another as possible
 
 -  Making very different types of errors
 
 -  All of the above
 
Ans: D
7. To get diverse classifiers, we cannot train them using different algorithms
 -  True
 
 -  False
 
Ans: B
8. Training
the classifiers in an ensemble using very different algorithms increases the
chance that they will make very different types of errors, improving the
ensemble’s accuracy
 -  True
 
 -  False
 
Ans: A
9. When we consider only the majority of the outputs from the classifiers, it is called
 - Hard Voting
 
 - Soft Voting
 
 - Both
 
 - None
 
Ans: A
10. Soft voting takes into consideration
 - The majority of votes from the classifiers
 
 - The highest class probability averaged over all the individual classifiers
 
 - Both
 
 - None of the above
 
Ans: B
11. In soft voting, the predicted class is the class
with the highest class probability, averaged over all the individual
classifiers
 -  True
 
 -  False
 
Ans: A
12. Soft voting often achieves higher performance than hard voting because
 -  Majority votes classifications are often wrong
 
 -  It gives more weight to highly confident votes
 
 -  Finding majority is computationally expensive
 
 -  This statement is false
 
Ans: B
13. The parameter which decides the voting method in
a VotingClassifier is
 -  method
 
 -  strategy
 
 -  voting
 
 -  Type
 
Ans: C
14. The parameter which holds the list of classifiers
which are to be used in the voting classifier is
 -  predictors
 
 -  classifiers
 
 -  estimators
 
 -  Models
 
Ans: C
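A minimal sketch tying questions 13 and 14 together (assuming scikit-learn's VotingClassifier; the dataset and base classifiers below are purely illustrative):

```python
# Sketch: `estimators` holds the list of predictors, `voting` picks the method.
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

voting_clf = VotingClassifier(
    estimators=[                         # list of (name, classifier) pairs
        ("lr", LogisticRegression()),
        ("rf", RandomForestClassifier()),
        ("svc", SVC(probability=True)),  # probabilities needed for soft voting
    ],
    voting="soft",                       # "hard" = majority vote, "soft" = averaged probabilities
)
voting_clf.fit(X, y)
```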
15. One way to get a diverse set of classifiers is to
use the same training algorithm for every predictor, but to train them on
different random subsets of the training set
 -  True
 
 -  False
 
Ans: A
16. When sampling is performed with replacement, the method is
 - Bagging
 
 - Pasting
 
 - Both
 
 - None
 
Ans: A
17. When sampling is performed without replacement,
it is called
 - Pasting
 
 - Bagging
 
 - Both
 
 - None
 
Ans: A
18. Both bagging and pasting allow training instances
to be sampled several times across multiple predictors, but only bagging allows
training instances to be sampled several times for the same predictor
 -  True
 
 -  False
 
Ans: A
19. In bagging/pasting, the sampling and training of the predictors can all be done in parallel, via different CPU cores or even different servers
 -  True
 
 -  False
 
Ans: A
20. To use the bagging method, the value of the bootstrap parameter in the BaggingClassifier should be set to
 -  True
 
 -  False
 
Ans: A
21. To use the pasting method, the value of the
bootstrap parameter in the BaggingClassifier should be set to
 -  True
 
 -  False
 
Ans: B
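A short sketch of questions 20 and 21 (assuming scikit-learn's BaggingClassifier; the hyperparameter values are illustrative):

```python
# Sketch: bootstrap=True -> bagging (with replacement),
#         bootstrap=False -> pasting (without replacement).
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1)    # bagging, trained in parallel

paste_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=False, n_jobs=-1)   # pasting
```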
22. Overall, bagging often results in better models
 -  True
 
 -  False
 
Ans: A
23. If the size of the training set is m, how many training instances does the BaggingClassifier sample with replacement for each predictor?
 -  m/2
 
 -  m/3
 
 -  m
 
 -  m-n where n is the number of features
 
Ans: C
24. With bagging, it is not possible that some
instances are never sampled
 -  True
 
 -  False
 
Ans: B
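A quick numeric sketch of questions 23 and 24: drawing m bootstrap samples from m instances leaves roughly 37% of the instances unsampled (the values below are illustrative):

```python
# Sketch: bootstrap sampling m-out-of-m leaves about 1/e ~ 37% of instances out.
import numpy as np

m = 10_000
rng = np.random.default_rng(42)
sampled = rng.integers(0, m, size=m)            # m draws with replacement
never_sampled = m - np.unique(sampled).size
print(never_sampled / m)                        # roughly 0.37
```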
25. Features can also be sampled in the
BaggingClassifier
 -  True
 
 -  False
 
Ans: A
26. The hyperparameters which control the feature
sampling are
 -  max_samples and bootstrap
 
 -  max_features and bootstrap_features
 
 - Both
 
 - None 
 
Ans: B
27. Sampling both training instances and features is
called the
 - Random Patches method
 
 - Random Subspaces method
 
 - Both
 
 - None
 
Ans: A
28. Keeping all training instances (i.e.,
bootstrap=False and max_samples=1.0) but sampling features (i.e.,
bootstrap_features=True and/or max_features smaller than 1.0) is called the
 - Random Patches method
 
 - Random Subspaces method
 
 - Both
 
 - None
 
Ans: B
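A sketch of questions 26-28 (assuming scikit-learn's BaggingClassifier; the exact hyperparameter values are illustrative):

```python
# Sketch: Random Subspaces = keep all instances, sample only features;
#         Random Patches   = sample both instances and features.
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

subspaces_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    bootstrap=False, max_samples=1.0,             # keep all training instances
    bootstrap_features=True, max_features=0.5)    # sample features

patches_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    bootstrap=True, max_samples=0.7,              # sample instances as well
    bootstrap_features=True, max_features=0.5)
```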
29. A Random Forest is an ensemble of Decision Trees generally trained via ______
 - Bagging
 
 - Pasting
 
 - Both
 
 - None
 
Ans: A
30. We can make the trees of a Random Forest even
more random by using random thresholds for each feature rather than searching
for the best possible thresholds?
 -  No
 
 -  Yes, and such an ensemble is called an Extremely Randomised Trees (Extra-Trees) ensemble
 
Ans: B
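A sketch of questions 29 and 30 (assuming scikit-learn; hyperparameters are illustrative):

```python
# Sketch: a Random Forest (bagged Decision Trees) next to an Extremely
# Randomized Trees ensemble, which uses random split thresholds per feature.
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1)
ext_clf = ExtraTreesClassifier(n_estimators=500, max_leaf_nodes=16, n_jobs=-1)
```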
31. If we look at a single Decision Tree, important
features are likely to appear closer to
 - Leaf of the tree
 
 - Middle of the tree
 
 - Root of the tree
 
 - None of these
 
Ans: C
32. Feature importances are available via the feature_importances_ attribute of a RandomForestClassifier object.
 -  True
 
 -  False
 
Ans: A
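A sketch of question 32 (assuming scikit-learn; the iris dataset is used only for illustration):

```python
# Sketch: reading feature importances from a trained RandomForestClassifier.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rnd_clf = RandomForestClassifier(n_estimators=500, n_jobs=-1)
rnd_clf.fit(iris["data"], iris["target"])
for name, score in zip(iris["feature_names"], rnd_clf.feature_importances_):
    print(name, score)   # important features get higher scores
```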
33. The general idea of most boosting methods is to
train predictors sequentially, each trying to correct its predecessor.
 -  True
 
 -  False
 
Ans: A
34. One drawback of the AdaBoost classifier is that
 -  It is slow
 
 -  It cannot be parallelized
 
 -  It cannot be performed on larger training sets
 
 -  It requires a lot of memory and processing power
 
Ans: B
35. A Decision Stump is a Decision Tree with
 - More than two leaf nodes
 
 - Max depth of 1, i.e. single decision node with two leaf nodes
 
 - Having more than 2 decision nodes
 
 - None
 
Ans: B
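A sketch of questions 34 and 35 (assuming scikit-learn's AdaBoostClassifier; hyperparameters are illustrative):

```python
# Sketch: AdaBoost over Decision Stumps (max_depth=1 trees); because each
# predictor corrects its predecessor, training is inherently sequential.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),   # a Decision Stump
    n_estimators=200, learning_rate=0.5)
```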
36. In Gradient Boosting, instead of tweaking the instance weights at every iteration like AdaBoost does, each new predictor is fit to the residual errors made by the previous predictor.
 -  True
 
 -  False
 
Ans: A
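A sketch of the idea in question 36: each new tree is fit to the residual errors of the ensemble built so far (the data here is illustrative, and this is not the library's internal implementation):

```python
# Sketch: manual "gradient boosting" with three regression trees.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-0.5, 0.5, size=(100, 1))
y = 3 * X[:, 0] ** 2 + 0.05 * rng.normal(size=100)

tree1 = DecisionTreeRegressor(max_depth=2).fit(X, y)
y2 = y - tree1.predict(X)                       # residuals of the first tree
tree2 = DecisionTreeRegressor(max_depth=2).fit(X, y2)
y3 = y2 - tree2.predict(X)                      # residuals of the second tree
tree3 = DecisionTreeRegressor(max_depth=2).fit(X, y3)

X_new = np.array([[0.2]])
y_pred = sum(tree.predict(X_new) for tree in (tree1, tree2, tree3))
```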
37. The learning_rate hyperparameter of
GradientBoostingRegressor scales the contribution of each tree?
 -  True
 
 -  False
 
Ans: A
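A sketch of question 37 (assuming scikit-learn's GradientBoostingRegressor; values are illustrative):

```python
# Sketch: learning_rate scales each tree's contribution (shrinkage);
# lower values typically need more trees but can generalize better.
from sklearn.ensemble import GradientBoostingRegressor

gbrt = GradientBoostingRegressor(max_depth=2, n_estimators=120, learning_rate=0.1)
```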
38. The ensemble method in which we train a model to perform the aggregation of outputs from all the predictors is called
 -  Boosting
 
 -  Bagging
 
 -  Stacking
 
 -  Pasting
 
Ans: C
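A sketch of question 38 (assuming scikit-learn's StackingClassifier; the base models and blender are illustrative):

```python
# Sketch: stacking trains a final model (the blender / meta-learner) to
# aggregate the base predictors' outputs instead of using a fixed vote.
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

stack_clf = StackingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("svc", SVC())],
    final_estimator=LogisticRegression())       # trained on base predictions
```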