
Machine Learning -3 Syllabus

 MACHINE LEARNING Syllabus:

UNIT-1

Introduction: Brief Introduction to Machine Learning, Abstraction and Knowledge Representation, Types of Machine Learning Algorithms, Definition of learning systems, Goals and applications of machine learning, Aspects of developing a learning system, Data Types, training data, concept representation, function approximation.

Data Pre-processing: Definition, Steps involved in pre-processing, Techniques

UNIT-2

Performance measurement of models: Accuracy, Confusion matrix, TPR, FPR, FNR, TNR, Precision, recall, F1-score, Receiver Operating Characteristic Curve (ROC) curve and AUC.

Supervised Learning 1: Linear Regression, Multiple Variable Linear Regression, Naïve Bayes Classifiers, Gradient Descent, Multicollinearity, Bias-Variance trade-off.

UNIT-3

Supervised Learning 2: Regularization, Logistic Regression, Squashing function, KNN, Support Vector Machine.

Decision Tree Learning: Representing concepts as decision trees, recursive induction of decision trees, picking the best splitting attribute: entropy and information gain, searching for simple trees and computational complexity, Occam's razor, overfitting, noisy data, and pruning. Decision Trees: ID3, CART, error bounds.

 

UNIT-4

Unsupervised Learning: K-Means, Customer Segmentation, Hierarchical clustering, DBSCAN, Anomaly Detection, Local Outlier Factor, Isolation Forest, Dimensionality Reduction, PCA, GMM, Expectation Maximization.

UNIT-5

Ensemble Models: Ensemble Definition, Bootstrapped Aggregation (Bagging) Intuition, Random Forests and their construction, Extremely randomized trees, Gradient Boosting, Regularization by Shrinkage, XGBoost, AdaBoost.

 

TEXT BOOKS:

1. Tom M. Mitchell, “Machine Learning”, McGraw-Hill, 1997.

2. Ethem Alpaydin, “Introduction to Machine Learning”, Third Edition, MIT Press / Prentice Hall of India, 2014.

3. Trevor Hastie, Robert Tibshirani, and Jerome Friedman, “The Elements of Statistical Learning”, Springer-Verlag, 2001.

REFERENCES:

1. Saikat Dutt, Subramanian Chandramouli, and Amit Kumar Das, “Machine Learning”, Pearson, 2019.

2. Stephen Marsland, “Machine Learning: An Algorithmic Perspective”, Second Edition, Chapman and Hall/CRC Machine Learning and Pattern Recognition Series, 2014.

3. Application of machine learning in industries (IBM ICE Publications).

e-Resources:

1. Andrew Ng, “Machine Learning Yearning”, https://www.deeplearning.ai/machine-learning

2. Shai Shalev-Shwartz and Shai Ben-David, “Understanding Machine Learning: From Theory to Algorithms”, Cambridge University Press, https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/index.html

Ensemble Learning MCQs

1.     A model that consists of a group of predictors is called a(n)

  1.  Group
  2.  Entity
  3.  Ensemble
  4.  Set

Ans: C

2.     A Random Forest is an ensemble of Decision Trees

  1.  True
  2.  False

Ans: A

3.     The steps involved in deciding the output of a Random Forest are

  1. Obtain the predictions of all individual trees
  2. Predict the class that gets the most votes
  3. Both of the above
  4. None

Ans: C

4.     A hard voting classifier takes into consideration

  1.  The probabilities of output from each classifier
  2.  The majority votes from the classifiers
  3.  The mean of the output from each classifier
  4.  The sum of the output from each classifier

Ans: B


5. If each classifier is a weak learner, can the ensemble still be a strong learner?

  1.  True
  2.  False

Ans: A

6. Ensemble methods work best when the predictors are

  1.  Sufficiently diverse
  2.  As independent from one another as possible
  3.  Making very different types of errors
  4.  All of the above

Ans: D

7. To get diverse classifiers we cannot train them using different algorithms

  1.  True
  2.  False

Ans: B

8. Training the classifiers in an ensemble using very different algorithms increases the chance that they will make very different types of errors, improving the ensemble’s accuracy

  1.  True
  2.  False

Ans: A

9. When only the majority vote of the classifiers' outputs is considered, it is called

  1. Hard Voting
  2. Soft Voting
  3. Both
  4. None

Ans: A

10. Soft voting takes into consideration

  1. The majority of votes from the classifiers
  2. The highest class probability averaged over all the individual classifiers
  3. Both
  4. None of the above

Ans: B

11. In soft voting, the predicted class is the class with the highest class probability, averaged over all the individual classifiers

  1.  True
  2.  False

Ans: A

12. Soft voting achieves higher performance than hard voting because

  1.  Majority votes classifications are often wrong
  2.  It gives more weight to highly confident votes
  3.  Finding majority is computationally expensive
  4.  This statement is false

Ans: B

13. The parameter which decides the voting method in a VotingClassifier is

  1.  method
  2.  strategy
  3.  voting
  4.  Type

Ans: C

14. The parameter which holds the list of classifiers which are to be used in the voting classifier is

  1.  predictors
  2.  classifiers
  3.  estimators
  4.  Models

Ans: C
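
A minimal sketch for questions 13 and 14 (assuming scikit-learn): the estimators parameter holds the list of (name, classifier) pairs and the voting parameter selects "hard" (majority vote) or "soft" (averaged class probabilities).

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

voting_clf = VotingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("dt", DecisionTreeClassifier()),
                ("svc", SVC(probability=True))],  # probability=True is needed for soft voting
    voting="soft")  # use voting="hard" for plain majority voting
voting_clf.fit(X, y)
print(voting_clf.predict(X[:5]))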

15. One way to get a diverse set of classifiers is to use the same training algorithm for every predictor, but to train them on different random subsets of the training set

  1.  True
  2.  False

Ans: A

16. When sampling is performed with replacement, the method is

A.     Bagging

B.     Pasting

C.     Both

D.     None

Ans: A

17. When sampling is performed without replacement, it is called

  1. Pasting
  2. Bagging
  3. Both
  4. None

Ans: A

18. Both bagging and pasting allow training instances to be sampled several times across multiple predictors, but only bagging allows training instances to be sampled several times for the same predictor

  1.  True
  2.  False

Ans: A

19. In bagging/pasting, training set sampling and training can happen in parallel: the predictors can all be trained simultaneously, via different CPU cores or even different servers

  1.  True
  2.  False

Ans: A

20. To use the bagging method, the value of the bootstrap parameter in the BaggingClassifier should be set to

  1.  True
  2.  False

Ans: A

21. To use the pasting method, the value of the bootstrap parameter in the BaggingClassifier should be set to

A.    True

B.    False

Ans: B
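
A minimal sketch for questions 20 and 21 (assuming scikit-learn): the bootstrap parameter of BaggingClassifier switches between bagging (sampling with replacement) and pasting (sampling without replacement); n_jobs=-1 trains the predictors in parallel on all CPU cores.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

bagging_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=100,
    bootstrap=True,           # sampling with replacement -> bagging
    n_jobs=-1, random_state=42)

pasting_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=100,
    bootstrap=False,          # sampling without replacement -> pasting
    max_samples=0.8, n_jobs=-1, random_state=42)

bagging_clf.fit(X, y)
pasting_clf.fit(X, y)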

22. Overall, bagging often results in better models

  1.  True
  2.  False

Ans: A

23. On how many training instances (sampled with replacement) does the BaggingClassifier train each predictor if the size of the training set is m?

  1.  m/2
  2.  m/3
  3.  m
  4.  m-n where n is the number of features

Ans: C

24. With bagging, it is not possible that some instances are never sampled

  1.  True
  2.  False

Ans: B
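
A minimal sketch for questions 23 and 24 (assuming scikit-learn): with bagging each predictor is trained on m instances sampled with replacement, so on average roughly 37% of the training instances are never sampled for a given predictor; setting oob_score=True evaluates the ensemble on those out-of-bag instances.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=200,
    bootstrap=True, oob_score=True, random_state=42)
bag_clf.fit(X, y)
print(bag_clf.oob_score_)  # accuracy estimated on the out-of-bag instances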

25. Features can also be sampled in the BaggingClassifier

  1.  True
  2.  False

Ans: A

26. The hyperparameters which control the feature sampling are

  1.  max_samples and bootstrap
  2.  max_features and bootstrap_features
  3. Both
  4. None

Ans: B

27. Sampling both training instances and features is called the

  1. Random Patches method
  2. Random Subspaces method
  3. Both
  4. None

Ans: A

28. Keeping all training instances (i.e., bootstrap=False and max_samples=1.0) but sampling features (i.e., bootstrap_features=True and/or max_features smaller than 1.0) is called the

  1. Random Patches method
  2. Random Subspaces method
  3. Both
  4. None

Ans: B
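
A minimal sketch for questions 26 to 28 (assuming scikit-learn): max_features and bootstrap_features control feature sampling; sampling both training instances and features gives the Random Patches method, while keeping all instances and sampling only features gives the Random Subspaces method.

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Random Patches: sample training instances AND features
patches_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=100,
    bootstrap=True, max_samples=0.7,
    bootstrap_features=True, max_features=0.5, random_state=42)

# Random Subspaces: keep all training instances, sample only features
subspaces_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=100,
    bootstrap=False, max_samples=1.0,
    bootstrap_features=True, max_features=0.5, random_state=42)

patches_clf.fit(X, y)
subspaces_clf.fit(X, y)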

29. Random forest is an ensemble of Decision Trees generally trained via ______

  1. Bagging
  2. Pasting
  3. Both
  4. None

Ans: A

30. Can we make the trees of a Random Forest even more random by using random thresholds for each feature rather than searching for the best possible thresholds?

  1.  No
  2.  Yes, and these are called Extremely Randomised Trees ensemble

Ans: B
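
A minimal sketch for questions 29 and 30 (assuming scikit-learn): a Random Forest is a bagged ensemble of Decision Trees, and an Extremely Randomized Trees ensemble goes one step further by using random split thresholds instead of searching for the best ones.

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=42)

forest_clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)  # bagged trees, best splits
extra_clf = ExtraTreesClassifier(n_estimators=100, random_state=42).fit(X, y)     # random split thresholds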

31. If we look at a single Decision Tree, important features are likely to appear closer to

  1. Leaf of the tree
  2. Middle of the tree
  3. Root of the tree
  4. None of these

Ans: C

32. Feature importances are available via the feature_importances_ attribute of the RandomForestClassifier object.

  1.  True
  2.  False

Ans: A
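
A minimal sketch for questions 31 and 32 (assuming scikit-learn): after fitting, the feature_importances_ attribute of a RandomForestClassifier scores each feature; features that tend to appear closer to the root of the trees generally receive higher scores.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf_clf = RandomForestClassifier(n_estimators=200, random_state=42)
rf_clf.fit(iris.data, iris.target)

for name, score in zip(iris.feature_names, rf_clf.feature_importances_):
    print(f"{name}: {score:.3f}")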

33. The general idea of most boosting methods is to train predictors sequentially, each trying to correct its predecessor.

  1.  True
  2.  False

Ans: A

34. One of the drawbacks of the AdaBoost classifier is that

  1.  It is slow
  2.  It cannot be parallelized
  3.  It cannot be performed on larger training sets
  4.  It requires a lot of memory and processing power

Ans: B

35. A Decision Stump is a Decision Tree with

  1. More than two leaf nodes
  2. Max depth of 1, i.e. single decision node with two leaf nodes
  3. Having more than 2 decision nodes
  4. None

Ans: B
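
A minimal sketch for questions 33 to 35 (assuming scikit-learn): AdaBoost trains predictors sequentially (which is why it cannot be parallelized), each new one paying more attention to the instances its predecessor misclassified; a common base learner is a decision stump, i.e. a tree with max_depth=1.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # decision stump: one decision node, two leaves
    n_estimators=200, learning_rate=0.5, random_state=42)
ada_clf.fit(X, y)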

36. Gradient Boosting, instead of tweaking the instance weights at every iteration like AdaBoost does, tries to fit each new predictor to the residual errors made by the previous predictor.

  1.  True
  2.  False

Ans: A

37. The learning_rate hyperparameter of GradientBoostingRegressor scales the contribution of each tree?

  1.  True
  2.  False

Ans: A
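
A minimal sketch for questions 36 and 37 (assuming scikit-learn): each new tree in Gradient Boosting is fit to the residual errors of the current ensemble, and learning_rate shrinks the contribution of every tree (regularization by shrinkage).

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(42)
X = rng.rand(200, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(200)  # noisy quadratic target

gbrt = GradientBoostingRegressor(
    max_depth=2, n_estimators=100,
    learning_rate=0.1,  # smaller values need more trees but usually generalize better
    random_state=42)
gbrt.fit(X, y)
print(gbrt.predict([[0.1]]))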

38. The ensemble method in which we train a model to perform the aggregation of outputs from all the predictors is called

  1.  Boosting
  2.  Bagging
  3.  Stacking
  4.  Pasting

Ans: C
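
A minimal sketch for question 38 (assuming scikit-learn): stacking trains a final model (the blender, or meta-learner) to aggregate the base predictors' outputs instead of using a simple vote.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=42)

stack_clf = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=42)),
                ("svc", SVC(random_state=42))],
    final_estimator=LogisticRegression())  # meta-learner trained on the base predictions
stack_clf.fit(X, y)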
