Machine Learning MCQs - 5
(Ensemble Models)
---------------------------------------------------------------------
1. A model that consists of a group of predictors is called a ______
Ans: 3
2. A Random Forest is an ensemble of Decision Trees
Ans: 1
3. The steps involved in deciding the output of a Random Forest are
- Obtain the predictions of all individual trees
- Predict the class that gets the most votes
- Both of the above
Ans: 3
4. A hard voting classifier takes into consideration
- The probabilities of output from each classifier
- The majority vote from the classifiers
- The mean of the outputs from each classifier
- The sum of the outputs from each classifier
Ans: 2
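For illustration, a minimal sketch of hard voting using scikit-learn's VotingClassifier; the choice of base estimators and the make_moons toy dataset are assumptions, not taken from the question.
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# voting='hard': each classifier casts one vote and the majority class wins
voting_clf = VotingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('rf', RandomForestClassifier(random_state=42)),
                ('svc', SVC(random_state=42))],
    voting='hard')
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))
```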
5. If each classifier is a weak learner, can the ensemble still be a strong learner?
Ans: 1
6. Ensemble methods work best when the predictors are
- Sufficiently diverse
- As independent from one another as possible
- Making very different types of errors
- All of the above
Ans: 4
7. To get diverse classifiers, we cannot train them using different algorithms
Ans: 2
8. Training the classifiers in an ensemble using very different algorithms increases the chance that they will make very different types of errors, improving the ensemble’s accuracy
Ans: 1
9. When we consider only the majority of the outputs from the classifiers, it is called ______
Ans: 1
10. Soft voting takes into consideration
- The majority of votes from the classifiers
- The highest class probability averaged over all the individual classifiers
Ans: 2
11. In soft voting, the predicted class is the class with the highest class probability, averaged over all the individual classifiers
Ans: 1
12. Soft voting often achieves higher performance than hard voting because
- Majority-vote classifications are often wrong
- It gives more weight to highly confident votes
- Finding the majority is computationally expensive
- This statement is false
Ans: 2
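A sketch of the soft-voting variant illustrating questions 10-12; as in the earlier example, the estimators and dataset are assumptions, and every base classifier must expose predict_proba (hence probability=True for the SVC).
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# voting='soft': average predict_proba over all classifiers and predict the
# class with the highest mean probability (confident votes weigh more)
soft_clf = VotingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('rf', RandomForestClassifier(random_state=42)),
                ('svc', SVC(probability=True, random_state=42))],
    voting='soft')
soft_clf.fit(X_train, y_train)
print(soft_clf.score(X_test, y_test))
```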
13. When sampling is performed with replacement, the method is called ______
Ans: 1
14. When sampling is performed without replacement, it is called ______
Ans: 1
15. Both bagging and pasting allow training instances to be sampled several times across multiple predictors, but only bagging allows training instances to be sampled several times for the same predictor
Ans: 1
16. In bagging/pasting, the sampling of training sets and the training of the predictors can all be performed in parallel, via different CPU cores or even different servers
Ans: 1
17. To use the bagging method, the value of the bootstrap parameter in the BaggingClassifier should be set to ______
Ans: 1
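A sketch covering questions 13-18: BaggingClassifier with bootstrap=True gives bagging (sampling with replacement), bootstrap=False gives pasting (without replacement), and n_jobs=-1 trains the predictors in parallel. The base estimator and dataset are assumptions.
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# bootstrap=True  -> bagging  (instances sampled WITH replacement)
# bootstrap=False -> pasting  (instances sampled WITHOUT replacement)
# n_jobs=-1       -> train the 500 trees in parallel on all CPU cores
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)
print(bag_clf.score(X_test, y_test))
```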
18. Overall, bagging often results in better models than pasting
Ans: 1
19. With bagging, it is not possible that some instances are never sampled
Ans: 2
20. Features can also be sampled in the BaggingClassifier
Ans: 1
21. The hyperparameters that control feature sampling are
- max_samples and bootstrap
- max_features and bootstrap_features
Ans: 2
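A sketch of the two feature-sampling hyperparameters; the wide make_classification dataset is an assumption used only to make feature sampling meaningful.
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumed toy dataset with many features, purely for illustration
X, y = make_classification(n_samples=1000, n_features=50, random_state=42)

# max_features / bootstrap_features control FEATURE sampling,
# just as max_samples / bootstrap control INSTANCE sampling
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=100,
    max_features=0.5,          # each predictor sees 50% of the features
    bootstrap_features=True,   # features sampled with replacement
    n_jobs=-1, random_state=42)
bag_clf.fit(X, y)
```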
22. A Random Forest is an ensemble of Decision Trees, generally trained via ______
Ans: 1
23. Can we make the trees of a Random Forest even more random by using random thresholds for each feature rather than searching for the best possible thresholds?
- No
- Yes, and such an ensemble is called an Extremely Randomised Trees (Extra-Trees) ensemble
Ans: 2
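A sketch comparing a Random Forest with an Extra-Trees ensemble (questions 22-23); the class names are standard scikit-learn, while the dataset and hyperparameter values are assumptions.
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Random Forest: bagging of trees, best split among a random feature subset
rnd_clf = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=42)
# Extra-Trees: additionally uses random thresholds instead of searching for
# the best threshold, making the trees even more random
ext_clf = ExtraTreesClassifier(n_estimators=500, n_jobs=-1, random_state=42)

for clf in (rnd_clf, ext_clf):
    clf.fit(X_train, y_train)
    print(clf.__class__.__name__, clf.score(X_test, y_test))
```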
24. If we look at a single Decision Tree, important features are likely to appear closer to
- The leaves of the tree
- The middle of the tree
- The root of the tree
Ans: 3
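Related sketch: a trained Random Forest exposes feature_importances_, which reflects how close to the root (on average) each feature appears; the iris dataset is assumed here for illustration.
```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rnd_clf = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=42)
rnd_clf.fit(iris.data, iris.target)

# Features used near the root of the trees get higher importance scores
for name, score in zip(iris.feature_names, rnd_clf.feature_importances_):
    print(name, round(score, 3))
```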
25. One of the drawbacks of the AdaBoost classifier is that
- It is slow
- It cannot be parallelized
- It cannot be performed on larger training sets
- It requires a lot of memory and processing power
Ans: 2
26. A Decision Stump is a Decision Tree with
- More than two leaf nodes
- A max depth of 1, i.e. a single decision node with two leaf nodes
- More than two decision nodes
Ans: 2
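A sketch of AdaBoost built on decision stumps (questions 25-26); the sequential re-weighting of instances is exactly why boosting cannot be parallelized the way bagging can. The dataset and hyperparameter values are assumptions.
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Each decision stump (max_depth=1) is trained sequentially, with instance
# weights increased on the examples the previous stump misclassified
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=200, learning_rate=0.5, random_state=42)
ada_clf.fit(X_train, y_train)
print(ada_clf.score(X_test, y_test))
```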
27. In Gradient Boosting, instead of tweaking the instance weights at every iteration as AdaBoost does, each new predictor is fit to the residual errors made by the previous predictor.
Ans: 1
28. The learning_rate hyperparameter of GradientBoostingRegressor scales the contribution of each tree
Ans: 1
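A sketch showing both ideas from questions 27-28: fitting each new tree to the previous trees' residuals by hand, then the equivalent GradientBoostingRegressor whose learning_rate scales each tree's contribution. The quadratic toy data is an assumption.
```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.rand(100, 1) - 0.5
y = 3 * X[:, 0] ** 2 + 0.05 * rng.randn(100)

# Manual gradient boosting: each tree is fit to the residual errors left
# by the sum of the previous trees (no instance re-weighting as in AdaBoost)
trees, residual = [], y
for _ in range(3):
    tree = DecisionTreeRegressor(max_depth=2, random_state=42)
    tree.fit(X, residual)
    residual = residual - tree.predict(X)
    trees.append(tree)
y_pred_manual = sum(tree.predict(X) for tree in trees)

# Same idea via GradientBoostingRegressor; a lower learning_rate shrinks each
# tree's contribution, so more trees are needed (shrinkage)
gbrt = GradientBoostingRegressor(max_depth=2, n_estimators=3,
                                 learning_rate=1.0, random_state=42)
gbrt.fit(X, y)
```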
29. The ensemble method in which we train a model to perform the aggregation of the outputs from all the predictors is called
- Boosting
- Bagging
- Stacking
- Pasting
Ans: 3
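A sketch of stacking with scikit-learn's StackingClassifier, where a final estimator (the blender) is itself trained to aggregate the base predictors' outputs; the particular estimators and dataset are assumptions.
```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The final_estimator (blender) is trained on the base predictors'
# cross-validated predictions and performs the aggregation
stack_clf = StackingClassifier(
    estimators=[('rf', RandomForestClassifier(random_state=42)),
                ('svc', SVC(probability=True, random_state=42))],
    final_estimator=LogisticRegression())
stack_clf.fit(X_train, y_train)
print(stack_clf.score(X_test, y_test))
```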