Machine Learning MCQs - 4
(Clustering, Dimensionality Reduction)
---------------------------------------------------------------------
1. Which of the following is finally produced by Hierarchical Clustering?
- final estimate of cluster centroids
- tree showing how close things are to each other
- assignment of each point to clusters
- all of the mentioned
Ans: 2
2. Which of the following is required by K-means clustering?
- defined distance metric
- number of clusters
- initial guess as to cluster centroids
- all of the mentioned
Ans: 4
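The three inputs in this answer map directly onto code. A minimal sketch in Python with scikit-learn (an assumed library choice; the quiz itself never names one):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(100, 2)          # toy data

# K-means needs: a distance metric (Euclidean is built into the
# algorithm), the number of clusters, and initial centroids.
init_centroids = X[:3]               # initial guess: first 3 points
km = KMeans(n_clusters=3, init=init_centroids, n_init=1).fit(X)
print(km.cluster_centers_)           # final centroid estimates
```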
3. Point out the wrong statement.
- k-means clustering is a method of vector quantization
- k-means clustering aims to partition n observations into k clusters
- k-nearest neighbor is the same as k-means
- none of the mentioned
Ans: 3
4. Which of the following combinations is incorrect?
- Continuous – Euclidean distance
- Continuous – correlation similarity
- Binary – Manhattan distance
- None of the mentioned
Ans: 4
5. Hierarchical clustering should be primarily used for exploration.
- True
- False
Ans: 1
6. Which of the following functions is used for k-means clustering?
- k-means
- k-mean
- heatmap
- none of the mentioned
Ans: 1
7. Which of the following clustering methods requires a merging approach?
- Partitional
- Hierarchical
- Naive Bayes
- None of the mentioned
Ans: 2
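Agglomerative (bottom-up) hierarchical clustering starts with every point as its own cluster and repeatedly merges the two closest clusters. A sketch of that merge record, assuming scipy is available:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.random.rand(10, 2)
# Each row of Z records one merge: the two cluster indices joined,
# the distance at which they merged, and the new cluster's size.
Z = linkage(X, method='single')
print(Z)                    # 9 merges reduce 10 points to 1 cluster
```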
8. K-means is not deterministic, and it also involves a number of iterations.
- True
- False
Ans: 1
9. Which of the following can act as possible termination conditions in K-means? (See the sketch after this question.)
1. A fixed number of iterations has been completed.
2. Assignment of observations to clusters does not change between iterations (except for cases with a bad local minimum).
3. Centroids do not change between successive iterations.
4. RSS falls below a threshold.
Options:
- 1, 3 and 4
- 1, 2 and 3
- 1, 2 and 4
- All of the above
Ans: 4
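All four conditions can be wired into a hand-rolled K-means loop. A minimal numpy sketch (the threshold values are arbitrary assumptions, and empty clusters are not handled, for brevity):

```python
import numpy as np

def kmeans(X, k, max_iter=100, rss_threshold=1e-4):
    rng = np.random.default_rng(0)
    centroids = X[rng.choice(len(X), k, replace=False)]
    labels = None
    for _ in range(max_iter):                    # 1: fixed number of iterations
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                                # 2: assignments unchanged
        labels = new_labels
        new_centroids = np.array(
            [X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break                                # 3: centroids unchanged
        centroids = new_centroids
        rss = ((X - centroids[labels]) ** 2).sum()
        if rss < rss_threshold:
            break                                # 4: RSS below a threshold
    return labels, centroids
```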
10. Which of the following clustering algorithms suffers from the problem of convergence at local optima?
1. K-means clustering algorithm
2. Agglomerative clustering algorithm
3. Expectation-Maximization clustering algorithm
4. Diverse clustering algorithm
Options:
- 1 only
- 2 and 3
- 2 and 4
- 1 and 3
Ans: 4
11. What could be the possible reason(s) for producing two different dendrograms when using the agglomerative clustering algorithm on the same dataset?
- Proximity function used
- Number of data points used
- Number of variables used
- All of the above
Ans: 4
12. In the figure below, if you draw a horizontal line on the y-axis at y = 2, what will be the number of clusters formed?
[Figure missing: dendrogram]
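The figure itself is missing here, but the mechanics of the question can still be reproduced: cutting a dendrogram with a horizontal line at height y = 2 corresponds to scipy's `fcluster` with `criterion='distance'` (a sketch on toy data, not the quiz's figure):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.rand(12, 2) * 5
Z = linkage(X, method='complete')
# A horizontal cut at y = 2: the clusters are exactly the subtrees
# whose merge distance lies below the threshold.
labels = fcluster(Z, t=2, criterion='distance')
print(len(set(labels)), "clusters at height 2")
```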
13. In which of the following cases will K-means clustering fail to give good results?
1. Data points with outliers
2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes
Options:
- 1 and 2
- 2 and 3
- 2 and 4
- 1, 2 and 4
Ans: 4
14. Which of the following metrics do we have for finding dissimilarity between two clusters in hierarchical clustering?
1. Single-link
2. Complete-link
3. Average-link
Options:
- 1 and 2
- 1 and 3
- 2 and 3
- 1, 2 and 3
Ans: 4
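All three dissimilarity metrics correspond directly to the `method` argument of scipy's `linkage` (a sketch, assuming scipy):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.random.rand(20, 2)
for method in ("single", "complete", "average"):
    Z = linkage(X, method=method)
    # The last row of Z holds the final merge; its distance differs
    # per method: nearest pair (single), farthest pair (complete),
    # or mean pairwise distance (average) between the two clusters.
    print(method, Z[-1, 2])
```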
15. What is true about K-means clustering?
1. K-means is extremely sensitive to cluster center initialization
2. Bad initialization can lead to poor convergence speed
3. Bad initialization can lead to bad overall clustering
Options:
- 1 and 3
- 1 and 2
- 2 and 3
- 1, 2 and 3
Ans: 4
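This sensitivity is why scikit-learn restarts K-means several times and defaults to k-means++ seeding. A sketch comparing a single random initialization with ten k-means++ restarts (inertia is the within-cluster RSS; exact numbers depend on the random seeds):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# One random initialization: may converge to a poor local optimum.
bad = KMeans(n_clusters=4, init='random', n_init=1, random_state=1).fit(X)
# k-means++ seeding with 10 restarts: keeps the best of the 10 runs.
good = KMeans(n_clusters=4, init='k-means++', n_init=10, random_state=1).fit(X)
print(bad.inertia_, good.inertia_)   # lower inertia = tighter clustering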
16. Which of the following can be applied to get good results for the K-means algorithm, i.e., convergence toward the global minimum?
1. Run the algorithm with different centroid initializations
2. Adjust the number of iterations
3. Find out the optimal number of clusters
Options:
- 2 and 3
- 1 and 3
- 1 and 2
- All of the above
Ans: 4
17. Which of the following techniques would perform better for reducing the dimensions of a dataset?
- Removing columns which have too many missing values
- Removing columns which have high variance in data
- Removing columns with dissimilar data trends
- None of these
Ans: 1
18. Dimensionality reduction algorithms are one of the possible ways to reduce the computation time required to build a model.
- TRUE
- FALSE
Ans: 1
19. Which of the following algorithms cannot be used for reducing the dimensionality of data?
- t-SNE
- PCA
- LDA
- None of these
Ans: 4
20. PCA can be used for projecting and visualizing data in lower dimensions.
- TRUE
- FALSE
Ans: 1
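Q20's use case takes only a couple of lines in scikit-learn. A sketch projecting the 4-dimensional iris data down to 2-D for plotting (matplotlib assumed):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)
X2 = PCA(n_components=2).fit_transform(X)   # project 4-D data to 2-D
plt.scatter(X2[:, 0], X2[:, 1], c=y)
plt.xlabel("PC1"); plt.ylabel("PC2")
plt.show()
```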
21. The most popularly used dimensionality reduction algorithm is Principal Component Analysis (PCA). Which of the following is/are true about PCA?
1. PCA is an unsupervised method
2. It searches for the directions in which the data have the largest variance
3. The maximum number of principal components <= number of features
4. All principal components are orthogonal to each other
Options:
- 1 and 2
- 1 and 3
- 2 and 3
- All of the above
Ans: 4
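All four statements can be checked numerically. A sketch with scikit-learn's PCA on random data:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(50, 5)
pca = PCA().fit(X)                 # unsupervised: no labels passed (1)

# Components are sorted by the variance they capture (2).
print(pca.explained_variance_)     # non-increasing
# At most n_features components exist (3).
print(pca.components_.shape)       # (5, 5)
# All components are mutually orthogonal (4).
G = pca.components_ @ pca.components_.T
print(np.allclose(G, np.eye(5)))   # True
```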
22. Suppose we are using dimensionality reduction as a pre-processing technique, i.e., instead of using all the features, we reduce the data to k dimensions with PCA and then use these PCA projections as our features. Which of the following statements is correct?
- Higher 'k' means more regularization
- Higher 'k' means less regularization
- Can't say
Ans: 2
23. What will happen when the eigenvalues are roughly equal?
- PCA will perform outstandingly
- PCA will perform badly
- Can't say
- None of the above
Ans: 2
24. PCA works better if there is:
1. A linear structure in the data
2. Data that lies on a curved surface and not on a flat surface
3. Variables that are scaled in the same unit
Options:
- 1 and 2
- 2 and 3
- 1 and 3
- 1, 2 and 3
Ans: 3
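Condition 3 (variables on the same scale) is usually enforced by standardizing before PCA. A sketch with an assumed scikit-learn pipeline:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

X = np.random.rand(100, 3) * [1, 100, 10000]   # wildly different units
# Without scaling, the highest-variance column dominates the PCs;
# standardizing puts all variables on a comparable (unit) scale.
pipe = make_pipeline(StandardScaler(), PCA(n_components=2))
X2 = pipe.fit_transform(X)
```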
25. What happens when you get features in lower dimensions using PCA?
1. The features will still have interpretability
2. The features will lose interpretability
3. The features must carry all information present in the data
4. The features may not carry all information present in the data
Options:
- 1 and 3
- 1 and 4
- 2 and 3
- 2 and 4
Ans: 4
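The "may not carry all information" part can be quantified via the explained-variance ratio. A sketch on the iris data:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2).fit(X)
# Fraction of total variance kept by the 2 retained components;
# whatever is missing from 1.0 is information discarded by PCA.
print(pca.explained_variance_ratio_.sum())   # roughly 0.98 for iris
```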
26. Which of the following option(s) is/are true?
1. You need to initialize parameters in PCA
2. You don't need to initialize parameters in PCA
3. PCA can be trapped in the local minima problem
4. PCA can't be trapped in the local minima problem
Options:
- 1 and 3
- 1 and 4
- 2 and 3
- 2 and 4
Ans: 4
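Both answers hold because PCA is a closed-form matrix decomposition, not an iterative search. A sketch computing the components directly via SVD with numpy:

```python
import numpy as np

X = np.random.rand(100, 4)
Xc = X - X.mean(axis=0)            # center the data
# PCA in closed form: the right singular vectors of the centered
# data are the principal components. No initialization, no
# iteration, hence no local-minima problem.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt                           # rows = principal components
explained_variance = S**2 / (len(X) - 1)  # variance along each PC
```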
27. Which of the following options are correct when you are applying PCA on an image dataset?
1. It can be used to effectively detect deformable objects.
2. It is invariant to affine transforms.
3. It can be used for lossy image compression.
4. It is not invariant to shadows.
Options:
- 1 and 2
- 2 and 3
- 3 and 4
- 1 and 4
Ans: 3
28. Which of the following is untrue regarding the Expectation-Maximization algorithm?
- An initial guess is made as to the location and size of the site of interest in each of the sequences, and these parts of the sequences are aligned
- The alignment provides an estimate of the base or amino acid composition of each column in the site
- The column-by-column composition of the site already available is used to estimate the probability of finding the site at any position in each of the sequences
- The row-by-column composition of the site already available is used to estimate the probability
Ans: 4
29. Of the two repeated steps in the EM algorithm, step 2 is ________.
- the maximization step
- the minimization step
- the optimization step
- the normalization step
Ans: 1
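Questions 28-29 describe EM in the context of sequence-motif finding, but the same two-step structure (expectation, then maximization) drives Gaussian mixture fitting. A sketch with scikit-learn, offered only as an illustration of the E/M alternation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.concatenate([np.random.normal(0, 1, (100, 1)),
                    np.random.normal(5, 1, (100, 1))])
# Internally, each EM iteration alternates:
#   E-step: estimate each point's responsibility under each component
#   M-step: re-estimate (maximize) means, covariances, and weights
gm = GaussianMixture(n_components=2, max_iter=100).fit(X)
print(gm.means_.ravel())            # approximately [0, 5]
```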
30. In the intermediate steps of the EM algorithm, the number of each base in each column is determined and then converted to fractions.
- True
- False
Ans: 1