Machine Learning - Deep Learning: Intro to Hypothesis Testing

Intro to Hypothesis Testing

Q1. Framework for weight loss

A group of people have volunteered to try out a diet for weight loss for 3 months.

How should the null and alternate hypotheses be set up?

A. H0: Diet increases weight; Ha: Diet has no impact on weight

B. H0: Diet reduces weight; Ha: Diet has no impact on weight

C. H0: Diet has no impact on weight; Ha: Diet reduces weight

D. H0: Diet has no impact on weight; Ha: Diet increases weight

Correct Answer: H0: Diet has no impact on weight; Ha: Diet reduces weight

Explanation:

In hypothesis testing,

H0 (the null hypothesis) typically represents the status quo or no effect, while Ha (the alternative hypothesis) represents the effect or change you want to test.
In this case, the null hypothesis (H0) assumes that the diet has no impact on weight, while the alternative hypothesis (Ha) suggests that the diet reduces weight.
This is the most appropriate setup to test whether the diet has an effect on weight loss, since we need evidence to conclude that the diet does in fact aid weight loss.

Q2. Marketing the shampoo brand

Weekly sales of shampoo bottles average 1800. A marketing company feels that this can be improved with the right advertisement and promotions.

What should the null and alternate hypotheses be, in order to validate their claim?

Let μ denote the average sales after marketing.

i). H0:μ≤1800 and Ha:μ=1800

ii). H0:μ<1800 and Ha:μ≥1800

iii). H0:μ=1800 and Ha:μ≠1800

iv). H0:μ=1800 and Ha:μ>1800

A. i)

B. ii)

C. iii)

D. iv)

Correct option: H0:μ=1800 and Ha:μ>1800

Explanation:

The default assumption should be that the marketing has no effect, and the claim of the marketing company should appear in the alternate hypothesis. We therefore set up the null hypothesis as H0:μ=1800.
Now the claim is that the sales will improve. Thus, the alternate hypothesis here should be Ha:μ>1800.
In this scenario, the burden of proof is on the marketing company to show evidence suggesting that advertisement and promotions can in fact increase average shampoo sales.
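
As a rough illustration of how the setup H0:μ=1800 against Ha:μ>1800 could be tested, here is a minimal sketch of a right-tailed one-sample z-test. The function name, the sample mean, the sample size and the standard deviation are all hypothetical and do not come from the question; the sketch also assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_test(sample_mean, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 against Ha: mu > mu0."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Right-tailed: probability of a z at least this large if H0 holds.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical numbers (assumptions, not from the question):
# 50 weeks observed after the campaign, mean weekly sales 1850,
# known standard deviation of 200 bottles.
z, p_value = right_tailed_z_test(sample_mean=1850, mu0=1800, sigma=200, n=50)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value here would count as evidence for the marketing company's claim; a large one would leave the default assumption standing.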

Q3. Should we build the gymnasium?

A software company is planning to build a gymnasium for its employees. But before they plan to put their idea into action, they would like to know the interest of their employees.

They plan to survey a sample of their employees to see if there is strong evidence that more than 45% of the employees are interested, in which case they will consider building the gymnasium.

The hypotheses the company is using are:

H₀ : There is not enough evidence to suggest that more than 45% of employees are interested in a gymnasium (i.e. at most 45% are interested).

H_a : There is enough evidence to suggest that more than 45% of employees are interested in a gymnasium.

Which of the following would be a Type II error here?

A. More than 45% are actually interested, and it's not concluded that more than 45% are interested.

B. More than 45% are actually interested, and it's concluded that more than 45% are interested.

C. At most 45% are actually interested, and it's concluded that more than 45% are interested.

D. At most 45% are actually interested, and they concluded that less than 45% are interested.

HINT-I

Type I error occurs when we reject a true null hypothesis, and type II occurs when we fail to reject a false null hypothesis. Based on this, we can check which statement is actually making us fail to reject a false null hypothesis.

Correct option: More than 45% are actually interested, and it’s not concluded that more than 45% are interested.

Explanation:

Type I error occurs when we reject a true null hypothesis, and type II occurs when we fail to reject a false null hypothesis.

Based on this:

"More than 45% are actually interested, and it's not concluded that more than 45% are interested." is an example of a Type II error, since the null hypothesis is false, but we fail to reject it.

Similarly, "At most 45% are actually interested, and it's concluded that more than 45% are interested." is a Type I error, since null hypothesis is true, but we reject it.
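
The two definitions above can be written as a small lookup. This is only a sketch; the function name `classify_outcome` and its boolean arguments are illustrative choices, not part of the question.

```python
def classify_outcome(h0_is_true: bool, h0_rejected: bool) -> str:
    """Name the outcome of a test from the truth of H0 and the decision taken."""
    if h0_is_true and h0_rejected:
        return "Type I error"      # rejected a true null hypothesis
    if not h0_is_true and not h0_rejected:
        return "Type II error"     # failed to reject a false null hypothesis
    return "Correct decision"


# Gymnasium example, where H0 is "at most 45% are interested".
# More than 45% interested (H0 false), but this is not concluded (H0 not rejected):
print(classify_outcome(h0_is_true=False, h0_rejected=False))
# At most 45% interested (H0 true), but "more than 45%" is concluded (H0 rejected):
print(classify_outcome(h0_is_true=True, h0_rejected=True))
```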

Q4. Soft Drinks

A soft drink manufacturing company claims that the volume of drink in their bottles is 15 oz. A consumer group suspects the bottles are under‐filled and plans to conduct a test.

What is the Type I error in this situation?

A. The consumer group has evidence that the volume of the bottles is not 15 oz.

B. The consumer group does not conclude that the soft drink bottles have less than 15 oz. when the mean actually is less than 15 oz.

C. The consumer group concludes that the soft drink bottles have less than 15 oz. when the mean actually is 15 oz.

D. The consumer group has evidence that the claim is correct.

Correct option: The consumer group concludes that the soft drink bottles have less than 15 oz. when the mean actually is 15 oz.

Explanation:

Type I error occurs when we reject a true null hypothesis, and type II occurs when we fail to reject a false null hypothesis.

Here based on the question, we define our hypothesis as:

Null hypothesis: Mean volume of soft drink in bottle is 15 oz, i.e. μ=15
Alternate hypothesis: Mean volume of soft drink in bottle is less than 15 oz, i.e. μ < 15

Based on this, for the given situation, the Type 1 error is when the consumer group concludes that the soft drink bottles have less than 15 oz, when the mean actually is 15 oz.
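
To get a feel for how often this Type I error would happen, one can simulate bottles whose true mean really is 15 oz and count how often a left-tailed test wrongly concludes they are under-filled. The sketch below is an illustration only: the standard deviation, sample size, significance level and number of trials are assumed values, and the function name is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def estimate_type1_rate(mu0=15.0, sigma=0.5, n=30, alpha=0.05,
                        trials=5000, seed=0):
    """Estimate how often H0: mu = mu0 is rejected when it is actually true."""
    rng = random.Random(seed)
    # Left-tailed cutoff: reject H0 when z falls below this (negative) value.
    critical_z = NormalDist().inv_cdf(alpha)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 holds (true mean is mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / sqrt(n))
        if z < critical_z:
            rejections += 1  # concluded "under-filled" although the mean is 15 oz
    return rejections / trials


rate = estimate_type1_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

Under these assumptions the estimated rate should land close to the chosen significance level, which is the sense in which α bounds the Type I error probability.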

Q5. Other name?

A data scientist is working on a credit scoring model for a bank.

The goal is to determine if an applicant is creditworthy (has a low credit risk) or not creditworthy (has a high credit risk) based on various financial factors.

The bank has set a credit score threshold (significance level), and applicants above this threshold are considered creditworthy, while those below it are considered not creditworthy.

Consider the following scenarios:

Case A:

Reality: Applicant is not creditworthy.

Decision: The model correctly identifies the applicant as not creditworthy.

Case B:

Reality: Applicant is not creditworthy.

Decision: The model incorrectly classifies the applicant as creditworthy.

Case C:

Reality: Applicant is creditworthy.

Decision: The model incorrectly classifies the applicant as not creditworthy.

Case D:

Reality: Applicant is creditworthy.

Decision: The model correctly identifies the applicant as creditworthy.

What names can we give to these cases respectively?

A. True Negative, Type - 2 error, Type - 1 error, True Positive

B. True Positive, Type - 2 error, Type - 1 error, True Negative

C. Type - 2 error, True Positive, True Negative, Type - 1 error

D. True Negative, Type - 1 error, Type - 2 error, True Positive

Correct Option: True Negative, Type - 2 error, Type - 1 error, True Positive

Explanation:

We know that, as per the hypothesis testing framework, we follow:

if pvalue < threshold:
    Reject H0
else:
    Fail to reject H0

The question states that “applicants above this threshold are considered creditworthy, while those below it are considered not creditworthy”, i.e.

if pvalue < threshold:
    Not Creditworthy
else:
    Creditworthy

Hence, the question indirectly tells us that the hypotheses are to be set up as:

H0: Applicant is creditworthy
Ha: Applicant is not creditworthy

So, based on this,

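Case A: The model correctly identifies the applicant as not creditworthy, which is a True Negative.
Case B: The applicant is not creditworthy (H0 is false), but the model classifies them as creditworthy (fails to reject H0), which is a Type 2 error.
Case C: The applicant is creditworthy (H0 is true), but the model classifies them as not creditworthy (rejects H0), which is a Type 1 error.
Case D: The model correctly identifies the applicant as creditworthy, which is a True Positive.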
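
The four cases can be enumerated with a short sketch. It follows the setup given above (H0: Applicant is creditworthy) and, matching the stated answer, treats "creditworthy" as the positive class. The function name and the dictionary of cases are illustrative only.

```python
def name_case(actually_creditworthy: bool, predicted_creditworthy: bool) -> str:
    """Name a case, with H0 = 'applicant is creditworthy' (the positive class)."""
    if actually_creditworthy and predicted_creditworthy:
        return "True Positive"
    if not actually_creditworthy and not predicted_creditworthy:
        return "True Negative"
    if actually_creditworthy:
        # H0 is true, but the model rejected it (classified as not creditworthy).
        return "Type 1 error"
    # H0 is false, but the model failed to reject it (classified as creditworthy).
    return "Type 2 error"


# (reality, decision) for each case in the question; True means creditworthy.
cases = {
    "A": (False, False),
    "B": (False, True),
    "C": (True, False),
    "D": (True, True),
}
for label, (reality, decision) in cases.items():
    print(f"Case {label}: {name_case(reality, decision)}")
```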

Q6. Hypothesis and Conclusion

A data scientist working for an e-commerce company wants to determine whether the introduction of a new algorithm has led to an increase in the average order value of customer purchases.

After conducting an analysis of customer purchases before and after the introduction of the new algorithm, the data scientist obtains a p-value of 0.002.

The significance level (α) is set at 0.05.

What would be the appropriate hypotheses and conclusion to this situation?

a. H0: New algorithm has no effect on average order value.
Ha: New algorithm leads to a higher average order value.
Conclusion: Introduction of the new algorithm has led to a significant increase in the average order value.

b. H0: New algorithm has no effect on average order value.
Ha: New algorithm leads to a higher average order value.
Conclusion: Introduction of the new algorithm has not led to an increase in the average order value.

c. H0: New algorithm leads to a higher average order value.
Ha: New algorithm has no effect on average order value.
Conclusion: Introduction of the new algorithm has led to a significant increase in the average order value.

d. H0: New algorithm leads to a higher average order value.
Ha: New algorithm has no effect on average order value.
Conclusion: Introduction of the new algorithm has not led to an increase in the average order value.

A. a

B. b

C. c

D. d

Correct option: a

Explanation:

The default assumption is that the new algorithm has no effect, so that is the null hypothesis, and the claimed increase goes in the alternate hypothesis. Since the p-value of 0.002 is less than the significance level of 0.05, we reject the null hypothesis and conclude that the new algorithm has led to a significant increase in the average order value.

Q7. Airline passengers

From historical data, it is known that the mean weight of airline passengers with carry-on baggage is 175lb, and the standard deviation is 5lb.

Which of the following would be the most appropriate way to test the claim that the mean weight of airline passengers with carry-on baggage is at most 195lb, with a 95% confidence level?

A. Two tailed test

B. Left tailed test

C. Right tailed test

D. None of the above.

Correct option: Left tailed test

Explanation:

Let µ represent the mean weight of airline passengers with carry-on baggage.

Based on the given question, we define our hypotheses as:

H₀: µ ≥ 195 (greater than or equal to 195lb)
H₁: µ < 195 (less than 195lb)

Hence, based on the alternate hypothesis, we can see that our test would be one-directional, towards the left.

Q8. Appropriate test

I. Is there a difference in memory retention for individuals at the age of 20 compared to their memory at age 60?

II. Do people who take a daily vitamin live longer than people who don’t?

Which of the following would be the appropriate test for the above two statements?

A. I. Two tailed test, II. One tailed test.

B. I. Two tailed test, II. Two tailed test.

C. I. One tailed test, II. One tailed test.

D. I. One tailed test, II. Two tailed test.

Correct option: I. Two tailed test, II. One tailed test.

Explanation:

For statement I:

Based on the given statement, we define our hypotheses as:

H₀: (memory retention)_age=20 = (memory retention)_age=60
H₁: (memory retention)_age=20 ≠ (memory retention)_age=60

Hence, we would need to use a Two Tailed Test here.

For statement II:

Based on the given statement, we define our hypotheses as:

H₀: (Life Span)_{Daily Vitamin} = (Life Span)_{No Daily Vitamin}
H₁: (Life Span)_{Daily Vitamin} > (Life Span)_{No Daily Vitamin}

Hence, we would need to use a One Tailed Test here.
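
The choice of tail changes how a p-value is read off the test statistic. The sketch below shows this for a z statistic under a standard normal distribution; the function name `p_value_from_z` and the example value of z are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def p_value_from_z(z: float, tail: str) -> float:
    """Return the p-value of a z statistic for a left, right or two-tailed test."""
    cdf = NormalDist().cdf(z)
    if tail == "left":       # H1 uses "<"
        return cdf
    if tail == "right":      # H1 uses ">"
        return 1.0 - cdf
    if tail == "two":        # H1 uses "≠": extreme values on either side count
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = 1.8  # hypothetical test statistic
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(z, tail), 4))
```

For the same statistic, the two-tailed p-value is twice the smaller one-tailed p-value, which is why a two-tailed test needs stronger evidence to reject H₀ at the same significance level.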

Q1. Framework for GRE verbal reasoning

The verbal reasoning section in the GRE exam has an average score of 150 and a standard deviation of 8.5.

A coaching centre claims to improve these numbers for their students. How should the null and alternate hypotheses be set up?

A. H0: Coaching improves score; Ha: Coaching does not improve score

B. H0: Coaching reduces score; Ha: Coaching improves score

C. H0: Coaching does not improve score; Ha: Coaching reduces score

D. H0: Coaching does not improve score; Ha: Coaching improves score

Correct Answer: H0: Coaching does not improve score; Ha: Coaching improves score

Explanation:

We are given the average score of verbal reasoning section of GRE exam.

By default, we would assume that a student’s score conforms to the given distribution, irrespective of whether they are enrolled in coaching or not.

Hence, this becomes our Null Hypothesis.
i.e. H0: Coaching does not improve score

However, there is a coaching center that claims to improve the scores.

This means that they bear the burden of proof.
Hence, this becomes our Alternate hypothesis.
i.e. Ha: Coaching improves score.

Q2. Judge the right way

In a court case, the null hypothesis is that the defendant is innocent.

Identify the Type-2 error among the following.

A. The defendant is innocent, and the judge pronounces him innocent

B. The defendant is innocent, but the judge pronounces him guilty

C. The defendant is guilty, and the judge pronounces him guilty

D. The defendant is guilty, but the judge pronounces him innocent

Correct option: The defendant is guilty, but the judge pronounces him innocent

Explanation:

The null and alternate hypotheses are:

Null hypothesis (H₀) : the defendant is innocent

Alternate hypothesis (H_a): the defendant is guilty

Type-2 error is when the null hypothesis is false, but we fail to reject the null hypothesis (false negative).

Since the null hypothesis is that the defendant is innocent, the type 2 error occurs when defendant is guilty, but the judge wrongly pronounces him innocent.

Q3. Ride the bike

When you ride your bike, the null hypothesis H₀ is that the bike is safe to drive.

Which of these is a Type-1 error?

A. The bike is not safe, but you think it is safe

B. The bike is not safe, and you think it is not safe

C. The bike is safe, but you think it is not safe

D. The bike is safe, and you think it is safe

Correct option: The bike is safe, but you think it is not safe

Explanation:

Null hypothesis (H₀) : the bike is safe to drive

Alternate hypothesis (H_a): the bike is not safe to drive

Type-1 error is when the null hypothesis is true, but we reject the null hypothesis (false positive).

Since the null hypothesis is that the bike is safe, the type 1 error occurs when the bike is safe but we wrongly think that it is not safe.

Q4. Appropriate Conclusion

You perform a one-tailed hypothesis test with a significance level of 0.01.

If the p-value is 0.015, what is the appropriate conclusion?

A. Reject the null hypothesis

B. Fail to reject the null hypothesis

C. Inconclusive result

D. P value is not reliable enough to draw conclusions

Correct Option: Fail to reject the null hypothesis

Explanation:

In a hypothesis test, the p-value represents the probability of observing the data, or something more extreme, under the assumption that the null hypothesis is true.
When the p-value is less than or equal to the chosen significance level (α), it provides evidence to reject the null hypothesis in favor of the alternative hypothesis.
However, when the p-value is greater than the significance level, there is not enough evidence to reject the null hypothesis.
In this case, since the p-value is 0.015 and the significance level is 0.01, the appropriate conclusion is to “fail to reject the null hypothesis.”
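
The decision rule described above fits in a few lines. This is a minimal sketch; the function name `decide` is an illustrative choice, and the rule uses "less than or equal to" as stated in the explanation.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    if p_value <= alpha:
        return "Reject H0"
    return "Fail to reject H0"


# This question: p-value 0.015 against a significance level of 0.01.
print(decide(p_value=0.015, alpha=0.01))
# The order value question earlier: p-value 0.002 against a level of 0.05.
print(decide(p_value=0.002, alpha=0.05))
```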
