Hypothesis Testing
category_specifier : "Statistics"
Reference Docs: Linear Regression and Coefficient|Omitted Variable Bias | Standard Error
Motivation
- We want to test (or check) if our estimator for some parameter is significant
Use cases
- We want to check if mean of two groups are significantly different
Application: Is metric difference in two user group significantly different?
- We want to check if regression coefficient is statistically significant
Application:
- Prediction model context: Checking significance of the model coefficients.
- A/B testing (causal relationship) context: Checking if treatment effect coefficient is significant.
Framework
Hypothesis setup
- We have estimated value for the true value \(\beta_1\), as \(\beta_{null}\)
- We are checking if \(\beta_1\), and \(\hat{\beta_1}\) are close enough (Similar)
- Usually, when we are checking if treatment effect is significant (if \(\beta_1\) is not 0), we set \(\beta_{null} = 0\)
Interpretation
- Since we don’t have enough clue (backup) to believe \(\beta_1\) value is significant, we start by assuming H0 (Null Hypothesis).
- If there are significant level of clue to believe \(\beta_1\) is not 0, then we reject null and take H1 (Alternative hypothesis)
Get t-statistic and run T-test
- Implication: Normalize gap between true value (\(\beta_1\)), and hypothesized value (\(\beta_{null}\))
- Interpretation:
- If \(t\) is small, (or close to 0), it means hypothesized value (\(\beta_{null}\), or 0) is close to the true value. This means we have enough clue to take H0 (\(\beta_1\) is close to 0)
- If \(t\) is large enough, it means \(\beta_1\) is not likely to be 0, so we could reject null.
Significance level - deciding if \(t\) is ‘large enough’
- Pick Significance Level \(\alpha\): Probability of rejecting Null when it’s true
- Get a critical value: \(t^*_{N-2, \alpha/2}\) (1.96 when \(\alpha\) = 95%)
- Check if \(t\) value deviates critical value range: If \(t = |\frac{b_1 - \beta^1}{s_{b1}}| >t^*_{N-2, \alpha/2}\), then \(t\) Is ‘large enough’** so we can reject null.
P-Value: ‘How small’ T should be?
- P value: \(p = Pr(| t^*_{N-2, \alpha/2}| \geq |t|)\)
- Probability of \(t\) being smaller than critical value
- Probability of rejecting null, when \(t\) Is large enough
- Conclusion: Smaller P-value implies, \(t\) is more likely to be large.
Rule of Thumb : P value < 0.05
- In many cases we take 95% of critical value
- Under 95% critical value, we can reject null under P value < 0.05
Application
Regression coefficient significance
Setting
Assume we have estimated coefficient for Linear Regression (Linear Regression and Coefficient
Where true line is:
Estimated model is:
Hypothesis setting
- We want to know if X has significant effect on Y based on the given data
- Therefore, we hypothesize if true value of \(\beta_1 = \hat{\beta_1}= 0\), so here \(\beta_{null} =0\)
T-test
- We can get t-statistic value from the trained linear model.
- Compute P-value, and check if p-value is lower than 0.05
Interpretation
-
If p-value for the coefficient is lower than 0.05, we can say the coefficient value is ‘statistically significant’.
-
If the coefficient is not significant, we cannot trust the coefficient value, we cause we lack ground to say \(\beta_1 \neq 0\).
-
In this case, we say the precision if coefficient is low, (High standard error Standard Error). To increase precision:
-
Increase sample size \(N\)
- Guarantee variation of the sample data \(X\) (\(Var(X)\))
- Take more control variables (Using control variables ) / features into regression, to reduce variance of \(e\) ( \(s^2\) )
Comparison of two group’s statistical value (TBD)
Takeaways and Important points.
-
When we reject null, we can say value of \(\beta_1\) is statistically significant, and trust the estimated coefficient value.
-
Caution: It is not appropriate to say ‘True
\(\beta_1\) value is close to 0’ when we don’t reject null
- More appropriate interpretation: ‘We cannot say \(\beta_1\) is not 0 based on the current data, so cannot trust the coefficient. ’