A “95% CI” means that if we repeated the sampling process many times, about 95% of the resulting intervals would contain the true $\mu$. It does not mean “the probability that $\mu$ lies in this particular interval is 95%”: $\mu$ is fixed, and it is the interval that is random.
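The repeated-sampling reading can be checked with a small simulation. The sketch below assumes a normal population with known $\sigma$ (so it uses a z-interval rather than a t-interval); the function name `ci_coverage` and all parameter values are illustrative choices, not part of the original text.

```python
import random
from statistics import NormalDist

def ci_coverage(mu=100.0, sigma=15.0, n=25, trials=2000, level=0.95, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain the true mean."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    half_width = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample and build one interval around its mean.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

coverage = ci_coverage()
print(f"Share of intervals containing mu: {coverage:.3f}")
```

Each trial produces a different interval while `mu` never changes, so the printed share should land close to 0.95.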
For population mean $\mu$: $$\bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}}$$
For a sample mean: $$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
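As a worked instance of the t statistic, here is a minimal sketch using only the standard library. The data values and the hypothesized mean `mu_0 = 4` are made up for illustration.

```python
import math
from statistics import mean, stdev

def t_statistic(sample, mu_0):
    """One-sample t statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu_0) / (s / math.sqrt(n))

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
t = t_statistic(sample, mu_0=4.0)
print(f"t = {t:.3f}")
```

The result would then be compared against a t-distribution with $n - 1 = 7$ degrees of freedom.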
Significance level $\alpha$ = P(Type I error). Power = 1 − P(Type II error). Instead of a single “best guess,” give an interval likely to contain the true parameter.
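The definitions of $\alpha$ and power can be illustrated by simulation. This is a rough sketch assuming a two-sided z-test with known $\sigma$; the function `rejection_rate`, the effect size 0.5, and the sample size 25 are all hypothetical choices.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mu, mu_0=0.0, sigma=1.0, n=25, alpha=0.05,
                   trials=4000, seed=1):
    """Share of simulated two-sided z-tests that reject H0: mu = mu_0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        x_bar = sum(rng.gauss(true_mu, sigma) for _ in range(n)) / n
        z = (x_bar - mu_0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_mu=0.0)  # H0 is true: every rejection is a Type I error
power = rejection_rate(true_mu=0.5)        # H0 is false: every rejection is correct
print(f"Estimated Type I error rate: {type_1_rate:.3f}")
print(f"Estimated power: {power:.3f}")
```

When the null hypothesis is true, the rejection rate should sit near $\alpha = 0.05$; when it is false, the rejection rate estimates the power, and one minus that value estimates P(Type II error).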
If IQ ~ $N(100, 15^2)$, what’s the probability of IQ > 130? $Z = (130-100)/15 = 2.0$, so the probability is about 2.3% (from a Z-table; the 68-95-99.7 rule gives the rougher figure of 2.5%).

5. Sampling Distributions and the Central Limit Theorem (CLT)

The CLT is the most important theorem in statistics for beginners.

Central Limit Theorem: If you take many random samples of size $n$ from any population (with mean $\mu$ and finite s.d. $\sigma$), the distribution of the sample means $\bar{x}$ will be approximately normal with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$ as $n$ gets large ($n \geq 30$ is a common rule of thumb).

Why this is magic: It doesn’t matter if the original population is weird; the sample mean still follows an approximately normal curve. That allows us to make probability statements about $\bar{x}$.
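The CLT claim can be seen numerically. The sketch below assumes an exponential population with rate 1 (so $\mu = 1$ and $\sigma = 1$), chosen only because it is clearly skewed; the sample size and number of repetitions are likewise illustrative.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)
n = 30              # size of each sample
num_samples = 5000  # how many sample means to collect

# Each entry is the mean of one sample of size n from a skewed population.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)
predicted_sd = 1.0 / n ** 0.5  # sigma / sqrt(n) with sigma = 1

# Share of sample means within 1.96 predicted standard deviations of mu = 1.
share_within = sum(abs(m - 1.0) <= 1.96 * predicted_sd for m in sample_means) / num_samples

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (CLT predicts {predicted_sd:.3f})")
print(f"share within 1.96 sd: {share_within:.3f}")
```

Even though individual exponential draws are far from normal, the sample means should cluster around $\mu$ with spread close to $\sigma/\sqrt{n}$, and roughly 95% of them should fall within 1.96 of those standard deviations.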
This is crucial for medical tests, spam filters, and machine learning.
A poll says 52% ± 3% (95% CI for a proportion). That means we are 95% confident that the true population proportion is between 49% and 55%.

8. Linear Regression: Measuring Relationships

We want to model $Y$ (response) as a linear function of $X$ (predictor).
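To make "a linear function of $X$" concrete, here is a minimal sketch that fits a straight line by ordinary least squares. Least squares is an assumption on my part, since the text has not yet said how the line is chosen, and the `hours` and `scores` data are invented for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

hours = [1, 2, 3, 4, 5]        # hypothetical predictor X
scores = [52, 55, 61, 64, 68]  # hypothetical response Y

intercept, slope = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

The slope is read as the estimated change in $Y$ for a one-unit increase in $X$.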
In the confidence interval for $\mu$, $t^*$ is the critical value from the t-distribution with $n-1$ degrees of freedom.
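A small worked example of the interval $\bar{x} \pm t^* \cdot s/\sqrt{n}$ follows. The data are made up, and `t_star = 2.365` is the usual table value for 95% confidence with 7 degrees of freedom, hard-coded here because the standard library has no t-distribution; a different sample size or confidence level would need a different table lookup.

```python
import math
from statistics import mean, stdev

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
n = len(sample)
t_star = 2.365  # t-table value: 95% confidence, n - 1 = 7 degrees of freedom

x_bar = mean(sample)
margin = t_star * stdev(sample) / math.sqrt(n)  # t* times the standard error
ci_low, ci_high = x_bar - margin, x_bar + margin

print(f"95% CI for mu: ({ci_low:.2f}, {ci_high:.2f})")
```

Note that `stdev` uses the $n-1$ denominator, matching the sample standard deviation $s$ in the formula.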