🔒 Math · Statistics

Stats mnemonics that make probability stick

Mean, median, mode, hypothesis testing: the concepts every stats student needs cold.

🔒 Statistics

Memory tricks

Proven mnemonics: fast to learn, hard to forget.

🔒 Statistics
Mean = Add and Divide. Median = Middle. Mode = Most.
Measures of Central Tendency
Mean, median, and mode: locked in 3 seconds
Mean: add all values, divide by count. Median: the middle value when sorted. Mode: the value that appears most often. One sentence each.
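A quick way to check the three definitions is Python's built-in statistics module. This is a minimal sketch; the scores list is made-up illustration data.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores, invented for illustration.
scores = [70, 85, 85, 90, 100]

avg = mean(scores)       # add all values (430), divide by count (5)
middle = median(scores)  # middle value of the sorted list
most = mode(scores)      # value that appears most often

print(avg, middle, most)
```

With an even number of values there is no single middle value, so median returns the average of the two middle values.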
🔒 Statistics
"Reject if p is less than alpha"
Hypothesis Testing Decision Rule
When to reject the null hypothesis, every time
If your p-value < α (usually 0.05), reject H₀. If p ≥ α, fail to reject. You never "accept" H₀; you only fail to reject it.
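The decision rule fits in a few lines of code. A minimal sketch, where decide is a hypothetical helper name and the p-values passed in are invented:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value and significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    # Never "accept H0": weak evidence against it is not proof that it is true.
    return "fail to reject H0"

print(decide(0.03))               # below the default alpha of 0.05
print(decide(0.20))               # above alpha
print(decide(0.03, alpha=0.01))   # same p-value, stricter alpha
```

The last call shows that the same p-value can lead to a different decision when alpha changes, so alpha should be chosen before looking at the data.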
🔒 Statistics
68 · 95 · 99.7
Empirical Rule (Normal Distribution)
The empirical rule: the three numbers every stats student memorizes
In a normal distribution, about 68% of data falls within 1 SD of the mean, 95% within 2 SD, and 99.7% within 3 SD. These three numbers cover most normal distribution questions.
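The three numbers can be recovered from the standard normal curve itself. A minimal sketch using Python's statistics.NormalDist; share_within is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k: float) -> float:
    """Probability that a normal value lies within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The printed percentages come out near 68, 95, and 99.7, which is where the rounded numbers in the rule come from.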
🔒 Statistics
Type I = False Alarm. Type II = Missed Call.
Error Types
Type I and Type II errors: impossible to mix up
Type I (α): reject H₀ when it's true, a false alarm. Type II (β): fail to reject H₀ when it's false, a missed call. Think: crying wolf vs. ignoring the wolf.
🔒 Statistics
r closer to ±1 = stronger
Correlation Coefficient
Reading correlation: closer to 1 or −1 is stronger
r = +1 is perfect positive. r = −1 is perfect negative. r = 0 means no linear relationship. The closer to either extreme, the stronger the correlation.
Standard Deviation
Standard deviation = spread of data. Small SD = data clustered near mean. Large SD = spread out.
How much the data typically varies from the mean
Variance = average squared deviation from mean. SD = √variance. Low SD: data points cluster tightly around the mean. High SD: data is spread widely. About 68% of data falls within 1 SD of the mean in a normal distribution (68-95-99.7 rule).
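The two formulas are easy to compute by hand or in code. A minimal sketch on invented data, using the population form (divide by the count):

```python
from math import sqrt

# Hypothetical data, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
# Population variance: the average squared deviation from the mean.
variance = sum((x - mean) ** 2 for x in data) / len(data)
sd = sqrt(variance)

print(mean, variance, sd)
```

Dividing by the count gives the population variance. When the data is a sample, the usual estimate divides by n - 1 instead, which is what Python's statistics.variance and statistics.stdev do.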
Basic Probability Rules
Probability: P(A and B) = P(A) × P(B) if independent. P(A or B) = P(A) + P(B) - P(A and B).
Two essential probability formulas: AND and OR
AND (both events occur): multiply probabilities if independent. P(heads AND heads) = 0.5 × 0.5 = 0.25. OR (at least one occurs): add probabilities, subtract the overlap. P(A or B) = P(A) + P(B) - P(A∩B). For mutually exclusive events: P(A or B) = P(A) + P(B).
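Both rules can be checked with exact fractions. A minimal sketch: the coin example is the one from the card, while the card-deck example (hearts and kings in a standard 52-card deck) is added here purely as an illustration of the overlap.

```python
from fractions import Fraction

# AND with independent events: two fair coin flips.
p_heads = Fraction(1, 2)
p_two_heads = p_heads * p_heads

# OR with overlapping events: one card drawn from a standard 52-card deck.
p_heart = Fraction(13, 52)
p_king = Fraction(4, 52)
p_king_of_hearts = Fraction(1, 52)  # the overlap, counted in both groups
p_heart_or_king = p_heart + p_king - p_king_of_hearts

print(p_two_heads, p_heart_or_king)
```

Without subtracting the overlap, the king of hearts would be counted twice.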
Confidence Intervals
Confidence interval: estimate ± margin of error. Wider CI = less precise but more confident.
A range of plausible values for a population parameter
95% CI means: if you repeated the study 100 times, about 95 of the intervals would contain the true population parameter. Wider interval = more confident but less precise. Increasing sample size narrows the interval without sacrificing confidence.
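The "estimate ± margin of error" recipe can be sketched for a sample mean. Assumptions to note: mean_ci is a hypothetical helper, the sample is invented, and the margin uses a normal (z) multiplier as a simplification. With a sample this small, a t multiplier would normally be used and would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean (a simplification)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * stdev(sample) / sqrt(len(sample))
    center = mean(sample)
    return center - margin, center + margin

# Hypothetical measurements, invented for illustration.
sample = [52, 48, 50, 51, 49, 53, 47, 50, 52, 48]

low95, high95 = mean_ci(sample, 0.95)
low99, high99 = mean_ci(sample, 0.99)
print(low95, high95)
print(low99, high99)
```

The 99% interval comes out wider than the 95% one: more confidence, less precision. Because the margin divides by the square root of the sample size, a larger sample shrinks the interval at the same confidence level.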
Correlation Coefficient
Correlation vs causation: r measures linear relationship strength, NOT cause and effect
What r tells you, and what it doesn't
r ranges from -1 to +1. r = 1: perfect positive linear relationship. r = -1: perfect negative. r = 0: no linear relationship. Strong correlation does NOT mean one variable causes the other. Always look for lurking variables (confounders).
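Computing r by hand shows what "linear" means here. A minimal sketch; pearson_r is a hypothetical helper and the data is invented.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [-2, -1, 0, 1, 2]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])   # points on a rising line
r_down = pearson_r(xs, [-3 * x for x in xs])    # points on a falling line
r_curve = pearson_r(xs, [x ** 2 for x in xs])   # points on a parabola

print(r_up, r_down, r_curve)
```

The parabola case gives r = 0 even though y is completely determined by x. That is why r = 0 means "no linear relationship" rather than "no relationship", and why r alone says nothing about cause and effect.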
Normal Distribution
Normal distribution: symmetric, bell-shaped. Mean = median = mode. Described by μ and σ.
The bell curve: the most important distribution in statistics
Perfectly symmetric around the mean. 68% of data within 1σ, 95% within 2σ, 99.7% within 3σ. Z-score = (x - μ)/σ converts any normal distribution to standard normal (μ=0, σ=1). Use z-table to find probabilities.
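The z-score conversion and the table lookup can be sketched in code. The exam numbers are invented, z_score is a hypothetical helper, and NormalDist().cdf stands in for the z-table.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above the mean."""
    return (x - mu) / sigma

# Hypothetical exam: mean 70, SD 10; a student scores 85.
z = z_score(85, mu=70, sigma=10)

# Standard normal CDF: the share of values below z, as a z-table would give.
p_below = NormalDist().cdf(z)

print(z, round(p_below, 4))
```

Here the score sits 1.5 SDs above the mean, and roughly 93% of scores fall below it.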
Chi-Square Test
Chi-square test: tests whether observed frequencies differ from expected frequencies
Testing whether categorical data fits a pattern or shows an association
χ² = Σ(observed - expected)²/expected. Large χ² → observed data far from expected → more evidence against null hypothesis. Two uses: goodness-of-fit (does data fit a distribution?) and test of independence (are two categorical variables related?).
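The statistic is a single sum over categories. A minimal goodness-of-fit sketch; chi_square is a hypothetical helper and the die-roll counts are invented.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts from 60 rolls of a die; a fair die expects 10 per face.
observed = [5, 8, 9, 8, 10, 20]
expected = [10] * 6

stat = chi_square(observed, expected)
print(stat)
```

Most of the total comes from the face that showed up 20 times. With 6 categories there are 5 degrees of freedom, and the usual table value at α = 0.05 is about 11.07, so a statistic of about 13.4 would count as evidence against the fair-die hypothesis.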
Linear Regression
Regression line: ŷ = b₀ + b₁x. Slope b₁ = change in y per unit change in x. Intercept b₀ = y when x=0.
The line of best fit: predicting one variable from another
The regression line minimizes the sum of squared residuals (least squares). Slope: for each 1-unit increase in x, predicted y changes by b₁ units. Only predict within the range of your data (don't extrapolate). R² = proportion of variation in y explained by x.
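Least squares has a closed-form answer for one predictor, so the fit can be sketched without any library. fit_line and r_squared are hypothetical helper names, and the hours/scores data is invented.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def r_squared(xs, ys, b0, b1):
    """Proportion of variation in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

b0, b1 = fit_line(hours, scores)
r2 = r_squared(hours, scores, b0, b1)
print(b0, b1, r2)
```

For this data the fitted line is ŷ = 46 + 6x: each extra hour adds 6 points to the predicted score. The data only covers 1 to 5 hours, so a prediction for 20 hours would be extrapolation.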
📊 Statistics
"Pie à la Mode": Mode is Most Popular
Mode
The tastiest way to remember what mode means
"À la mode" means "in fashion" in French, and what's in fashion is what's most popular. MODE = MOST POPULAR number in the dataset.
📊 Statistics
Median = Middle of the Highway
Median
The median strip runs down the center
Median strip = CENTER of highway. Median = CENTER value when numbers are in order. Resistant to outliers, unlike the mean.
📊 Statistics
"Bill Gates walks into a bar..."
Outliers: Mean vs Median
Why income data uses median not mean
Bill Gates walks in: the average shoots up but no one feels richer. Outliers pull the MEAN, not the MEDIAN.
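The bar story is easy to reproduce with numbers. A minimal sketch; all incomes, including the outlier, are invented for illustration.

```python
from statistics import mean, median

# Hypothetical annual incomes (in dollars) of five people in a bar.
incomes = [40_000, 45_000, 50_000, 55_000, 60_000]
before = (mean(incomes), median(incomes))

# One extreme outlier walks in; the figure is invented for illustration.
incomes_with_outlier = incomes + [1_000_000_000]
after = (mean(incomes_with_outlier), median(incomes_with_outlier))

print(before)
print(after)
```

The mean jumps from 50,000 to well over 100 million, while the median only moves from 50,000 to 52,500. That gap is why the median is the usual choice for summarizing income data.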
📊 Statistics
68 · 95 · 99.7: "The Radio Station Rule"
Empirical Rule
Three numbers that describe all normal distributions
About 68% within 1 SD of the mean. 95% within 2. 99.7% within 3. Think: "68.95 FM, the 99.7!"