Chi-Square Test Calculator
Compute the chi-square (χ²) statistic for goodness-of-fit and independence tests. Enter observed and expected frequencies, or fill in a contingency table, to instantly calculate χ², degrees of freedom, the p-value, and each cell's contribution.
χ² = ∑ (Oᵢ − Eᵢ)² / Eᵢ
Space, comma, or tab separated values
Must have the same count as observed frequencies
Frequently Asked Questions
What is a chi-square test?
The chi-square test is a statistical hypothesis test used to assess whether observed frequencies differ significantly from expected frequencies. It applies to categorical data and comes in two main forms: the goodness-of-fit test (comparing a single categorical variable against an expected distribution) and the test of independence (testing the relationship between two categorical variables in a contingency table).
How is the chi-square statistic calculated?
The chi-square statistic is computed as χ² = Σ (O − E)² / E, where O is the observed frequency and E is the expected frequency for each category or cell. Summing this value over all categories or cells gives the total χ² statistic.
What is the difference between the goodness-of-fit test and the test of independence?
The goodness-of-fit test checks whether a single set of observed counts matches a specified expected distribution (e.g., is a die fair?). The test of independence uses a contingency table to determine whether two categorical variables are related (e.g., does gender affect product preference?). The degrees-of-freedom formulas differ: k − 1 for goodness of fit, (rows − 1) × (columns − 1) for independence.
What are degrees of freedom in a chi-square test?
Degrees of freedom (df) determine which chi-square distribution to use. Goodness-of-fit test: df = k − 1 (k = number of categories). Test of independence: df = (rows − 1) × (columns − 1). For example, a 3×4 contingency table has df = (3 − 1) × (4 − 1) = 6.
What p-value is statistically significant in a chi-square test?
The standard significance threshold is α = 0.05. If the p-value is below 0.05, the result is statistically significant and the null hypothesis is rejected. You can also use α = 0.10 for exploratory analysis or α = 0.01 for a stricter standard. The p-value is the probability of observing a χ² value at least as large as the current one if the null hypothesis is true.
What is the minimum expected frequency for a chi-square test?
The chi-square approximation is reliable when all expected cell frequencies are at least 5. If some cells have expected frequencies below 5, consider combining categories, collecting more data, or using Fisher's exact test for 2×2 tables. Very small expected frequencies can inflate the χ² statistic and produce misleadingly small p-values.
How do I interpret each cell's contribution?
Each cell's contribution to χ² is (O − E)² / E. Cells with large contributions indicate where observed counts deviate most from expectations. Examining per-cell contributions helps identify which categories or variable combinations drive the overall association. A single cell contributing more than 3.841 (df = 1, α = 0.05) signals a particularly large discrepancy.
Can the chi-square test be used with continuous data?
No. The chi-square test applies only to count data (frequencies of categorical variables). For continuous data, use a t-test (comparing two means), ANOVA (comparing several group means), or Pearson/Spearman correlation (measuring association between continuous variables). To apply a chi-square test to continuous data, you must first bin the values into categories.
Chi-Square Formula
The chi-square statistic (χ²) measures how much the observed frequencies differ from what we expect under the null hypothesis. The core formula is:
Chi-Square Statistic
χ² = ∑ (Oᵢ − Eᵢ)² / Eᵢ
Oᵢ = Observed frequency for category i
Eᵢ = Expected frequency for category i
∑ = Sum across all categories or cells
The larger the χ² value, the greater the discrepancy between observed and expected frequencies. To determine whether this discrepancy is statistically significant, we compare the χ² statistic to a critical value from the chi-square distribution with the appropriate degrees of freedom — or equivalently, compute the p-value.
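The core formula translates directly into code. A minimal sketch in Python (the function name is illustrative, not part of the calculator):

```python
# Chi-square statistic from observed and expected frequencies:
# chi² = Σ (O_i − E_i)² / E_i

def chi_square_statistic(observed, expected):
    """Sum the squared deviation of each observed count from its
    expected count, scaled by the expected count."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Example: 60 coin flips, 36 heads / 24 tails vs. an expected 30/30 split.
stat = chi_square_statistic([36, 24], [30, 30])
print(round(stat, 3))  # 2.4
```

To turn the statistic into a p-value you would additionally need the chi-square survival function (e.g. `scipy.stats.chi2.sf`), which this sketch omits.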
Goodness of Fit vs Test of Independence
There are two primary types of chi-square tests, each answering a different question:
| Feature | Goodness of Fit | Test of Independence |
|---|---|---|
| Question | Does the distribution match a specific expected distribution? | Are two categorical variables independent? |
| Input | One set of observed frequencies + one set of expected | 2D contingency table (rows × columns) |
| df formula | k − 1 (k = number of categories) | (rows − 1) × (cols − 1) |
| Example | Is a die fair? Does survey data follow a known distribution? | Is smoking status related to lung disease? Does gender affect preference? |
Goodness of Fit
Use the Goodness of Fit test when you have a single categorical variable and want to compare observed counts to theoretically expected counts. For example, if you roll a die 100 times, the expected frequency for each face is 100/6 ≈ 16.67. The goodness of fit test tells you whether the observed roll counts deviate significantly from this expectation.
Test of Independence
Use the Test of Independence (also called the chi-square contingency test) when you have two categorical variables and want to determine whether they are statistically related. The expected frequency for each cell is calculated as:
Eᵢⱼ = (Row Totalᵢ × Column Totalⱼ) / Grand Total
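The expected-frequency formula can be sketched for an arbitrary contingency table (helper name is illustrative):

```python
# Expected cell frequencies for a test of independence:
# E_ij = (row total_i × column total_j) / grand total

def expected_table(table):
    """Build the table of expected frequencies from an observed
    contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

observed = [[20, 30],   # row 1: prefer A, prefer B
            [35, 15]]   # row 2: prefer A, prefer B
print(expected_table(observed))  # [[27.5, 22.5], [27.5, 22.5]]
```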
Degrees of Freedom
Degrees of freedom (df) determine which chi-square distribution to use when computing the p-value. The df reflects how many independent pieces of information you have after accounting for constraints.
- Goodness of Fit: df = k − 1, where k is the number of categories. We lose 1 degree of freedom because the observed frequencies must sum to the total.
- Test of Independence: df = (rows − 1) × (columns − 1). For a 2×2 table, df = 1. For a 3×4 table, df = 6.
A higher df shifts the chi-square distribution to the right, requiring a larger χ² value to achieve statistical significance at the same alpha level.
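The two df formulas as code (function names are illustrative):

```python
# Degrees of freedom for the two chi-square test types.

def df_goodness_of_fit(k):
    return k - 1                    # k categories, one constraint (the total)

def df_independence(rows, cols):
    return (rows - 1) * (cols - 1)  # row and column margins are fixed

print(df_goodness_of_fit(6))   # 5  (a six-sided die)
print(df_independence(3, 4))   # 6  (a 3×4 contingency table)
```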
Chi-Square Critical Values Table
The table below shows critical χ² values for common degrees of freedom and significance levels. If your calculated χ² exceeds the critical value, the result is statistically significant.
| df | α = 0.10 | α = 0.05 | α = 0.025 | α = 0.01 |
|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 5.024 | 6.635 |
| 2 | 4.605 | 5.991 | 7.378 | 9.210 |
| 3 | 6.251 | 7.815 | 9.348 | 11.345 |
| 4 | 7.779 | 9.488 | 11.143 | 13.277 |
| 5 | 9.236 | 11.070 | 12.832 | 15.086 |
| 6 | 10.645 | 12.592 | 14.449 | 16.812 |
| 8 | 13.362 | 15.507 | 17.535 | 20.090 |
| 10 | 15.987 | 18.307 | 20.483 | 23.209 |
| 15 | 22.307 | 24.996 | 27.488 | 30.578 |
| 20 | 28.412 | 31.410 | 34.170 | 37.566 |
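A significance check against the α = 0.05 column can be sketched as a simple lookup. A full implementation would use the chi-square CDF (e.g. `scipy.stats.chi2`) rather than a hard-coded table:

```python
# Critical χ² values at α = 0.05 for the tabulated degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
               6: 12.592, 8: 15.507, 10: 18.307, 15: 24.996, 20: 31.410}

def is_significant(chi2_stat, df):
    """True if chi² exceeds the α = 0.05 critical value.
    Raises KeyError for a df not present in the table."""
    return chi2_stat > CRITICAL_05[df]

print(is_significant(5.0, 1))   # True  — 5.0 exceeds 3.841
print(is_significant(5.12, 5))  # False — 5.12 is below 11.070
```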
Chi-Square Calculation Examples
Example 1: Goodness of Fit — Fair Die
You roll a die 100 times and observe: 16, 18, 16, 14, 12, 24. Is the die fair? Expected frequency for each face: 100 / 6 ≈ 16.67.
χ² = (16 − 16.67)² / 16.67 + (18 − 16.67)² / 16.67 + ...
χ² = 5.12, df = 5
p-value ≈ 0.401
Result: Not significant (α = 0.05). No evidence the die is unfair.
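This example can be reproduced in a few lines:

```python
# Example 1 in code: goodness of fit for a die rolled 100 times.
observed = [16, 18, 16, 14, 12, 24]
expected = [sum(observed) / 6] * 6   # 100 / 6 ≈ 16.67 per face

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # 5.12
```

Computing the p-value from this statistic requires the chi-square survival function with df = 5 (e.g. `scipy.stats.chi2.sf(chi2, 5)`).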
Example 2: Test of Independence — Gender vs. Preference
Survey of 100 people: Do men and women prefer different products? Contingency table: Men — 20 prefer A, 30 prefer B; Women — 35 prefer A, 15 prefer B.
Row totals: Men = 50, Women = 50
Col totals: A = 55, B = 45, Grand = 100
Expected (Men, A) = 50 × 55 / 100 = 27.5
χ² ≈ 9.09, df = 1
p-value ≈ 0.003
Result: Significant (α = 0.05). Gender and preference are related.
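The full independence calculation for this 2×2 table, sketched in code:

```python
# Example 2 in code: test of independence for gender vs. preference.
observed = [[20, 30],   # Men:   prefer A, prefer B
            [35, 15]]   # Women: prefer A, prefer B

row_tot = [sum(r) for r in observed]          # [50, 50]
col_tot = [sum(c) for c in zip(*observed)]    # [55, 45]
grand = sum(row_tot)                          # 100

# Sum (O − E)² / E over all four cells, with E = row × col / grand.
chi2 = sum((observed[i][j] - row_tot[i] * col_tot[j] / grand) ** 2
           / (row_tot[i] * col_tot[j] / grand)
           for i in range(2) for j in range(2))
print(round(chi2, 2))  # 9.09
```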
Assumptions & Limitations
- Independence: Each observation must be independent of others. Repeated measures or clustered data violate this assumption.
- Expected frequency rule: Each expected cell frequency should be at least 5. If not, consider combining categories or using Fisher's exact test (for 2×2 tables).
- Categorical data: Chi-square tests apply only to counts, not means or proportions expressed as decimals. For continuous data, use t-tests or ANOVA.
- Sample size: The chi-square approximation improves with larger samples. Very small samples may produce unreliable p-values.
- Two-sided only: Chi-square tests are inherently non-directional. They detect any departure from the expected distribution, not a specific direction.
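The expected-frequency rule above is easy to check programmatically before trusting the approximation (helper name is illustrative):

```python
# Flag cells whose expected frequency falls below the usual minimum of 5.

def low_expected_cells(expected, minimum=5):
    """Return (index, value) pairs for cells below the minimum.
    An empty list means the expected-frequency rule is satisfied."""
    return [(i, e) for i, e in enumerate(expected) if e < minimum]

print(low_expected_cells([16.67] * 6))       # [] — rule satisfied
print(low_expected_cells([12.0, 3.5, 4.5]))  # [(1, 3.5), (2, 4.5)]
```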