Chi-Square Test of Independence

The Chi-Square Test tab tests whether two categorical variables are independent using Pearson's chi-square test of independence.

Getting started

Open the tab

Select Analysis > Chi-Square Test... from the menu bar.

Run the test

Configure the following in the settings panel:

Select the dataset from Dataset
Select a categorical variable for Row variable
Select a categorical variable for Column variable
Click Run

Row variable and Column variable must each have at least two categories. Only columns with a nominal or ordinal measurement scale are available for selection.

Hypotheses

H₀ (null hypothesis): The row variable and column variable are independent.
H₁ (alternative hypothesis): The row variable and column variable are not independent.

Reading the results

The result panel displays the hypotheses, a conclusion at significance level $\alpha = 0.05$, and the test statistics.

Statistic	Description
$\chi^2$	Pearson's chi-square statistic. Sums, across all cells, the squared deviation between the observed and expected frequency divided by the expected frequency
df	Degrees of freedom $(r-1)(c-1)$, where $r$ is the number of row categories and $c$ the number of column categories
p	p-value. The probability of obtaining a chi-square statistic at least as extreme as the observed value, assuming the null hypothesis is true
Cramér's V	Effect size. Computed as $V = \sqrt{\chi^2 / (n \cdot (\min(r, c) - 1))}$, where $n$ is the grand total. $V$ ranges from 0 to 1: 0 indicates independence (no association) and 1 indicates complete association. The interpretation of $V$ depends on the table dimension $\min(r, c) - 1$, so care is needed when comparing $V$ across tables of different sizes
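
As an illustration of how the four statistics in the table relate to each other, the following is a minimal sketch in standard-library Python. It is not MIDAS's implementation: the function names `chi2_sf` and `chi_square_test` and the example counts are hypothetical, and the sketch assumes every row and column contains at least one observation so that no expected frequency is zero.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with h = x / 2.
    """
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))   # closed form for df = 1
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi_square_test(observed):
    """Return (chi2, df, p, cramers_v) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected frequency
            chi2 += (o - e) ** 2 / e
    r, c = len(row_totals), len(col_totals)
    df = (r - 1) * (c - 1)
    p = chi2_sf(chi2, df)
    cramers_v = math.sqrt(chi2 / (n * (min(r, c) - 1)))
    return chi2, df, p, cramers_v

# Hypothetical 2x2 table of observed counts (illustrative values only)
observed = [[30, 10],
            [10, 30]]
chi2, df, p, v = chi_square_test(observed)
print(chi2, df, v)  # 20.0 1 0.5
print(p < 0.05)     # True
```

In this example every expected frequency is 20, so each cell contributes $(10)^2 / 20 = 5$ to the statistic.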

Contingency table

A contingency table is displayed below the result panel. Each cell shows both the observed frequency and the expected frequency. The expected frequency is the theoretical frequency under the null hypothesis of independence, calculated as $(\text{row total} \times \text{column total}) / \text{grand total}$.
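
The expected-frequency formula can be sketched in a few lines of Python. The function name `expected_frequencies` and the counts are hypothetical and only illustrate the arithmetic.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table of observed counts (illustrative values only)
observed = [[30, 10],
            [10, 30]]
expected = expected_frequencies(observed)
print(expected)  # [[20.0, 20.0], [20.0, 20.0]]
```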

Rows with missing values are excluded from the analysis. The number of excluded rows is shown above the table.
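
The exclusion of incomplete rows can be pictured with the sketch below. It assumes, purely for illustration, that a missing value is represented as `None`; how MIDAS stores missing values internally is not specified here, and the function name `crosstab` is hypothetical.

```python
from collections import Counter

def crosstab(pairs):
    """Count (row, column) pairs, skipping any pair that has a missing value."""
    complete = [(r, c) for r, c in pairs if r is not None and c is not None]
    excluded = len(pairs) - len(complete)
    counts = Counter(complete)
    row_levels = sorted({r for r, _ in complete})
    col_levels = sorted({c for _, c in complete})
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table, excluded

# Hypothetical records; None marks a missing value (an assumption of this sketch)
records = [("A", "yes"), ("A", "no"), ("B", "yes"),
           ("B", None), (None, "no"), ("A", "yes")]
row_levels, col_levels, table, excluded = crosstab(records)
print(col_levels)  # ['no', 'yes']
print(table)       # [[1, 2], [0, 1]]
print(excluded)    # 2
```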

Adjusted standardized residuals

Enable the Adjusted standardized residuals checkbox to display the adjusted standardized residual $d_{ij}$ for each cell.

$d_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}(1 - n_{i \cdot}/n)(1 - n_{\cdot j}/n)}}$

$O_{ij}$ is the observed frequency, $E_{ij}$ is the expected frequency, $n_{i \cdot}$ is the row total, $n_{\cdot j}$ is the column total, and $n$ is the grand total.

Cells with large absolute residuals deviate strongly from independence. Positive residuals indicate more observations than expected; negative residuals indicate fewer.
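
The residual formula above can be evaluated as in the following sketch. The function name `adjusted_residuals` and the counts are hypothetical, and the code assumes no row or column total equals zero or the grand total (otherwise the denominator would be zero).

```python
import math

def adjusted_residuals(observed):
    """Adjusted standardized residual d_ij for every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out.append((o - e) / se)
        residuals.append(out)
    return residuals

# Hypothetical 2x2 table of observed counts (illustrative values only)
observed = [[30, 10],
            [10, 30]]
d = adjusted_residuals(observed)
for row in d:
    print([round(x, 2) for x in row])
```

Here the diagonal cells have positive residuals (more observations than expected) and the off-diagonal cells have negative residuals of the same magnitude, roughly 4.47.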

When residuals are enabled, cells are colored with a diverging color scale. Residuals exceeding the Bonferroni-corrected critical value are shown in bold. The Bonferroni correction divides the significance level $\alpha$ by the degrees of freedom $(r-1)(c-1)$ to adjust the threshold for each cell.
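
A corrected threshold of this kind could be computed as sketched below. Two points are assumptions of the sketch rather than documented MIDAS behavior: the residuals are compared against a standard normal distribution, and the comparison is two-sided. The function name `bonferroni_critical_value` is hypothetical.

```python
from statistics import NormalDist

def bonferroni_critical_value(n_rows, n_cols, alpha=0.05):
    """Two-sided standard normal critical value at alpha / ((r - 1)(c - 1))."""
    df = (n_rows - 1) * (n_cols - 1)
    adjusted_alpha = alpha / df
    # Assumption: two-sided test, so half of the adjusted alpha goes in each tail
    return NormalDist().inv_cdf(1 - adjusted_alpha / 2)

print(round(bonferroni_critical_value(2, 2), 2))  # 1.96
print(round(bonferroni_critical_value(2, 3), 2))  # 2.24
```

Under these assumptions, a cell in a 2×3 table would be shown in bold when its absolute residual exceeds about 2.24.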

Limitations of the chi-square approximation

Pearson's chi-square test relies on the test statistic asymptotically following a chi-square distribution. When the sample size is small or many cells have low expected frequencies, the accuracy of this approximation decreases. Check the expected frequencies $(E)$ displayed in the contingency table to assess whether the approximation is appropriate.
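
One way to summarize the expected frequencies when making this check is sketched below. The threshold of 5 follows a widely used rule of thumb (often attributed to Cochran: all expected frequencies at least 1 and no more than 20% below 5); it is not a criterion applied by MIDAS, and the function name `summarize_expected` and the values are hypothetical.

```python
def summarize_expected(expected, threshold=5.0):
    """Smallest expected count and the share of cells below the threshold."""
    cells = [e for row in expected for e in row]
    low = sum(1 for e in cells if e < threshold)
    return {"min_expected": min(cells),
            "share_below_threshold": low / len(cells)}

# Hypothetical expected frequencies for a small-sample 2x2 table
expected = [[2.4, 9.6],
            [3.6, 14.4]]
summary = summarize_expected(expected)
print(summary)  # {'min_expected': 2.4, 'share_below_threshold': 0.5}
```

In this example half of the cells fall below 5, which suggests the chi-square approximation should be treated with caution.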

For 2x2 tables, Yates' continuity correction and Fisher's exact test are known alternatives, but MIDAS currently computes only the uncorrected Pearson chi-square statistic.

Other test methods

For comparing means between two groups, use Two-Sample Test / Paired Test. For comparing means across three or more groups, use ANOVA. For frequency tabulation of categorical variables, use Crosstab.

References

Pearson, K. (1900). On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine, 50(302), 157-175.
Agresti, A. (2007). An Introduction to Categorical Data Analysis (2nd ed., pp. 38-40). Wiley.