---
title: Orthogonal Polynomials
description: Generate orthogonal polynomial columns from a numeric column to improve polynomial regression accuracy.
priority: 0.5
---

# Orthogonal Polynomials {#orthogonal-polynomials}

The Orthogonal Polynomials tab generates orthogonal polynomial columns from a numeric column. With the raw polynomial basis $x, x^2, \dots, x^d$, the columns become almost perfectly correlated as the degree increases, so the design matrix becomes ill-conditioned and the computed coefficients and fitted values lose significant digits. Using orthogonal polynomial columns as predictors in Linear Regression reduces the condition number of the design matrix to approximately 1, improving numerical precision.
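The effect on the condition number can be checked outside the application. The following is a minimal NumPy sketch, not the application's own implementation: the sample data is hypothetical, and the orthogonal basis is built with a QR decomposition, which is only one of several possible constructions.

```python
import numpy as np

# Hypothetical sample data: 50 evenly spaced points on [0, 10].
x = np.linspace(0.0, 10.0, 50)
degree = 8

# Raw basis with intercept: columns 1, x, x^2, ..., x^degree.
raw = np.vander(x, degree + 1, increasing=True)

# Orthonormalize the raw columns with a QR decomposition.
# Assumption: used here for illustration only; the application
# may construct its columns differently.
q, _ = np.linalg.qr(raw)

cond_raw = np.linalg.cond(raw)
cond_orth = np.linalg.cond(q)

print(f"raw basis condition number:        {cond_raw:.3e}")
print(f"orthogonal basis condition number: {cond_orth:.3e}")
```

The raw basis should report a condition number many orders of magnitude above 1, while the orthogonalized columns stay at 1 up to rounding.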

## Basic Usage {#basic-usage}

### Opening Orthogonal Polynomials {#opening}

Select **Data > Orthogonal Polynomials...** from the menu bar to open a new Orthogonal Polynomials tab.

### Generating Columns {#generating-columns}

1. Select the target dataset from the **Dataset** dropdown
2. Select the numeric column to transform in **Source column**
3. Set the maximum polynomial degree in **Degree** (1 to 30). The degree must be less than the number of valid data points in the source column (rows after excluding null, NaN, and Infinity)
4. Click **Preview** to inspect the result
5. Enter a name for the output dataset in **Output Name**
6. Click **Save as Dataset**

The original dataset is not modified; a new derived dataset is created instead. Rows with null, NaN, or Infinity in the source column are excluded, so the derived dataset may have fewer rows than the original. For the remaining rows, all original columns are retained and the columns `poly_1`, `poly_2`, ..., `poly_{degree}` are appended.

Each orthogonal polynomial column is normalized to $\|P_j\|^2 = n$, where $n$ is the number of valid data points.
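As an illustration of the row filtering and the normalization described above, here is a minimal NumPy sketch. The function name `orthogonal_poly_columns` is hypothetical, and the QR-based construction is an assumption, not necessarily the algorithm the application uses. It also assumes the source column has more than `degree` distinct valid values. The sign of each column produced by QR is arbitrary and may differ from the application's output.

```python
import numpy as np

def orthogonal_poly_columns(x, degree):
    """Hypothetical helper: return (mask, p).

    mask marks the rows with a finite source value; p holds the columns
    poly_1..poly_degree for those rows, each with squared norm n.
    """
    x = np.asarray(x, dtype=float)
    mask = np.isfinite(x)              # False for NaN and +/-Infinity
    xv = x[mask]
    n = xv.size
    if degree >= n:
        raise ValueError("degree must be less than the number of valid data points")

    # Center and scale before taking powers so the intermediate
    # Vandermonde matrix is better conditioned.
    z = xv - xv.mean()
    scale = np.max(np.abs(z))
    if scale > 0:
        z = z / scale

    v = np.vander(z, degree + 1, increasing=True)   # columns 1, z, ..., z^degree
    q, _ = np.linalg.qr(v)                          # orthonormal columns

    # Drop the constant column and rescale so that ||P_j||^2 = n.
    p = q[:, 1:] * np.sqrt(n)
    return mask, p

# Example with two invalid rows (NaN and Infinity).
x = np.array([1.0, 2.0, np.nan, 4.0, 5.0, np.inf, 7.0, 8.0])
mask, p = orthogonal_poly_columns(x, degree=3)
n_valid = int(mask.sum())
gram = p.T @ p   # approximately n_valid times the identity matrix
```

Because every column is also orthogonal to the constant column, each `poly_j` column in this sketch sums to zero over the valid rows.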

![Generating degree-3 orthogonal polynomials from the x column](../shared/images/orthogonal-polynomials-preview.webp)

## Polynomial Regression Workflow {#polynomial-regression}

To use orthogonal polynomials instead of the raw polynomial basis:

1. In the Orthogonal Polynomials tab, generate degree-$d$ polynomial columns from the `x` column and save the dataset
2. Open a Linear Regression tab and select the saved derived dataset
3. Set `y` as the response variable
4. Set `poly_1`, `poly_2`, ..., `poly_d` as explanatory variables

R-squared, residual SD, fitted values, and prediction intervals are mathematically identical to those from raw polynomial regression, because both bases span the same space of polynomials of degree at most $d$. The coefficients are expressed in the orthogonal polynomial basis and differ in both value and interpretation from the raw polynomial basis coefficients. Each `poly_j` coefficient represents how much the $j$-th orthogonal polynomial component contributes to the response variable. Because the basis is orthogonal, the coefficient estimates are uncorrelated with one another, so the estimate for a lower-degree term does not change when higher-degree terms are added. If the $p$-value of the highest-degree `poly_d` is large, there is little evidence that the degree-$d$ component contributes to the model, and a lower degree may suffice.
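The equivalence of the two fits can be verified with a short NumPy sketch. The data is synthetic, and the orthogonal columns are built with a QR decomposition as a stand-in for the application's `poly_j` columns (an assumption). With the normalization $\|P_j\|^2 = n$, each least-squares coefficient reduces to the projection $P_j^\top y / n$.

```python
import numpy as np

# Synthetic cubic data with noise (hypothetical example values).
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 40)
y = 1.0 + 0.5 * x - 0.8 * x**2 + 0.3 * x**3 + rng.normal(scale=0.2, size=x.size)
degree = 3
n = x.size

# Raw design matrix: intercept, x, x^2, x^3.
raw = np.vander(x, degree + 1, increasing=True)

# Orthogonal design matrix: intercept plus columns with squared norm n.
q, _ = np.linalg.qr(raw)
orth = np.column_stack([np.ones(n), q[:, 1:] * np.sqrt(n)])

beta_raw, *_ = np.linalg.lstsq(raw, y, rcond=None)
beta_orth, *_ = np.linalg.lstsq(orth, y, rcond=None)

fit_raw = raw @ beta_raw
fit_orth = orth @ beta_orth

def r_squared(y, fitted):
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

r2_raw = r_squared(y, fit_raw)
r2_orth = r_squared(y, fit_orth)

# With orthogonal columns, each coefficient is a simple projection.
proj = orth.T @ y / n

print("same fitted values:", np.allclose(fit_raw, fit_orth))
print("same R-squared:    ", np.isclose(r2_raw, r2_orth))
print("raw coefficients:       ", beta_raw)
print("orthogonal coefficients:", beta_orth)
```

The fitted values and R-squared agree, while the two coefficient vectors differ because they refer to different bases.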

To choose the polynomial degree, run regressions with different degrees and compare AIC or Adj. R-squared in [Linear Regression](linear-regression). Orthogonal polynomials improve numerical precision but do not prevent overfitting.
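The degree comparison can be sketched as follows. This is an illustration under stated assumptions: the data is synthetic, the helper `fit_stats` is hypothetical, and AIC is computed from the Gaussian form $n \ln(\mathrm{RSS}/n) + 2k$ with $k = d + 1$ coefficients, which may differ by a constant or in parameter count from the value the Linear Regression tab reports.

```python
import numpy as np

def fit_stats(x, y, degree):
    """Hypothetical helper: return (aic, adj_r2) for a degree-d polynomial fit."""
    n = x.size
    v = np.vander(x - x.mean(), degree + 1, increasing=True)
    q, _ = np.linalg.qr(v)            # orthonormal basis for the polynomial space
    fitted = q @ (q.T @ y)            # least-squares fit as a projection
    rss = float(np.sum((y - fitted) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    k = degree + 1                    # coefficients including the intercept
    adj_r2 = 1.0 - (rss / (n - k)) / (tss / (n - 1))
    aic = n * np.log(rss / n) + 2 * k  # Gaussian AIC up to an additive constant
    return aic, adj_r2

# Synthetic quadratic data with noise (hypothetical example values).
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 60)
y = 2.0 - 1.5 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)

results = {d: fit_stats(x, y, d) for d in range(1, 7)}
best_by_aic = min(results, key=lambda d: results[d][0])

for d, (aic, adj_r2) in results.items():
    print(f"degree {d}: AIC = {aic:8.2f}, adjusted R-squared = {adj_r2:.4f}")
print("lowest AIC at degree", best_by_aic)
```

Lower AIC and higher adjusted R-squared both favor a model; degrees beyond the true structure typically yield only marginal changes in either criterion.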

## Next steps {#next-steps}

- **[Linear Regression](linear-regression)** - Regression analysis with orthogonal polynomial columns

## See also {#see-also}

- **[Numerical Computing Fundamentals](concepts-numerical#polynomial-regression)** - How condition numbers affect accuracy
- **[Numerical Accuracy](numerical-accuracy)** - NIST StRD benchmark accuracy verification
- **[Dummy Coding](dummy-coding)** - Encoding categorical variables as dummy variables
