Numerical Computing Fundamentals

This page covers the concepts behind the Numerical Accuracy page.

Floating-Point Numbers and Significant Digits

Computers represent real numbers as finite-precision approximations. JavaScript's number type is an IEEE 754 double-precision float, so browser applications, including MIDAS, compute in double precision.

A double-precision number represents a real number as $(-1)^s \times (1 + f) \times 2^e$, where $s$ is a 1-bit sign, $e$ is an 11-bit exponent encoding the scale, and $f$ is the 52-bit fraction field with $0 \le f < 1$. For normal numbers¹, the integer part of $(1 + f)$ is always 1, so this 1 is not stored but implied. The 52 stored bits plus the implicit bit give 53 bits of effective precision, corresponding to about 15.9 decimal digits:

$$\log_{10}(2^{53}) \approx 15.95$$

A single double-precision number can therefore carry at most about 15.9 significant digits. On the Numerical Accuracy page, an LRE (log relative error, a measure of how many significant digits of a computed result agree with the certified value) close to this value means the computation is accurate to the limit of double precision.
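
To make these limits concrete, here is a minimal TypeScript sketch (MIDAS itself runs on doubles in the browser, but this snippet is an illustration, not MIDAS code):

```typescript
// 53 bits of effective precision correspond to about 15.95 decimal digits.
console.log(Math.log10(2 ** 53)); // 15.954589770191003

// Number.EPSILON is the gap between 1 and the next representable double (2^-52).
console.log(Number.EPSILON); // 2.220446049250313e-16

// 0.1 has no finite binary representation, so it is rounded on input;
// printing extra digits exposes the stored approximation.
console.log((0.1).toPrecision(20)); // "0.10000000000000000555"
```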

When a real number is converted to floating point, any fraction that cannot be represented in a finite number of bits is rounded. This rounding error is tiny for a single operation, but it can accumulate over successive computations. How much it accumulates depends on the algorithm and the data; catastrophic cancellation and condition numbers, discussed below, are the primary factors.
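
For example, repeatedly adding 0.1 (which is itself rounded on input) drifts away from the exact result. A minimal sketch:

```typescript
// Each addition rounds the running sum; over many operations the
// individual rounding errors accumulate rather than cancel exactly.
let sum = 0;
for (let i = 0; i < 10_000; i++) sum += 0.1;

console.log(sum === 1000);         // false
console.log(Math.abs(sum - 1000)); // small but nonzero accumulated error
```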

Catastrophic Cancellation

Catastrophic cancellation occurs when subtracting two nearly equal floating-point numbers, causing a large loss of significant digits.

Consider $a = 1.234567890123\mathbf{45}$ and $b = 1.234567890123\mathbf{00}$. Both have 15 significant digits, but their difference $a - b = 0.00000000000045$ has only 2. The leading zeros merely indicate position and are not significant, so only the trailing 45 carries meaningful information. The 13 digits that were shared between $a$ and $b$ cancel out, leaving only the least reliable bits.
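
The same example in code; the exact noise digits depend on how $a$ and $b$ were rounded on input:

```typescript
const a = 1.23456789012345;
const b = 1.23456789012300;

// The exact difference is 4.5e-13. The computed difference is close in
// absolute terms, but the input rounding errors in a and b are of order
// 1e-16, which is large relative to a result of order 1e-13, so only
// the leading digit or two of the result can be trusted.
console.log(a - b);
console.log(Math.abs((a - b) - 4.5e-13) / 4.5e-13); // far above 1e-16
```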

In statistics, variance computation is a classic example. The definitional form $\sum_i (x_i - \bar{x})^2 / (n-1)$ sums non-negative squared deviations, so no two large, nearly equal totals are ever subtracted. The algebraically equivalent form $(\sum_i x_i^2 - n\bar{x}^2) / (n-1)$ is vulnerable to cancellation when $\sum_i x_i^2$ and $n\bar{x}^2$ are close in magnitude. This happens when the mean is large relative to the standard deviation.
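
The contrast is easy to reproduce. The sketch below uses hypothetical data with mean $\approx 10^8$ and standard deviation $\approx 1$; the data and variable names are ours, not MIDAS's:

```typescript
// Hypothetical data with mean ~1e8 and standard deviation ~1.
const x: number[] = [];
for (let i = 0; i < 1000; i++) x.push(1e8 + Math.sin(i));

const n = x.length;
const mean = x.reduce((s, v) => s + v, 0) / n;

// Definitional (two-pass) form: subtract the mean first, then square.
// The deviations are of order 1, so squaring and summing them is safe.
const twoPass = x.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1);

// "Textbook" one-pass form: sum of squares minus n * mean^2.
// Both terms are ~1e19 and nearly equal, so their difference cancels.
const sumSq = x.reduce((s, v) => s + v * v, 0);
const onePass = (sumSq - n * mean * mean) / (n - 1);

console.log(twoPass); // close to the true variance (~0.5)
console.log(onePass); // can be wildly off, even negative
```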

Condition Number

The condition number measures how much small perturbations in the input are amplified in the output. For a nonsingular matrix $A$, the condition number is defined as:

$$\kappa(A) = \|A\| \cdot \|A^{-1}\|$$

Here $\|A\|$ is the matrix 2-norm (the largest singular value). Since $\|A\| \cdot \|A^{-1}\| \ge \|AA^{-1}\| = \|I\| = 1$, the condition number has a lower bound of 1. A matrix with a condition number close to 1 is called well-conditioned; one with a large condition number is called ill-conditioned. There is no sharp threshold; the distinction depends on the precision required.
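
For a $2 \times 2$ matrix, the 2-norm condition number can be computed directly: the singular values of $A$ are the square roots of the eigenvalues of $A'A$. A minimal sketch (the function name is ours):

```typescript
// 2-norm condition number of a 2x2 matrix [[a, b], [c, d]],
// via the eigenvalues of A'A (their square roots are the singular values).
function cond2x2(a: number, b: number, c: number, d: number): number {
  // Entries of the symmetric matrix A'A = [[p, q], [q, r]].
  const p = a * a + c * c;
  const q = a * b + c * d;
  const r = b * b + d * d;
  // Eigenvalues of a symmetric 2x2 matrix, by the quadratic formula.
  const mid = (p + r) / 2;
  const disc = Math.sqrt(((p - r) / 2) ** 2 + q * q);
  // Note: mid - disc itself cancels catastrophically for nearly singular
  // matrices -- fine for an illustration, not for production code.
  return Math.sqrt((mid + disc) / (mid - disc));
}

console.log(cond2x2(1, 0, 0, 1));      // identity: exactly 1
console.log(cond2x2(1, 1, 1, 1.0001)); // nearly singular: ~4e4
```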

When solving $Ax = b$ in floating-point arithmetic, a relative perturbation of $\delta$ in $b$ can produce a relative error up to $\kappa(A) \cdot \delta$ in $x$. A condition number of $\kappa$ can destroy up to $\log_{10}(\kappa)$ digits of precision, so starting from double precision's roughly 15.9 digits, the number of surviving significant digits is at least approximately:

$$\text{significant digits} \gtrsim 15.9 - \log_{10}(\kappa)$$

This lower bound assumes a numerically stable algorithm. An unstable algorithm may lose more digits than this.
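
As a worked example, a hypothetical system with $\kappa(A) = 10^8$ retains roughly

$$15.9 - \log_{10}(10^8) = 15.9 - 8 \approx 7.9$$

significant digits, while $\kappa(A) \approx 10^{16}$ would leave essentially none.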

Design Matrix

In the linear regression model $Y = X\beta + \varepsilon$, $X$ is the design matrix, $Y$ is the response vector, $\beta$ is the coefficient vector, and $\varepsilon$ is the error term. The design matrix has $n$ rows (one per observation) and $p$ columns (one per predictor). When an intercept is included, a column of ones is added.

The OLS (ordinary least squares) estimator is defined as $\hat\beta = (X'X)^{-1}X'Y$, where $X'$ denotes the transpose of $X$. A large condition number in the design matrix causes rounding errors to be amplified, reducing the accuracy of the computed $\hat\beta$. This is a numerical issue, distinct from the statistical problem of inflated variance due to multicollinearity. Design matrices become ill-conditioned when predictors are highly correlated; VIF in OLS Fundamentals measures a related phenomenon for individual predictors.
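
To make the formula concrete, the sketch below solves the normal equations directly for a one-predictor model with intercept, where $X'X$ is only $2 \times 2$. This is illustrative, not how a careful solver works: stable implementations factor $X$ (for example by QR) rather than forming $X'X$ explicitly.

```typescript
// OLS for y = b0 + b1 * x, solved from the normal equations
// (X'X) beta = X'Y with the closed-form inverse of a 2x2 matrix.
function olsSimple(x: number[], y: number[]): { b0: number; b1: number } {
  const n = x.length;
  const sx = x.reduce((s, v) => s + v, 0);
  const sy = y.reduce((s, v) => s + v, 0);
  const sxx = x.reduce((s, v) => s + v * v, 0);
  const sxy = x.reduce((s, v, i) => s + v * y[i], 0);
  // X'X = [[n, sx], [sx, sxx]], X'Y = [sy, sxy].
  // det cancels badly when the x-values have a large mean and small
  // spread -- the same pattern as the one-pass variance above.
  const det = n * sxx - sx * sx;
  return {
    b0: (sxx * sy - sx * sxy) / det,
    b1: (n * sxy - sx * sy) / det,
  };
}
```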

See OLS Fundamentals for the full mathematical treatment of the design matrix and OLS estimator properties.

Polynomial Regression

In polynomial regression, the columns of the design matrix are $1, x, x^2, \ldots, x^d$. As the degree $d$ increases, the power columns become strongly correlated and the design matrix becomes ill-conditioned.
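
A quick way to see this is to correlate adjacent power columns on a grid. The sketch below (grid and degrees chosen arbitrarily) computes their Pearson correlation:

```typescript
// Pearson correlation between two columns.
function corr(u: number[], v: number[]): number {
  const n = u.length;
  const mu = u.reduce((s, a) => s + a, 0) / n;
  const mv = v.reduce((s, a) => s + a, 0) / n;
  let suv = 0, suu = 0, svv = 0;
  for (let i = 0; i < n; i++) {
    suv += (u[i] - mu) * (v[i] - mv);
    suu += (u[i] - mu) ** 2;
    svv += (v[i] - mv) ** 2;
  }
  return suv / Math.sqrt(suu * svv);
}

// Adjacent power columns x^(d-1) and x^d on an evenly spaced grid over [1, 2].
const xs = Array.from({ length: 101 }, (_, i) => 1 + i / 100);
for (const d of [2, 5, 10]) {
  console.log(d, corr(xs.map((v) => v ** (d - 1)), xs.map((v) => v ** d)));
  // correlation climbs toward 1 as d grows
}
```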

The NIST StRD datasets on the Numerical Accuracy page show this effect. The simple regression dataset Norris achieves a coefficient LRE of 12.3, but the 10th-degree polynomial Filip drops to 7.3. Filip's design matrix has a condition number around $10^{14}$, and the increasing condition number with polynomial degree limits precision.

To reduce the condition number, replace the monomial basis $1, x, x^2, \ldots$ with an orthogonal polynomial basis. With an orthogonal basis, the columns of the design matrix are mutually orthogonal, reducing the condition number to approximately 1. The Orthogonal Polynomials tab in MIDAS applies this transformation.
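
One way to build such a basis is to orthonormalize the monomial columns with modified Gram-Schmidt, sketched below. This is only one construction; MIDAS's Orthogonal Polynomials tab may build the basis differently (for example via three-term recurrences):

```typescript
// Orthonormalize design-matrix columns with modified Gram-Schmidt.
function orthogonalize(cols: number[][]): number[][] {
  const q: number[][] = [];
  for (const col of cols) {
    let v = col.slice();
    for (const u of q) {
      const coef = dot(v, u); // u is already unit-length
      v = v.map((vi, i) => vi - coef * u[i]);
    }
    const norm = Math.sqrt(dot(v, v));
    q.push(v.map((vi) => vi / norm));
  }
  return q;
}

function dot(a: number[], b: number[]): number {
  return a.reduce((s, ai, i) => s + ai * b[i], 0);
}

// Monomial columns 1, x, x^2, x^3 on a grid over [0, 1].
const grid = Array.from({ length: 50 }, (_, i) => i / 49);
const q = orthogonalize([0, 1, 2, 3].map((d) => grid.map((v) => v ** d)));
console.log(dot(q[0], q[3])); // ≈ 0: the columns are now orthogonal
```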

See also

  - Numerical Accuracy
  - OLS Fundamentals

Footnotes

  1. Normalization adjusts the exponent so that the integer part is 1, making the representation unique. For example, $0.5$ is represented as $1.0 \times 2^{-1}$. However, the exponent has a minimum value. For numbers close enough to zero that this minimum is reached, the integer part can no longer be kept at 1, producing subnormal numbers, where the implicit leading 1 does not apply. Subnormal numbers have reduced precision, but the values encountered in statistical computations are never in this range.