DoE Analysis (Design of Experiments)

The DoE Analysis tab analyzes 2-level factorial experiment data. It supports orthogonal array generation, ANOVA for testing factor effects, and visualization through main effects plots, interaction plots, and Pareto charts.

Open the Tab

Select Analysis > DoE Analysis... from the menu bar.

Generate an Orthogonal Array

Select Data > New DoE Design... from the menu bar, or click New Design... in the top-right corner of the DoE Analysis tab.

Define Factors

Enter a name and two level labels for each factor. At least 2 factors are required.

Array Type

Select an orthogonal array type based on the number of factors.

Type   Runs   Max Factors
L4     4      3
L8     8      7
L16    16     15

Orthogonal arrays are generated from Hadamard matrices. Every pair of factors has balanced level combinations, and factors are uncorrelated in the design matrix.
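The text does not say which Hadamard construction is used. As a minimal sketch, assuming the Sylvester doubling construction, the following builds a ±1 matrix and drops the all-ones first column to leave the factor columns. The function names are hypothetical, and how ±1 maps to your level labels is not specified here.

```python
def sylvester_hadamard(order):
    """Return a Hadamard matrix of the given order (a power of two) as nested lists."""
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def two_level_array(runs):
    """Drop the all-ones first column, leaving runs - 1 factor columns."""
    return [row[1:] for row in sylvester_hadamard(runs)]


l8 = two_level_array(8)

# Balance: every column has as many +1 as -1 entries.
column_sums = [sum(row[j] for row in l8) for j in range(7)]

# Orthogonality: every pair of distinct columns has a zero dot product.
pair_dots = [
    sum(row[i] * row[j] for row in l8)
    for i in range(7)
    for j in range(i + 1, 7)
]
```

With 8 runs this yields 7 columns, matching the L8 row of the table. Both `column_sums` and `pair_dots` contain only zeros, which is the balance and uncorrelatedness property described above.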

Replication

Each row in the orthogonal array represents one experimental condition. To run the same condition multiple times, add duplicate rows to your data. Without replication, residual degrees of freedom are low and error estimation becomes imprecise. For example, an L4 with 3 factors and main effects only uses 3 terms plus the intercept, which consumes all 4 rows and leaves zero residual degrees of freedom. Add replicates, or choose an array type with more runs than the model has parameters.
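The arithmetic behind the L4 example can be checked with a short sketch. The helper name `residual_df` is hypothetical, and it assumes every run is replicated the same number of times.

```python
def residual_df(n_runs, n_replicates, n_terms):
    """Residual DF = total rows - (model terms + intercept)."""
    n_rows = n_runs * n_replicates
    return n_rows - (n_terms + 1)


unreplicated_l4 = residual_df(n_runs=4, n_replicates=1, n_terms=3)  # 4 - 4 = 0
replicated_l4 = residual_df(n_runs=4, n_replicates=2, n_terms=3)    # 8 - 4 = 4
larger_array = residual_df(n_runs=8, n_replicates=1, n_terms=3)     # 8 - 4 = 4
```

Either doubling the L4 or moving the same 3 factors to an L8 leaves 4 residual degrees of freedom for estimating error.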

Randomization

Enable Randomize run order to shuffle the experimental conditions. Randomization prevents systematic bias from run order and is recommended for actual experiments.
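How the application shuffles rows internally is not documented here. A minimal sketch of run-order randomization, assuming a plain uniform shuffle and a hypothetical `randomize_run_order` helper, looks like this:

```python
import random


def randomize_run_order(design, seed=None):
    """Return (run number, standard-order number, condition) in shuffled order."""
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    order = list(range(len(design)))
    rng.shuffle(order)
    return [
        (run_no + 1, std_idx + 1, design[std_idx])
        for run_no, std_idx in enumerate(order)
    ]


design = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
plan = randomize_run_order(design, seed=42)
```

Keeping the standard-order number next to each run makes it easy to sort the results back into the original array layout after the experiment.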

Generate the Dataset

After previewing the array, click Generate to add it as a dataset. The response column is created empty. Double-click cells in the data table to enter your experimental results.

Run an Analysis

Configure the following in the settings panel:

  1. Select a dataset from Dataset
  2. Select a numeric variable for Response Variable
  3. Select 2-level categorical variables under Factors (at least 2)
  4. Choose a Model type
  5. Set the Significance Level. The default is α = 0.05
  6. Click Run Analysis

DoE setup with 3 factors selected and all 2-factor interactions model

Data Requirements

Factors must be categorical variables. Columns with nominal or ordinal measurement scale appear as factor candidates. The current version supports only 2-level factors. Selecting a factor with 3 or more levels produces an error message when you run the analysis.

Select a numeric column for the response variable.

Model Selection

Main effects only: Includes only the main effect of each factor. Use this when interactions between factors can be assumed negligible.

Main effects + all 2-factor interactions: Adds all pairwise 2-factor interaction terms to the main effects. Use this when interactions may be present. More factors mean more interaction terms and fewer residual degrees of freedom. For example, selecting all 7 factors from an L8 with all 2-factor interactions requires 7 main effects + 21 interactions + intercept = 29 parameters, but only 8 rows of data are available. Start with main effects only and add interactions as needed.
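The parameter count in the example above follows from a simple combinatorial formula. The sketch below reproduces it; the function name and the model labels `"main"` and `"all_2fi"` are hypothetical and are not identifiers from the application.

```python
from math import comb


def parameter_count(n_factors, model="main"):
    """Number of estimated parameters, including the intercept."""
    if model == "main":
        interactions = 0
    elif model == "all_2fi":
        interactions = comb(n_factors, 2)  # one term per pair of factors
    else:
        raise ValueError(f"unknown model: {model}")
    return n_factors + interactions + 1


seven_factors = parameter_count(7, "all_2fi")  # 7 + 21 + 1 = 29
three_factors = parameter_count(3, "all_2fi")  # 3 + 3 + 1 = 7
```

A model can only be fitted when the parameter count does not exceed the number of rows, so 29 parameters cannot be estimated from an unreplicated L8, while 3 factors with all 2-factor interactions fit in an L8 with 1 residual degree of freedom.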

Main effects + selected interactions: Adds only the interaction terms you select.

Reading the Results

Results are displayed across four sub-tabs.

ANOVA Table

Tests the effect of each factor and interaction using Type III sums of squares.

Column    Description
Source    Factor or interaction name
DF        Degrees of freedom. Each term has 1 DF for 2-level factors
Adj SS    Adjusted sum of squares, controlling for all other terms
Adj MS    Adjusted mean square (Adj SS / DF)
F-Value   F statistic (Adj MS / Error MS)
P-Value   Upper-tail probability of the F distribution at the observed F-Value

R-squared, Adjusted R-squared, and Model SE are shown below the table.

ANOVA table with assumption diagnostics for a 3-factor model with all 2-factor interactions

Click a row to highlight the corresponding factor in the main effects or interaction plots.

Main Effects Plot

Displays the observed mean response at each level of each factor as a line chart. Each factor gets its own subplot, and the Y-axis scale is shared across all subplots. A steeper slope indicates a larger effect on the response.

The horizontal dashed line represents the grand mean.

Enable Show 95% confidence intervals to display error bars for each level mean. The standard error is $\text{SE} = \sqrt{\text{MSE} / n_i}$, where MSE is the residual mean square from the fitted model and $n_i$ is the number of observations at that level. The interval half-width is the SE multiplied by the t-distribution critical value for the residual degrees of freedom. These error bars show the precision of individual level means. Do not judge the significance of differences between levels by whether error bars overlap. Use the F-test in the ANOVA Table for that purpose.
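As a worked sketch of the interval calculation, the helper below takes the critical t-value as an input rather than computing it, because the Python standard library has no t-distribution. The function name and the numbers are illustrative only.

```python
import math


def level_mean_ci(level_mean, mse, n_i, t_crit):
    """Confidence interval for one level mean: mean +/- t_crit * sqrt(MSE / n_i)."""
    se = math.sqrt(mse / n_i)
    half_width = t_crit * se
    return level_mean - half_width, level_mean + half_width


# Illustrative values: MSE = 4.0 with 4 observations at this level gives SE = 1.0.
# 2.776 is the two-sided 95% critical t-value for 4 residual degrees of freedom.
low, high = level_mean_ci(level_mean=50.0, mse=4.0, n_i=4, t_crit=2.776)
```

Because the same MSE is used for every level, levels with equal $n_i$ get error bars of equal length.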

Main effects plot showing level means with 95% confidence intervals for Temperature, Pressure, and Catalyst

Click a point to select the corresponding rows in the data table.

Interaction Plot

Shows cell means for each pair of factors. The X-axis represents one factor, and color-coded lines represent levels of the other factor. With $n$ factors, $\binom{n}{2}$ subplots are drawn.

Lines that are nearly parallel suggest little interaction. Crossing lines indicate that one factor's effect depends on the level of the other factor. Check the ANOVA Table for formal significance of interactions.
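The idea of parallel versus non-parallel lines can be expressed numerically as a difference of differences between cell means. This is a sketch with made-up data and hypothetical function names, not the calculation the application performs for the plot.

```python
from collections import defaultdict


def cell_means(rows):
    """rows: iterable of (level_a, level_b, response). Returns {(a, b): mean}."""
    sums = defaultdict(lambda: [0.0, 0])
    for a, b, y in rows:
        sums[(a, b)][0] += y
        sums[(a, b)][1] += 1
    return {cell: total / n for cell, (total, n) in sums.items()}


def interaction_contrast(means, a_levels, b_levels):
    """Effect of A at the second level of B minus the effect of A at the first."""
    a1, a2 = a_levels
    b1, b2 = b_levels
    return (means[(a2, b2)] - means[(a1, b2)]) - (means[(a2, b1)] - means[(a1, b1)])


# Made-up responses in which A raises the response by 10 at both levels of B.
rows = [("Low", "Low", 10), ("High", "Low", 20), ("Low", "High", 12), ("High", "High", 22)]
means = cell_means(rows)
contrast = interaction_contrast(means, ("Low", "High"), ("Low", "High"))
```

A contrast of zero corresponds to exactly parallel lines. The further it is from zero, the more the lines diverge or cross.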

Interaction plot showing cell means for all 3 factor pairs. Lines are nearly parallel, indicating small interactions

Click a point to select the corresponding cell's rows in the data table.

Pareto Chart

Compares the magnitude of each factor and interaction effect using the standardized effect $|t|$. Bars are sorted in descending order of $|t|$.

The red dashed vertical line marks the critical t-value $t_{\alpha/2,\,\nu}$ for the configured significance level α, where $\nu$ is the residual degrees of freedom. Bars exceeding this line are statistically significant at that level. Significant bars are shown in blue, non-significant bars in gray.
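The ordering and the significance cut-off can be sketched as follows. The t-values are made up for illustration, the critical value is passed in rather than computed, and `pareto_rows` is a hypothetical name.

```python
def pareto_rows(t_values, t_crit):
    """Return (term, |t|, significant) tuples sorted by |t| in descending order."""
    rows = [(name, abs(t), abs(t) > t_crit) for name, t in t_values.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up t-values. 2.776 is the two-sided critical value for alpha = 0.05 and 4 DF.
t_values = {"Temperature": 4.1, "Pressure": -6.3, "Temperature*Pressure": 0.8}
chart = pareto_rows(t_values, t_crit=2.776)
```

The sign of $t$ is dropped before sorting, so a strongly negative effect such as Pressure here ranks first.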

Pareto chart showing Pressure, Temperature, and Catalyst as significant main effects. Interactions are non-significant

Click a bar to switch to the corresponding main effects or interaction plot.

Statistical Model

Effect Coding

Each 2-level factor is coded as +1 for the first level (alphabetically) and -1 for the second level. Because the two levels are placed at ±1, the regression coefficient in a balanced design equals half the difference between level means. The effect estimates shown in the ANOVA table and main effects plot are twice the coefficient, representing the full difference in mean response between levels.

Interaction columns are the element-wise product of the corresponding main effect columns.
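A minimal sketch of this coding is shown below. It assumes that "alphabetically" corresponds to Python's default string ordering, which may differ from the application's collation for mixed case or non-ASCII labels. The function names are hypothetical.

```python
def effect_code(values):
    """Code a 2-level factor: first level in sorted order -> +1, second -> -1."""
    levels = sorted(set(values))
    if len(levels) != 2:
        raise ValueError(f"expected exactly 2 levels, found {len(levels)}")
    first = levels[0]
    return [1 if v == first else -1 for v in values]


def interaction_column(col_a, col_b):
    """Element-wise product of two coded main effect columns."""
    return [a * b for a, b in zip(col_a, col_b)]


temperature = effect_code(["High", "Low", "High", "Low"])    # "High" sorts first
pressure = effect_code(["1 bar", "1 bar", "5 bar", "5 bar"])  # "1 bar" sorts first
temp_x_pressure = interaction_column(temperature, pressure)
```

Here `temperature` is `[1, -1, 1, -1]`, `pressure` is `[1, 1, -1, -1]`, and their interaction column is `[1, -1, -1, 1]`.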

Estimation

A design matrix including the intercept and all terms is constructed, and coefficients are estimated by ordinary least squares via Householder QR decomposition.
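To illustrate the technique, here is a compact pure-Python sketch of least squares via Householder reflections. It is not the application's code: it omits pivoting and only detects exact rank deficiency, and the function name and data are hypothetical.

```python
import math


def householder_qr_solve(X, y):
    """Least-squares solution of X b = y using Householder reflections."""
    m, n = len(X), len(X[0])
    A = [list(map(float, row)) for row in X]  # becomes R in its upper triangle
    b = [float(v) for v in y]                 # becomes Q^T y
    for k in range(n):
        norm = math.sqrt(sum(A[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            raise ValueError("design matrix is rank deficient")
        # Choose the sign that avoids cancellation when forming v.
        alpha = -norm if A[k][k] >= 0 else norm
        v = [A[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(x * x for x in v)
        # Apply H = I - 2 v v^T / (v^T v) to the remaining columns and to b.
        for j in range(k, n):
            s = 2.0 * sum(v[i - k] * A[i][j] for i in range(k, m)) / vtv
            for i in range(k, m):
                A[i][j] -= s * v[i - k]
        s = 2.0 * sum(v[i - k] * b[i] for i in range(k, m)) / vtv
        for i in range(k, m):
            b[i] -= s * v[i - k]
    # Back substitution on the upper-triangular n x n block.
    coef = [0.0] * n
    for k in reversed(range(n)):
        acc = b[k] - sum(A[k][j] * coef[j] for j in range(k + 1, n))
        coef[k] = acc / A[k][k]
    return coef


# Columns: intercept, factor A, factor B (effect coded). Responses are made up.
X = [[1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1]]
y = [10, 12, 20, 22]
coef = householder_qr_solve(X, y)  # approximately [16, -5, -1]
```

The coefficient for A is -5 because the mean response at +1 is 11 and at -1 is 21, and half of that difference is -5. QR is generally preferred over solving the normal equations directly because it avoids squaring the condition number of the design matrix.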

Type III Sums of Squares

Each factor's sum of squares is computed from the full model's t-values as $\text{SS}_j = t_j^2 \times \text{MSE}$. Since each 2-level factor has 1 degree of freedom, $F = t^2$ holds. This evaluates each factor's unique contribution after adjusting for all other factors, independent of the order in which factors are entered. See ANOVA for more on Type III sums of squares.
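The relationship between $t$, SS, and $F$ can be checked numerically. The sketch below assumes a balanced orthogonal design with ±1 coding, in which the standard error of every coefficient is $\sqrt{\text{MSE}/N}$ for $N$ rows. The function name and the numbers are illustrative.

```python
import math


def type3_from_coefficient(coef, mse, n_rows):
    """Return (t, SS, F) for one 1-DF term in a balanced, +/-1 coded design."""
    se = math.sqrt(mse / n_rows)  # holds only when the coded columns are orthogonal
    t = coef / se
    ss = t ** 2 * mse             # SS_j = t_j^2 * MSE
    f = ss / mse                  # 1 DF, so Adj MS equals Adj SS
    return t, ss, f


t, ss, f = type3_from_coefficient(coef=-5.0, mse=2.0, n_rows=8)
```

With these values the standard error is 0.5, so $t = -10$, SS = 200, and $F = 100 = t^2$. In this balanced case SS also equals $N \times \text{coef}^2 = 8 \times 25$.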

Assumptions

This analysis assumes:

  • Independence: Each experimental run is conducted independently
  • Normality: The error in the response variable follows a normal distribution
  • Homogeneity of variance: The error variance is equal across all factor level combinations

Assumption diagnostics appear below the ANOVA table in the ANOVA Table sub-tab.

Levene's Test for Homogeneity of Variances: Tests whether the error variance is equal across all factor level combinations (cells). Uses the Brown-Forsythe variant based on deviations from cell medians.
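The Brown-Forsythe statistic is a one-way ANOVA F computed on absolute deviations from each cell's median. The following sketch computes the statistic and its degrees of freedom only. The p-value would need an F-distribution function, which the Python standard library does not provide. The function name and data are hypothetical.

```python
from statistics import mean, median


def brown_forsythe(groups):
    """groups: one list of responses per cell. Returns (F, df_between, df_within)."""
    # Absolute deviations from each cell's median.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    df1, df2 = k - 1, n - k
    grand = sum(sum(g) for g in z) / n
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in z)
    within = sum((x - mean(g)) ** 2 for g in z for x in g)
    if df1 <= 0 or df2 <= 0 or within == 0:
        raise ValueError("statistic is undefined for this data")
    return (between / df1) / (within / df2), df1, df2


# Made-up data: two cells with three replicates each.
f_stat, df_between, df_within = brown_forsythe([[1, 2, 3], [2, 4, 6]])
```

For this data the statistic is 0.8 on 1 and 4 degrees of freedom. In this sketch the statistic is undefined when cells are unreplicated, because the within-cell degrees of freedom are then zero.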

Residual Normality (Shapiro-Wilk): Tests whether the model residuals follow a normal distribution. Available for sample sizes from 3 to 5000.

Missing Values

Rows with missing values in any factor or the response variable are excluded from the analysis. The number of excluded rows is shown in the results panel. If your data was generated from an orthogonal array and rows are excluded, the design loses its orthogonality. Factors become correlated, effect estimates lose precision, and Type III sums of squares require careful interpretation. Minimize missing data, or interpret results cautiously when exclusions occur.
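The loss of orthogonality is easy to demonstrate on a small example. This sketch uses a made-up 4-run design with effect-coded columns A and B, and hypothetical helper names. It drops incomplete rows and compares the dot product of the two factor columns before and after.

```python
def drop_incomplete(rows):
    """Remove rows containing None. Returns (kept rows, number excluded)."""
    kept = [r for r in rows if all(v is not None for v in r)]
    return kept, len(rows) - len(kept)


def column_dot(rows, i, j):
    """Dot product of columns i and j. Zero means the columns are orthogonal."""
    return sum(r[i] * r[j] for r in rows)


# Columns: A, B, response. The last run has no recorded response.
rows = [(1, 1, 10.0), (1, -1, 12.0), (-1, 1, 20.0), (-1, -1, None)]
kept, n_excluded = drop_incomplete(rows)

dot_before = column_dot(rows, 0, 1)  # 0: the full design is orthogonal
dot_after = column_dot(kept, 0, 1)   # -1: A and B are now correlated
```

Once the dot product is non-zero, the estimate for A depends on whether B is in the model, which is why the results need more careful interpretation.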

See Also

  • ANOVA -- One-way and two-way ANOVA without the 2-level restriction
  • Linear Regression -- Regression analysis with continuous predictors