---
title: Agent API (window.midas)
description: Use the window.midas API to control MIDAS programmatically from AI agents, Playwright, or other external tools. Manage datasets, tabs, statistical models, reports, and layout through JavaScript.
priority: 0.6
---

# Agent API (window.midas) {#agent-api-windowmidas}

A JavaScript API for controlling MIDAS programmatically from AI agents, Playwright, or other external tools.

- [Overview](#overview)
- [Usage](#usage)
- [Response Format](#response-format)
- [Data Model](#data-model)
- [Method Reference](#method-reference): [status](#status) / [project](#project) / [datasets](#datasets) / [enums](#enums) / [tabs](#tabs) / [models](#models) / [reports](#reports) / [layout](#layout)
- [Error Codes](#error-codes)

## Overview {#overview}

`window.midas` provides access to MIDAS features from the browser DevTools console or automation tools like Playwright. Use it to manage datasets, tabs, statistical models, reports, and layout.

### Availability {#availability}

- **Project screen**: All methods are available
- **Launcher screen**: `help()`, `project.openFile()`, and `project.openUrl()` are available. Other methods return a `NO_PROJECT` error

When automating with Playwright, use `project.openFile()` or `project.openUrl()` to open a project, then use the other API methods.
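A minimal sketch of that flow, assuming `midas` is the `window.midas` object (for example inside `page.evaluate`). The helper name `ensureProject` and the URL argument are illustrative and not part of the API.

```javascript
// Hypothetical helper: make sure a project is open before calling other methods.
// `midas` is expected to be window.midas; `url` points at an MDS file.
async function ensureProject(midas, url) {
  const status = await midas.status();
  if (status.success) return { opened: false, status: status.data };

  // On the launcher screen, project-scoped methods fail with NO_PROJECT.
  if (status.error && status.error.code === 'NO_PROJECT') {
    const opened = await midas.project.openUrl(url);
    if (!opened.success) throw new Error(opened.message);
    const after = await midas.status();
    return { opened: true, status: after.data };
  }
  throw new Error(status.message);
}
```

Because `project.openUrl()` always opens in sandbox mode, a flow that needs to save should pass the bytes to `project.openFile()` instead.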

### Quick Start for AI Agents {#quick-start-for-ai-agents}

1. Call [`status()`](#status) to check the current project state (datasets, tabs, models)
2. Call [`datasets.list()`](#datasetslist) and [`datasets.describe(id)`](#datasetsdescribeid) to understand the available data
3. Read [Data Model](#data-model) to understand persistence, dataset types, and graph specification
4. Refer to [Method Reference](#method-reference) for the specific operation you need

## Usage {#usage}

### From DevTools Console {#from-devtools-console}

Open the browser DevTools and call methods directly in the console.

```javascript
// Check project status
const result = await window.midas.status();
console.log(result.data);
// { datasets: 3, tabs: 2, models: 1, ... }
```

### From Playwright {#from-playwright}

```javascript
const result = await page.evaluate(async () => {
  return await window.midas.datasets.list();
});
console.log(result.data);
// [{ id: '...', name: 'Iris', rows: 150, columns: 5, type: 'primary' }, ...]
```

### Viewing Help {#viewing-help}

Call `help()` to see a list of available methods and their signatures. The returned object also includes a `documentation` field with the URL of this page for detailed parameter schemas and configuration options.

```javascript
const help = window.midas.help();
console.log(help.documentation);
// "https://midas-app.org/docs/en/agent-api.md"
```

## Response Format {#response-format}

All methods except `help()` are async and return a unified `APIResult<T>` response. `help()` is synchronous and returns a `HelpInfo` object directly.

```javascript
// Success
{
  success: true,
  message: "Found 3 datasets",
  data: [...]
}

// Failure
{
  success: false,
  message: "Dataset not found",
  error: {
    code: "DATASET_NOT_FOUND",
    message: "No dataset with ID 'abc'",
    suggestion: "Use datasets.list() to see available datasets"
  }
}
```

A `warnings` field may be included when an operation succeeds but has caveats worth surfacing. For example, [`models.run()`](#modelsrunconfig) stores data preparation warnings in the top-level `warnings` and model execution warnings in `data.warnings`.

```javascript
// Example response with warnings
{
  success: true,
  message: "Model run completed",
  warnings: ["3 rows with missing values were excluded from analysis"],
  data: {
    runId: '...',
    warnings: ["Convergence achieved but Hessian is nearly singular"],
    ...
  }
}
```
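Callers often want data-or-throw semantics on top of this shape. The following is a sketch of one way to do that; `unwrap` is a hypothetical helper name, and the `operation` / `data` keys of the returned warnings object are choices made for this example.

```javascript
// Hypothetical helper: turn an APIResult into data-or-throw and gather warnings.
function unwrap(result) {
  if (!result.success) {
    const { code, message, suggestion } = result.error || {};
    const err = new Error(`${code || 'UNKNOWN'}: ${message || result.message}`);
    err.code = code;
    err.suggestion = suggestion;
    throw err;
  }
  // Top-level warnings and data.warnings are kept separate because they
  // describe different stages (e.g. data preparation vs. model execution).
  const nested = result.data && Array.isArray(result.data.warnings)
    ? result.data.warnings
    : [];
  return {
    data: result.data,
    warnings: { operation: result.warnings || [], data: nested },
  };
}
```

Usage: `const { data, warnings } = unwrap(await window.midas.datasets.list());`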

## Data Model {#data-model}

Understanding how MIDAS manages data helps you use the API effectively.

### Persistence {#persistence}

API operations modify the in-memory project state. Nothing is written to [browser storage](privacy-security) until you call [`project.save()`](#projectsave). If you reload the page without saving, all changes from that session are lost.

```javascript
// Modify a dataset's schema
await window.midas.datasets.setColumnSchema('ds_001', 'col_002', { ... });

// At this point, the change exists only in memory

await window.midas.project.save();
// Now written to browser storage
```
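In sandbox mode `project.save()` returns a `SANDBOX_MODE` error, so an automation script may want to treat that case as expected rather than fatal. A sketch, with `saveIfPersistent` as a hypothetical helper name:

```javascript
// Hypothetical helper: save when persistence is available.
// Returns 'saved' on success and 'sandbox' when the project cannot be persisted.
async function saveIfPersistent(midas) {
  const result = await midas.project.save();
  if (result.success) return 'saved';
  if (result.error && result.error.code === 'SANDBOX_MODE') return 'sandbox';
  throw new Error(result.message);
}
```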

### Dataset Types {#dataset-types}

[`datasets.list()`](#datasetslist) returns two types of datasets.

**Primary** — Original data imported from CSV or other files. The data itself is stored in the project file.

**Derived** — Data created by a transformation such as SQL or cross-tabulation. The project file stores the operation definition (e.g., which SQL was executed) rather than the data itself. The data is a cache and is not included in the project file. Each time the project is opened, derived datasets are recomputed from their parent datasets, reflecting the parent data at that point. Use `parentIds` to inspect dependency relationships.

```javascript
const result = await window.midas.datasets.list();
// [
//   { id: 'ds_001', name: 'Sales', type: 'primary', ... },
//   { id: 'derived_001', name: 'Monthly Total', type: 'derived', parentIds: ['ds_001'], ... }
// ]
```

MIDAS also creates temporary Ephemeral Datasets for internal rendering, but these are not included in [`datasets.list()`](#datasetslist).
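To see which derived datasets depend on a given dataset, the `parentIds` arrays can be inverted. This sketch works on the array in `datasets.list()`'s `data` field; `childrenByParent` is an illustrative name.

```javascript
// Build a parent → children index from the array in datasets.list().data.
// Primary datasets have no parentIds, so they only ever appear as keys.
function childrenByParent(datasets) {
  const index = new Map();
  for (const ds of datasets) {
    for (const parentId of ds.parentIds || []) {
      if (!index.has(parentId)) index.set(parentId, []);
      index.get(parentId).push(ds.id);
    }
  }
  return index;
}
```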

### Report Element Lifecycle {#report-element-lifecycle}

A report consists of two parts: Markdown text called **content**, and **elements** such as graphs and model summaries. Writing a reference like `{{graph_builder:element_001}}` in the content renders the corresponding element at that position.

[`reports.addGraph()`](#reportsaddgraphreportid-config) and [`reports.addModelSummary()`](#reportsaddmodelsummaryreportid-modelid) create an element and insert a reference into the content in one step. [`reports.removeElement()`](#reportsremoveelementreportid-elementid) removes both the element and its content reference.

When a model or dataset is deleted, any report elements that depend on it are automatically removed. No manual cleanup is needed.

### Graph Specification {#graph-specification}

Custom Graph configuration has two layers.

**Graph level** — Settings that apply to the entire graph: data source, coordinate system (`coordinates`), faceting (`facets`), and axis scales (`scales`). Specified via [`tabs.configureGraph()`](#tabsconfiguregraphtabid-config) or [`reports.addGraph()`](#reportsaddgraphreportid-config).

**Layer level** — A drawing unit that combines a geometric element (`geom`: points, lines, bars, etc.), statistical transformations (`stats`), and aesthetic mapping (`aes`: which columns map to the x-axis, color, size, etc.). Multiple layers can be stacked on a single graph. Add layers with [`tabs.addGraphLayer()`](#tabsaddgraphlayertabid-layer). Each layer can configure its own color/fill scale via `scales`.

The graph-level `globalAes` serves as the default for all layers, and each layer's `aes` can override it.
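The override rule can be pictured as a key-by-key merge. This is only an illustration: the shallow merge and the `x` / `color` / `size` keys are assumptions for the example, and the actual schema is described in the Custom Graph Reference.

```javascript
// Illustrative only: resolve the aesthetics a layer ends up with.
// Keys set on the layer win; everything else falls back to globalAes.
function effectiveAes(globalAes = {}, layerAes = {}) {
  return { ...globalAes, ...layerAes };
}
```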

For full configuration details, see [Custom Graph](custom-graph) and [Custom Graph Reference](custom-graph-reference).

## Method Reference {#method-reference}

### status() {#status}

Get the current project status.

```javascript
const result = await window.midas.status();
// result.data:
// {
//   datasets: 3,
//   derivedDatasets: 1,
//   tabs: 2,
//   models: 1,
//   reports: 1,
//   activeDatasetId: 'ds_001',
//   activeTabId: 'tab_001'
// }
```

### project {#project}

#### project.save() {#projectsave}

Save the project to browser storage. See [Privacy and Security](privacy-security) for details on storage.

```javascript
await window.midas.project.save();
```

Returns a `SANDBOX_MODE` error in sandbox mode (projects where persistence is disabled, such as demos or trials).

#### project.exportMds() {#projectexportmds}

Export the project as an MDS (MIDAS project file format) binary. The exported data is returned as an `ArrayBuffer`.

```javascript
const result = await window.midas.project.exportMds();
// result.data: { data: ArrayBuffer, size: 12345, suggestedFilename: 'MyProject.mds' }
```

#### project.downloadMds() {#projectdownloadmds}

Download the project as an MDS file through the browser.

```javascript
const result = await window.midas.project.downloadMds();
// result.data: { filename: 'MyProject.mds' }
```

#### project.openFile(data, options?) {#projectopenfile}

Open a project from MDS binary data (`Uint8Array`). Also available on the launcher screen.

```javascript
const buf = await fetch('/project.mds').then(r => r.arrayBuffer());
const result = await window.midas.project.openFile(new Uint8Array(buf));
// result.data: { projectId: 'project-xxx' }
```

Options:

- `sandbox` (boolean, default: `false`) — When `true`, assigns a new ID and skips saving to browser storage
- `onDuplicate` (`'overwrite'` | `'copy'`, default: `'overwrite'`) — Behavior when a project with the same ID already exists in browser storage. `'overwrite'` replaces the existing project; `'copy'` assigns a new ID only when a duplicate exists

If the current project has unsaved changes, a confirmation dialog is shown. Returns `USER_CANCELLED` if the user declines. Signature warnings are not shown as a dialog; they are reported via the `warnings` array in the response.

#### project.openUrl(url, options?) {#projectopenurl}

Fetch an MDS file from a URL and open it. Always opens in sandbox mode (not saved to browser storage). Also available on the launcher screen.

```javascript
const result = await window.midas.project.openUrl('https://example.com/project.mds');
// result.data: { projectId: 'project-xxx' }
```

Options:

- `signal` (AbortSignal) — Used to abort the fetch

If the current project has unsaved changes, a confirmation dialog is shown. Returns `USER_CANCELLED` if the user declines. Signature warnings are not shown as a dialog; they are reported via the `warnings` array in the response.

The same URL validation and security restrictions as [`datasets.importFromURL()`](#datasetsimportfromurlurl-options) apply. Only HTTP/HTTPS protocols are allowed, and access to cloud metadata endpoints is blocked. Blocked URLs return an `INVALID_INPUT` error. Warnings are included in `warnings` for URLs not in the trusted URL list; if "Block connections to untrusted domains" is enabled in settings, untrusted URLs result in an error. See [Privacy and Security](privacy-security) for details. Returns `FETCH_ERROR` on both network failures and timeouts.
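The `signal` option can be combined with the standard `AbortController` to bound how long a script waits. A sketch, with `openUrlWithTimeout` as a hypothetical helper; how an aborted call is reported (rejected promise or error result) is not specified here, so the helper only wires up the signal and returns whatever the call produces.

```javascript
// Hypothetical helper: abort project.openUrl() after timeoutMs milliseconds.
async function openUrlWithTimeout(midas, url, timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await midas.project.openUrl(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so a fast response does not leave it pending.
    clearTimeout(timer);
  }
}
```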

### datasets {#datasets}

#### datasets.list() {#datasetslist}

List all datasets in the project.

```javascript
const result = await window.midas.datasets.list();
// result.data: [{ id, name, rows, columns, type, parentIds? }, ...]
```

`type` is either `'primary'` (imported data) or `'derived'` (created by SQL or other operations). `parentIds` contains the IDs of source datasets for derived datasets. Temporary internal datasets (ephemeral) are not included in the list.

#### datasets.describe(id) {#datasetsdescribeid}

Get detailed information about a dataset. Accepts a dataset ID or name (case-insensitive).

```javascript
const result = await window.midas.datasets.describe('Iris');
// result.data:
// {
//   id: 'ds_001',
//   name: 'Iris',
//   type: 'primary',
//   rowCount: 150,
//   columns: [
//     { id: 'col_001', name: 'sepal_length', type: 'float64', scale: 'ratio' },
//     { id: 'col_002', name: 'species', type: 'enum', scale: 'nominal', enumName: 'species_enum' },
//     ...
//   ]
// }
```

Each column entry contains `id`, `name`, and `type`. `scale` and `enumName` are optional. `enumName` is present when the column type is enum, indicating the associated enum definition name.

#### datasets.profile(id) {#datasetsprofileid}

Get column-level summary statistics for a dataset. Accepts a dataset ID or name (case-insensitive). Returns a `NO_DATA` error for unmaterialized derived datasets — those not yet recomputed, such as right after opening a saved project. Opening the dataset in a tab or running a query that references it triggers materialization.

```javascript
const result = await window.midas.datasets.profile('Iris');
// result.data:
// {
//   id: 'ds_001',
//   name: 'Iris',
//   rowCount: 150,
//   columns: [
//     {
//       name: 'sepal_length', type: 'float64', scale: 'ratio',
//       nullCount: 0, uniqueCount: 35, nonFiniteCount: 0,
//       min: 4.3, max: 7.9, mean: 5.843, median: 5.8, sd: 0.828
//     },
//     {
//       name: 'species', type: 'string', scale: 'nominal',
//       nullCount: 0, uniqueCount: 3,
//       topValues: [
//         { value: 'setosa', count: 50 },
//         { value: 'versicolor', count: 50 },
//         { value: 'virginica', count: 50 }
//       ]
//     },
//     ...
//   ]
// }
```

Every column includes `nullCount` and `uniqueCount`. `uniqueCount` counts all distinct non-null values, including non-finite values (Infinity, NaN). Numeric columns (`int64`, `float64`) additionally include `nonFiniteCount` (number of Infinity/NaN values), `min`, `max`, `mean`, `median`, and `sd` (sample standard deviation, Bessel-corrected with n-1 divisor). Non-finite values are excluded from statistical computations. `min` through `sd` are `null` when no valid numeric values exist. `sd` is also `null` when only one valid value exists (n < 2). String and enum columns include `topValues` (up to 5 most frequent values).
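For reference, the `sd` rules above (finite values only, n-1 divisor, `null` when fewer than two valid values) correspond to the following calculation. This is an independent sketch for checking results, not the MIDAS implementation.

```javascript
// Sample standard deviation as described above: non-finite and non-numeric
// values are dropped, the divisor is n - 1, and the result is null for n < 2.
function sampleSd(values) {
  const xs = values.filter((v) => typeof v === 'number' && Number.isFinite(v));
  if (xs.length < 2) return null;
  const mean = xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const squares = xs.reduce((sum, x) => sum + (x - mean) ** 2, 0);
  return Math.sqrt(squares / (xs.length - 1));
}
```

For `[1, 2, 3, 4]` the mean is 2.5, the squared deviations sum to 5, and the result is `Math.sqrt(5 / 3)`, about 1.291.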

#### datasets.query(sql, options?) {#datasetsquerysql-options}

Execute a SQL query and return rows (read-only, no dataset creation). SQL follows DuckDB syntax.

```javascript
const result = await window.midas.datasets.query(
  'SELECT species, AVG(sepal_length) as avg_sl FROM Iris GROUP BY species'
);
// result.data: { columns: ['species', 'avg_sl'], totalRows: 3, returnedRows: 3, rows: [...] }
```

Table names are automatically resolved from dataset names (case-insensitive). For dataset names containing spaces or non-ASCII characters, use double quotes in SQL (e.g., `SELECT * FROM "My Dataset"`). Use `options.limit` and `options.offset` for pagination. When `limit` is omitted, all rows are returned.

Only single SELECT statements are accepted; multiple statements separated by semicolons and DML/DDL statements are rejected.
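For large results, `limit` and `offset` can be used to read in pages. The sketch below assumes `totalRows` reports the size of the full result regardless of `limit`; `queryAllPages` is a hypothetical helper, and the SQL should include an `ORDER BY` so that pages do not overlap.

```javascript
// Hypothetical helper: read a query result page by page.
async function queryAllPages(midas, sql, pageSize = 1000) {
  const rows = [];
  let offset = 0;
  for (;;) {
    const page = await midas.datasets.query(sql, { limit: pageSize, offset });
    if (!page.success) throw new Error(page.message);
    rows.push(...page.data.rows);
    offset += page.data.returnedRows;
    // Stop on an empty page or once every row has been read.
    if (page.data.returnedRows === 0 || offset >= page.data.totalRows) break;
  }
  return rows;
}
```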

#### datasets.derive(sql, name, options?) {#datasetsderivesql-name-options}

Execute a SQL query and save the result as a new derived dataset. SQL follows DuckDB syntax.

```javascript
const result = await window.midas.datasets.derive(
  'SELECT species, AVG(sepal_length) as avg_sl FROM Iris GROUP BY species',
  'Species Averages'
);
// result.data: { datasetId: 'derived_...', name: 'Species Averages', rowCount: 3, columnCount: 2, overwrote: false }
```

Table names are automatically resolved from dataset names (case-insensitive). For dataset names containing spaces or non-ASCII characters, use double quotes in SQL (e.g., `SELECT * FROM "My Dataset"`). The result data is materialized in memory when `derive()` completes, so it can be read immediately with [`datasets.fetch()`](#datasetsfetchid-options) or [`datasets.profile()`](#datasetsprofileid). By default, if a derived dataset with the same name exists, it is updated in place. The existing dataset ID is preserved, so tabs and dependent datasets referencing that ID remain valid. Set `options.overwrite` to `false` to prevent overwriting; if a derived dataset with the same name already exists, an error is returned.

Three error conditions apply to the output name:

- `SELF_REFERENCE`: The output name resolves to the same dataset ID as any table referenced in the SQL's `FROM` / `JOIN` clauses, or to any ancestor of those referenced tables (a dataset that one of the referenced tables was derived from). The operation would create a dependency cycle and is rejected. Examples: `derive('SELECT species, COUNT(*) FROM Iris GROUP BY species', 'Iris')`, or, given a chain `Iris → A → B`, `derive('SELECT * FROM B', 'A')`.
- `NAME_CONFLICT`: The output name matches an existing primary dataset (primary datasets cannot be overwritten by derived methods).
- `OPERATION_TYPE_MISMATCH`: A derived dataset with the same name was created by a different method (e.g., trying to overwrite an `addColumns` dataset with `derive`).

When the SQL references multiple tables (via `JOIN` or subqueries), all referenced datasets are stored as the derived dataset's `parentIds`. You can inspect them via `datasets.list()` — each derived dataset entry includes a `parentIds` array. The Project Lineage tab shows edges from each parent to the derived dataset.
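The `SELF_REFERENCE` rule can be checked ahead of time from `parentIds`. This sketch works on IDs taken from `datasets.list()`; the function names are illustrative, and resolving names to IDs is left to the caller.

```javascript
// Collect every ancestor of `id` by following parentIds upward.
function ancestorsOf(id, byId, seen = new Set()) {
  const ds = byId.get(id);
  for (const parentId of (ds && ds.parentIds) || []) {
    if (!seen.has(parentId)) {
      seen.add(parentId);
      ancestorsOf(parentId, byId, seen);
    }
  }
  return seen;
}

// True when writing to `outputId` would hit the SELF_REFERENCE rule:
// the output is one of the referenced tables or an ancestor of one.
function wouldSelfReference(datasets, referencedIds, outputId) {
  const byId = new Map(datasets.map((ds) => [ds.id, ds]));
  return referencedIds.some(
    (refId) => refId === outputId || ancestorsOf(refId, byId).has(outputId)
  );
}
```

With the chain `Iris → A → B`, reading from `B` and writing to `A` is flagged, while reading from `Iris` and writing to `B` is not.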

Only single SELECT statements are accepted; multiple statements separated by semicolons and DML/DDL statements are rejected. To bring external data in, use `datasets.importFromURL` or `datasets.importFromBuffer`.

#### datasets.importFromURL(url, options?) {#datasetsimportfromurlurl-options}

Fetch a CSV/TSV file from an external URL and import it as a dataset.

```javascript
const result = await window.midas.datasets.importFromURL(
  'https://example.com/data.csv'
);
// result.data: { datasetId: 'ds_...', name: 'data', rowCount: 100, columnCount: 5 }
```

Use the `name` property in `options` to specify the dataset name after import. When omitted, the name is inferred from the URL. If a dataset with the same name (case-insensitive) already exists, the method returns a `DATASET_ALREADY_EXISTS` error. Set `options.overwrite` to `true` to replace the existing dataset in place (ID preserved). `importFromURL` / `importFromBuffer` can overwrite primary datasets as well (unlike derived methods, which reject such cases with `NAME_CONFLICT`). The returned `columnCount` matches the number of columns in the source file. The internal row number column (`Row#`) added by MIDAS is not included in the count.

Parse failures (empty data, empty header row, column count mismatch between rows, URL validation errors, invalid content type, etc.) return an `INVALID_INPUT` error. Network failures and timeouts return `EXECUTION_ERROR`. Files exceeding the size warning threshold (default 10 MB, configurable in Settings) are still imported, but a size warning is included in `result.warnings`.

URL validation and security restrictions apply. Only HTTP/HTTPS protocols are allowed, and access to cloud metadata endpoints is blocked. Warnings are issued for URLs not in the trusted URL list. If "Block connections to untrusted domains" is enabled in settings, untrusted URLs result in an error. See [Privacy and Security](privacy-security) for details.

#### datasets.importFromBuffer(data, options?) {#datasetsimportfrombufferdata-options}

Import CSV/TSV data from an `ArrayBuffer` or TypedArray (`Uint8Array`, Node.js `Buffer`, etc.) as a dataset. Use this when you want to load a local CSV from a Playwright test without spinning up an HTTP server.

```javascript
// Playwright: load a local CSV via page.evaluate
import { readFileSync } from 'fs';
const csvBytes = Array.from(readFileSync('fixtures/sales.csv'));

const result = await page.evaluate(async (bytes) => {
  const buffer = new Uint8Array(bytes).buffer;
  return await window.midas.datasets.importFromBuffer(buffer, {
    name: 'Sales',
  });
}, csvBytes);
// result.data: { datasetId: 'ds_...', name: 'Sales', rowCount: 500, columnCount: 7 }
```

`data` accepts an `ArrayBuffer` or any `ArrayBufferView` (`Uint8Array`, `DataView`, Node.js `Buffer`, etc.). The `options` properties are:

- `name`: Dataset name after import. Defaults to `"Untitled"`.
- `hasHeader`: Whether to treat the first row as a header. Defaults to `true`.
- `encoding`: Character encoding (`"utf-8"`, `"shift_jis"`, `"euc-jp"`). When omitted, encoding is auto-detected from the byte sequence.
- `overwrite`: Whether to replace an existing dataset with the same name. Defaults to `false`.

The delimiter is auto-detected by PapaParse, so both CSV and TSV can be passed. The returned `columnCount` matches the number of columns in the source file. The internal row number column (`Row#`) added by MIDAS is not included in the count.

If a dataset with the same name (case-insensitive) already exists, the method returns a `DATASET_ALREADY_EXISTS` error. Set `overwrite: true` to replace the existing dataset in place (ID preserved). `importFromBuffer` can overwrite primary datasets as well (unlike derived methods, which reject such cases with `NAME_CONFLICT`). Parse failures (empty data, empty header row, column count mismatch between rows, etc.) return an `INVALID_INPUT` error.
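Test data does not have to come from a file. The sketch below builds UTF-8 CSV bytes in memory; `csvField` and `csvToBytes` are hypothetical helpers that apply the usual CSV quoting rules (wrap fields containing commas, quotes, or line breaks, and double embedded quotes).

```javascript
// Quote a field when it contains a comma, double quote, or line break.
function csvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Encode a header row plus data rows as UTF-8 bytes (a Uint8Array).
function csvToBytes(header, rows) {
  const lines = [header, ...rows].map((row) => row.map(csvField).join(','));
  return new TextEncoder().encode(lines.join('\n') + '\n');
}
```

The resulting `Uint8Array` can be passed directly, for example `importFromBuffer(csvToBytes(['city', 'sales'], [['Tokyo', 10]]), { name: 'Sales', encoding: 'utf-8' })`.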

#### datasets.reloadFromURL(options?) {#datasetsreloadfromurloptions}

Re-fetch datasets that were originally imported from a URL and update them with the latest data.

```javascript
// Re-fetch all URL-sourced datasets
const result = await window.midas.datasets.reloadFromURL();
// result.data: { reloaded: [{ datasetId, name, rowCount, previousRowCount }], failed: [] }

// Re-fetch a specific dataset only
const single = await window.midas.datasets.reloadFromURL({
  datasetId: 'primary_abc123',
});
```

When `options.datasetId` is specified, only that dataset is re-fetched. When omitted, all primary datasets that were imported from a URL are targeted. If the specified dataset ID does not exist, the method returns a `DATASET_NOT_FOUND` error. If the dataset exists but was not imported from a URL, it returns an `INVALID_INPUT` error.

Reload preserves the dataset ID, name, derived datasets, and model bindings. Excluded rows and row comments are cleared. If the CSV schema at the source URL has changed (column count, names, or types), that dataset's reload fails.

`result.data.reloaded` contains information about successfully re-fetched datasets (`datasetId`, `name`, updated `rowCount`, and `previousRowCount`). `result.data.failed` contains information about failed datasets (`datasetId`, `name`, `sourceUrl`, and `error`). When some datasets fail, `success` is still `true`. `success` is `false` only when all reloads fail.
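Because `success` stays `true` on partial failure, callers need to inspect `failed` themselves. A sketch of one way to summarize the result; `summarizeReload` and the shape it returns are illustrative, and the fallback for a missing `data` field is an assumption.

```javascript
// Hypothetical helper: classify a reloadFromURL() result.
function summarizeReload(result) {
  const data = result.data || { reloaded: [], failed: [] };
  return {
    ok: result.success && data.failed.length === 0,
    partial: result.success && data.failed.length > 0,
    // Row count change per reloaded dataset (positive means rows were added).
    rowDeltas: data.reloaded.map((d) => ({
      name: d.name,
      delta: d.rowCount - d.previousRowCount,
    })),
    failedNames: data.failed.map((f) => f.name),
  };
}
```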

#### datasets.addColumns(datasetId, input) {#datasetsaddcolumnsdatasetid-input}

Add computed columns to a dataset. `datasetId` accepts a dataset ID or name (case-insensitive). The result is created as a new derived dataset. `expression` follows DuckDB SQL expression syntax. SQL functions such as `CASE WHEN` and `CAST` are supported.

```javascript
const result = await window.midas.datasets.addColumns('Iris', {
  columns: [
    { name: 'sepal_ratio', expression: 'sepal_length / sepal_width' }
  ]
});
// result.data: { datasetId: 'derived_...', name: '...', rowCount: 150, columnCount: 6 }
```

Use `outputName` to specify the output dataset name. If a derived dataset with the same name exists, it is updated in place, preserving the existing dataset ID. If `outputName` resolves to the source `datasetId` or any of its ancestors (datasets it was derived from), a `SELF_REFERENCE` error is returned to prevent a dependency cycle; if it collides with an existing primary dataset, a `NAME_CONFLICT` error is returned. If the existing dataset was created by a different method, an `OPERATION_TYPE_MISMATCH` error is returned.

#### datasets.addOrthogonalPolynomials(datasetId, input) {#datasetsaddorthogonalpolynomialsdatasetid-input}

Add orthogonal polynomial columns to a dataset. `datasetId` accepts a dataset ID or name (case-insensitive). Used as explanatory variables in polynomial regression.

```javascript
const result = await window.midas.datasets.addOrthogonalPolynomials('Iris', {
  column: 'sepal_length',
  degree: 3
});
// result.data: { datasetId: 'derived_...', name: '...', rowCount: 150, columnCount: 8, columnNames: ['sepal_length_poly1', 'sepal_length_poly2', 'sepal_length_poly3'] }
```

Maximum `degree` is 30. Use `outputName` to specify the output dataset name. If `outputName` resolves to the source `datasetId` or any of its ancestors (datasets it was derived from), a `SELF_REFERENCE` error is returned to prevent a dependency cycle; if it collides with an existing primary dataset, a `NAME_CONFLICT` error is returned. If the existing dataset was created by a different method, an `OPERATION_TYPE_MISMATCH` error is returned.

#### datasets.setColumnSchema(datasetId, columnId, schema) {#datasetssetcolumnschemadatasetid-columnid-schema}

Change a column's data type, measurement scale, or enum definition. `datasetId` accepts a dataset ID or name (case-insensitive).

```javascript
const result = await window.midas.datasets.setColumnSchema('Iris', 'col_002', {
  type: 'enum',
  scale: 'nominal',
  enumName: 'species_enum'
});
// result.data: { datasetId: 'ds_001', columnId: 'col_002', createdDerived: true, derivedDatasetId: 'derived_...', overwrote: false }
```

`schema` accepts `type`, `scale`, and `enumName`. At least one is required. Changing the data type involves SQL type conversion, which creates a new derived dataset. If a derived dataset with the same output name already exists, it is updated in place, preserving the existing dataset ID. If `outputName` points to the source dataset itself or any of its ancestors (datasets it was derived from), a `SELF_REFERENCE` error is returned to prevent a dependency cycle; if it collides with an existing primary dataset, a `NAME_CONFLICT` error is returned. If the existing dataset was created by a different method, an `OPERATION_TYPE_MISMATCH` error is returned. Changing only the measurement scale updates metadata in place without creating a derived dataset.

When converting to enum type, all column values must be in the enum definition or NULL. If out-of-range values are present, the call is rejected with `ENUM_VALUE_MISMATCH`. Use [Convert Column Types](column-type-conversion) first to null-out or exclude unwanted values, or use [`enums.update`](#enumsupdatename-values) to add the missing values to the enum definition.
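To find the offending values before converting, compare the column's values against the enum definition. This sketch takes rows in the shape returned by `datasets.fetch()` and values from `enums.list()`; `valuesOutsideEnum` is a hypothetical helper.

```javascript
// Return the distinct values in `column` that are neither null nor listed in
// the enum definition. A non-empty result means the conversion would be
// rejected with ENUM_VALUE_MISMATCH.
function valuesOutsideEnum(rows, column, enumValues) {
  const allowed = new Set(enumValues);
  const offending = new Set();
  for (const row of rows) {
    const value = row[column];
    if (value !== null && value !== undefined && !allowed.has(value)) {
      offending.add(value);
    }
  }
  return [...offending];
}
```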

#### datasets.remove(id) {#datasetsremoveid}

Remove a dataset from the project. Accepts a dataset ID or name (case-insensitive). Closes any open tabs that reference the dataset, then cascade-deletes all dependent derived datasets and models.

```javascript
await window.midas.datasets.remove('Iris');
```

#### datasets.fetch(id, options?) {#datasetsfetchid-options}

Fetch row data from a dataset without side effects. Accepts a dataset ID or name (case-insensitive).

```javascript
const result = await window.midas.datasets.fetch('Iris', { limit: 5, offset: 0 });
// result.data: {
//   datasetId: 'ds_001', name: 'Iris', totalRows: 150, returnedRows: 5,
//   columns: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'],
//   rows: [{ sepal_length: 5.1, sepal_width: 3.5, ... }, ...]
// }
```

Use `limit` and `offset` to control the range of rows returned. When omitted, all rows are returned. Returns a `NO_DATA` error for unmaterialized derived datasets — those not yet recomputed, such as right after opening a saved project. Opening the dataset in a tab or running a query that references it triggers materialization.
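One way to handle `NO_DATA` is to run a small query against the dataset and retry. This sketch assumes the dataset is addressed by name (table names resolve from dataset names) and that a `limit: 1` query is enough to trigger materialization; `fetchMaterialized` is a hypothetical helper.

```javascript
// Hypothetical helper: fetch rows, materializing a derived dataset if needed.
async function fetchMaterialized(midas, name, options = {}) {
  let result = await midas.datasets.fetch(name, options);
  if (!result.success && result.error && result.error.code === 'NO_DATA') {
    // Double-quote the name and double any embedded quotes (SQL identifier quoting).
    const quoted = `"${name.replace(/"/g, '""')}"`;
    // A query that references the dataset is documented to trigger materialization.
    await midas.datasets.query(`SELECT * FROM ${quoted}`, { limit: 1 });
    result = await midas.datasets.fetch(name, options);
  }
  return result;
}
```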

#### datasets.buildMapping(datasetId, columnId, input) {#datasetsbuildmapping}

Generate a value → canonical mapping dataset from unique values in a string or enum column.

```javascript
// Key Collision: fullwidth normalize → lowercase
const result = await window.midas.datasets.buildMapping('ds_001', 'city', {
  method: { type: 'key_collision', normalizers: ['fullwidth', 'case'] }
});
// result.data: { datasetId: 'primary_...', changedCount: 3, valueCount: 7 }

// Nearest Neighbor: edit distance
const result2 = await window.midas.datasets.buildMapping('ds_001', 'city', {
  method: { type: 'nearest_neighbor', method: 'levenshtein', threshold: 2 }
});
```

`method.type` must be `'key_collision'` or `'nearest_neighbor'`.

**Key Collision** applies deterministic normalizer functions in order to produce the initial canonical value. Specify the application order in `normalizers`. Available normalizers: `'trim'` (strip surrounding whitespace), `'fullwidth'` (NFKC normalization), `'kana'` (katakana → hiragana), `'case'` (lowercase), `'fingerprint'` (lowercase, strip punctuation, deduplicate tokens, sort tokens).
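The effect of normalizer order can be tried outside MIDAS. The sketch below re-implements three of the listed normalizers using standard string methods; it is an approximation for experimentation, and the built-in normalizers may differ in detail.

```javascript
// Illustrative re-implementation of three of the listed normalizers.
const NORMALIZERS = {
  trim: (s) => s.trim(),
  fullwidth: (s) => s.normalize('NFKC'),
  case: (s) => s.toLowerCase(),
};

// Apply normalizers in order to get the initial canonical for one value.
function initialCanonical(value, normalizers) {
  return normalizers.reduce((s, name) => NORMALIZERS[name](s), value);
}

// Group raw values that normalize to the same canonical value.
function groupByCanonical(values, normalizers) {
  const groups = new Map();
  for (const value of values) {
    const key = initialCanonical(value, normalizers);
    groups.set(key, [...(groups.get(key) || []), value]);
  }
  return groups;
}
```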

**Nearest Neighbor** groups nearby values by distance and sets the most frequent value in each cluster as the initial canonical. `method` must be `'levenshtein'` (edit distance). `threshold` (default: 2) sets the distance cutoff. Maximum unique values: 10,000.

Use `overrides` to override the canonical for specific values. Use `name` to set the dataset name (default: `{source}_{column}_mapping`).

The result is a primary dataset with `value` and `canonical` columns.

#### datasets.normalize(datasetId, columnId, mappingDatasetId, input?) {#datasetsnormalize}

Normalize source data using a mapping dataset. Applies the value → canonical transformation via SQL JOIN.

```javascript
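// Build the mapping dataset first, then apply it
const mp = await window.midas.datasets.buildMapping('ds_001', 'city', {
  method: { type: 'key_collision', normalizers: ['fullwidth', 'case'] }
});
const result = await window.midas.datasets.normalize('ds_001', 'city', mp.data.datasetId, {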
  mode: 'replace'
});
// result.data: { datasetId: 'derived_...', name: 'sales_normalized', rowCount: 100, columnCount: 5 }
```

`mode` is `'add'` (default) to append a `{column}_normalized` column, or `'replace'` to replace the original column with `COALESCE(canonical, original_value)`. Use `name` to set the output dataset name (default: `{source}_normalized`). Any dataset with `value` and `canonical` columns can be used as the mapping dataset.
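The two modes can be illustrated in plain JavaScript. This sketch mirrors the `COALESCE(canonical, original_value)` fallback on in-memory rows; applying the same fallback in `add` mode is an assumption, and `normalizeRows` is not part of the API.

```javascript
// Illustrative only: apply a value → canonical mapping to in-memory rows.
function normalizeRows(rows, column, mappingRows, mode = 'add') {
  const canonicalByValue = new Map(mappingRows.map((m) => [m.value, m.canonical]));
  return rows.map((row) => {
    const mapped = canonicalByValue.get(row[column]);
    // COALESCE(canonical, original_value): keep the original when unmapped.
    const normalized = mapped === null || mapped === undefined ? row[column] : mapped;
    return mode === 'replace'
      ? { ...row, [column]: normalized }
      : { ...row, [`${column}_normalized`]: normalized };
  });
}
```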

#### datasets.traceRowLineage(datasetId, rowIndices) {#datasetstracerowlineage}

Trace one hop of row-level lineage: find the parent dataset rows that contributed to the given rows of a derived dataset. Re-runs the derive query with the top-level aggregation removed and the parent `Row #` projected, then matches parent rows by the values of the requested rows.

```javascript
const result = await window.midas.datasets.traceRowLineage('sales_by_region', [0]);
// result.data: { traceable: true, contributions: [{ datasetId: 'ds_001', datasetName: 'sales', rowIndices: [0, 3, 7] }] }
```

`rowIndices` are 0-based row indices within the target dataset. Only non-negative integers are accepted; an empty array, a non-integer, or a negative value returns an `INVALID_INPUT` error. Out-of-range integers are ignored. The returned `contributions` lists the contributing rows per parent dataset, so a JOIN yields multiple entries. Each `rowIndices` is sorted and deduplicated. Passing several rows returns their contributions combined, without a per-row breakdown; call once per row to attribute rows individually.

Traceable shapes: GROUP BY aggregations (including expression keys and GROUP BY by ordinal or output alias), JOIN, FROM subquery and CTE, plain projection and filter, DISTINCT, and whole-table aggregation (all parent rows contribute).

When a query cannot be traced, it returns `traceable: false` with a `reason`.

- `window-function`: contains a window function or a QUALIFY clause
- `set-operation`: contains a set operation such as UNION
- `nondeterministic`: contains TABLESAMPLE or a LIMIT without ORDER BY
- `nested-aggregation`: a FROM subquery or CTE aggregates on its own
- `ambiguous-group-keys`: a GROUP BY key is not in the output, so its value cannot be read per row
- `no-parent-table`: the FROM cannot be resolved to a registered parent dataset
- `not-derived`: the target is not a derived dataset
- `unsupported-operation`: the derived dataset was built by an operation other than a SQL query
- `parse-failed`: the query's structure could not be analyzed

Some reasons include a `detail` string with specifics: the table name that could not be resolved (`no-parent-table`), the original operation type (`unsupported-operation`), or the literal `TABLESAMPLE` (`nondeterministic`).
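Since contributions for several rows are returned combined, per-row attribution requires one call per row. A sketch of that loop, with `traceRowsIndividually` as a hypothetical helper that stops at the first untraceable result:

```javascript
// Hypothetical helper: trace each row separately to keep a per-row breakdown.
async function traceRowsIndividually(midas, datasetId, rowIndices) {
  const perRow = new Map();
  for (const rowIndex of rowIndices) {
    const result = await midas.datasets.traceRowLineage(datasetId, [rowIndex]);
    if (!result.success) throw new Error(result.message);
    if (!result.data.traceable) {
      // Traceability depends on the query, so one refusal applies to all rows.
      return { traceable: false, reason: result.data.reason, detail: result.data.detail };
    }
    perRow.set(rowIndex, result.data.contributions);
  }
  return { traceable: true, perRow };
}
```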

### enums {#enums}

#### enums.create(name, values) {#enumscreatename-values}

Create an enum definition. Up to 50 values can be specified.

```javascript
const result = await window.midas.enums.create('color', ['red', 'green', 'blue']);
// result.data: { name: 'color', valueCount: 3 }
```

#### enums.list() {#enumslist}

List all enum definitions.

```javascript
const result = await window.midas.enums.list();
// result.data: [{ name: 'color', values: ['red', 'green', 'blue'] }, ...]
```

#### enums.update(name, values) {#enumsupdatename-values}

Update the values of an existing enum definition.

```javascript
await window.midas.enums.update('color', ['red', 'green', 'blue', 'yellow']);
```

Up to 50 values can be specified. Removing values is rejected with `ENUM_VALUE_MISMATCH` if any dataset still contains the removed values in a column of this enum type. This preserves the invariant that enum column values are always in the definition or NULL. Use [Convert Column Types](column-type-conversion) first to null-out or exclude those values, or keep the values in the enum definition.
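
Because removals are the risky part of an update, it can help to compute them before calling `enums.update()`. This is a minimal sketch; `planEnumUpdate` is a hypothetical helper, and it only compares the two value lists. It cannot tell whether a removed value is still present in a dataset.

```javascript
// Hypothetical helper (not part of window.midas): diff the current and
// proposed value lists so removals can be reviewed before enums.update().
const MAX_ENUM_VALUES = 50; // documented upper limit

function planEnumUpdate(currentValues, nextValues) {
  if (nextValues.length > MAX_ENUM_VALUES) {
    throw new RangeError(`at most ${MAX_ENUM_VALUES} values are allowed`);
  }
  const current = new Set(currentValues);
  const next = new Set(nextValues);
  return {
    // Values that would disappear; these can trigger ENUM_VALUE_MISMATCH.
    removed: currentValues.filter((value) => !next.has(value)),
    // Values that are new in the proposed list.
    added: nextValues.filter((value) => !current.has(value)),
  };
}
```

The current values can be read from `enums.list()`. An empty `removed` array means the update only adds values, so the `ENUM_VALUE_MISMATCH` check described above does not come into play.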

#### enums.remove(name) {#enumsremovename}

Remove an enum definition. Returns an `ENUM_IN_USE` error if columns still reference this enum.

```javascript
await window.midas.enums.remove('color');
```

### tabs {#tabs}

#### tabs.list() {#tabslist}

List all open tabs.

```javascript
const result = await window.midas.tabs.list();
// result.data: [{ id, type, title }, ...]
```

#### tabs.open(config) {#tabsopenconfig}

Open a new tab. `datasetId` accepts a dataset ID or name (case-insensitive). For a `graph-builder` tab, the dataset is bound to the tab, so it shows in the dataset selector and in [`tabs.getGraphBuilder()`](#tabsgetgraphbuildertabid).

```javascript
// Open Graph Builder
const result = await window.midas.tabs.open({
  type: 'graph-builder',
  title: 'My Graph',
  datasetId: 'ds_001'
});
// result.data: { tabId: 'tab_...', type: 'graph-builder', title: 'My Graph' }

// Open SQL Editor
const result2 = await window.midas.tabs.open({
  type: 'sql-editor',
  initialQuery: 'SELECT * FROM Iris LIMIT 10',
  initialOutputName: 'Preview'
});
```

Available tab types:

| Type | Description |
|------|-------------|
| `graph-builder` | Graph Builder |
| `sql-editor` | SQL Editor |
| `glm` | GLM |
| `glmm` | GLMM |
| `random-forest` | Random Forest |
| `linear-regression` | Linear Regression |
| `pca` | PCA |
| `statistics` | Descriptive Statistics |
| `crosstab` | Crosstab |
| `anova` | ANOVA |
| `kaplan-meier` | Kaplan-Meier |
| `cox-regression` | Cox Regression |
| `doe-analysis` | DOE Analysis |
| `arima` | ARIMA |
| `data-table` | Data Table |
| `report` | Report (requires `reportId`) |
| `computed-column` | Computed Column |
| `dummy-coding` | Dummy Coding |
| `orthogonal-polynomials` | Orthogonal Polynomials |
| `reshape` | Reshape |
| `column-type-conversion` | Type Conversion |
| `enum-definition` | Enum Definition |
| `project-overview` | Project Overview |
| `project-lineage` | Project Lineage |
| `selected-rows` | Selected Rows |
| `excluded-rows` | Excluded Rows |
| `filtered-data` | Filtered Data |
| `model-detail` | Model Detail (requires `modelId`) |
| `glm-diagnostics` | GLM Diagnostics (requires `modelId`) |
| `glm-prediction` | GLM Prediction (requires `modelId`) |
| `sql-query-viewer` | SQL Query Viewer |
| `variant-normalization` | Normalize Variants |
| `apply-mapping` | Apply Mapping |
| `project-diff` | Project Diff |
| `help` | Help |

`report` tabs require `reportId`, and `model-detail`, `glm-diagnostics`, and `glm-prediction` tabs require `modelId`. Set `modelId` to the ID of a saved model obtained from [`models.list()`](#modelslist). Omitting it returns an `INVALID_INPUT` error.

```javascript
// Open a Model Detail tab
const result3 = await window.midas.tabs.open({
  type: 'model-detail',
  modelId: 'model_001'
});
```
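
The ID requirements can be checked before calling `tabs.open()`. The sketch below encodes only the four tab types named above; `missingTabOpenField` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical pre-check (not part of window.midas): which ID field,
// if any, is missing from a tabs.open() config.
const REQUIRED_ID_BY_TAB_TYPE = new Map([
  ['report', 'reportId'],
  ['model-detail', 'modelId'],
  ['glm-diagnostics', 'modelId'],
  ['glm-prediction', 'modelId'],
]);

function missingTabOpenField(config) {
  const required = REQUIRED_ID_BY_TAB_TYPE.get(config.type);
  // Returns the name of the missing field, or null when nothing is missing.
  if (required !== undefined && !config[required]) return required;
  return null;
}
```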

The `glm-diagnostics` tab opens both GLM and linear regression (`linear_regression`) models. For a linear regression model, it hides the Deviance/Pearson residual toggle and changes the heading to Residual Diagnostics.

`models.run()` supports six model types: `glm`, `glmm`, `random_forest`, `arima`, `linear_regression`, and `anova`. The corresponding tabs (`glm`, `glmm`, `random-forest`, `arima`, `linear-regression`, `anova`) plus other analysis tabs (`pca`, `kaplan-meier`, `cox-regression`, `doe-analysis`, `crosstab`, `statistics`) can all be opened with `tabs.open()`, but only the six types above can be fitted programmatically — the rest must be configured and run through the GUI.
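
Model types use underscores while tab types use hyphens, which is easy to mix up. The following sketch converts between the two for the six runnable types listed above. Both function names are hypothetical.

```javascript
// Hypothetical helpers (not part of window.midas).
// The six model types accepted by models.run(), as documented.
const RUNNABLE_MODEL_TYPES = [
  'glm', 'glmm', 'random_forest', 'arima', 'linear_regression', 'anova',
];

// models.run() type -> tabs.open() type, e.g. 'random_forest' -> 'random-forest'.
function tabTypeForModel(modelType) {
  if (!RUNNABLE_MODEL_TYPES.includes(modelType)) {
    throw new RangeError(`not a runnable model type: ${modelType}`);
  }
  return modelType.replace(/_/g, '-');
}

// tabs.open() type -> models.run() type, or null when the tab is GUI-only.
function modelTypeForTab(tabType) {
  const candidate = tabType.replace(/-/g, '_');
  return RUNNABLE_MODEL_TYPES.includes(candidate) ? candidate : null;
}
```

A `null` result from `modelTypeForTab` signals an analysis tab, such as `pca` or `kaplan-meier`, that has to be configured and run through the GUI.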

#### tabs.close(id) {#tabscloseid}

Close a tab.

```javascript
await window.midas.tabs.close('tab_001');
```

#### tabs.closeOthers(keepTabId) {#tabscloseotherskeeptabid}

Close all tabs except the specified one.

```javascript
const result = await window.midas.tabs.closeOthers('tab_001');
// result.data: { closedCount: 3 }
```

#### tabs.getGraphBuilder(tabId) {#tabsgetgraphbuildertabid}

Get the configuration of a Graph Builder tab.

```javascript
const result = await window.midas.tabs.getGraphBuilder('tab_001');
// result.data: { tabId, graphType, datasetId, config, aspectRatio }
```

#### tabs.addGraphLayer(tabId, layer) {#tabsaddgraphlayertabid-layer}

Add a layer to a custom graph. Only works when `graphType` is `'custom'`.

```javascript
const result = await window.midas.tabs.addGraphLayer('tab_001', {
  geom: { type: 'point' },
  aes: { x: 'sepal_length', y: 'sepal_width', color: 'species' }
});
// result.data: { layerIndex: 0 }
```

Aesthetic mappings (`aes`) accept column names or column IDs. Column names are resolved case-insensitively. Available properties are `x`, `y`, `color`, `fill`, `stroke`, `size`, `shape`, `alpha`, `linetype`, `ymin`, `ymax`, `label`, and `group`. Not all properties apply to every geom type; for example, Point and Line do not support `fill`.

To use a fixed color, specify `{ fixedColor: '#FF0000' }`. For a fixed size, use a positive number. For a fixed alpha (opacity), use a number between 0 and 1. Out-of-range fixed values are ignored with a warning.

When `stats` is omitted, `identity` is used by default. When `position` is omitted, bar geom defaults to `{ type: "stack" }` (stacked); other geoms default to `identity`. Each geom allows only specific position types (e.g. line allows only `identity`); specifying a disallowed position returns an `INVALID_INPUT` error.

Use `scales` to configure per-layer color scales (`color`, `fill`). See [configureGraph](#tabsconfiguregraphtabid-config) for details.
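
A layer's `aes` can be linted locally against the property list and fixed-value ranges described above. This is a minimal sketch with a hypothetical `checkAes` helper. It does not know which properties each geom supports, so it only catches misspelled keys and out-of-range fixed numbers.

```javascript
// Hypothetical lint (not part of window.midas) for a layer's `aes` object.
const AES_PROPERTIES = new Set([
  'x', 'y', 'color', 'fill', 'stroke', 'size', 'shape', 'alpha',
  'linetype', 'ymin', 'ymax', 'label', 'group',
]);

function checkAes(aes) {
  const problems = [];
  for (const key of Object.keys(aes)) {
    if (!AES_PROPERTIES.has(key)) problems.push(`unknown aesthetic: ${key}`);
  }
  // A numeric alpha is a fixed opacity and must lie in [0, 1].
  if (typeof aes.alpha === 'number' && !(aes.alpha >= 0 && aes.alpha <= 1)) {
    problems.push('fixed alpha must be between 0 and 1');
  }
  // A numeric size is a fixed size and must be positive.
  if (typeof aes.size === 'number' && !(aes.size > 0)) {
    problems.push('fixed size must be a positive number');
  }
  return problems;
}
```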

When using the Label geom (`{ type: 'label' }`) with aggregating stats (summary, count, bin, etc.), the column mapped via `aes.label` is lost during aggregation. Use `geom.defaults.labelContent` to reference stat output variables instead.

```javascript
await window.midas.tabs.addGraphLayer(tabId, {
  geom: {
    type: 'label',
    defaults: {
      labelContent: { field: '$y', format: '.1f', prefix: 'Mean: ' }
    }
  },
  stats: [{ type: 'summary', params: { fun: 'mean' } }],
});
```

`labelContent` properties:

| Property | Type | Description |
|---|---|---|
| `field` | `string` | Field to display. Stat variables (`$x`, `$y`, `$n`, etc.) or a column name |
| `format` | `string` | d3-format specifier (e.g. `.2f`, `,.0f`) |
| `prefix` | `string` | String prepended to the formatted value |
| `suffix` | `string` | String appended to the formatted value |

When `labelContent` is set, `aes.label` is not required.

Layers also accept these optional properties:

| Property | Type | Description |
|---|---|---|
| `name` | `string` | Display name for the layer |
| `filter` | `string` | SQL WHERE expression to filter data for this layer |
| `visible` | `boolean` | Show or hide the layer (default `true`) |
| `yAxis` | `'primary' \| 'secondary'` | Which Y axis to use |
| `showLegend` | `'auto' \| 'show' \| 'hide'` | Legend visibility for this layer |
| `clickSelection` | `boolean` | Enable click-to-select on data points |

See [Custom Graph Reference](custom-graph-reference) for the list of geom, stat, and position types. Each Statistic's [`params`](custom-graph-reference#stat-params) are documented there with their accepted values and defaults. For facets, coordinates, and other graph-level options, see [Custom Graph](custom-graph).

#### tabs.updateGraphLayer(tabId, layerIndex, layer) {#tabsupdategraphlayertabid-layerindex-layer}

Partially update an existing layer. Only the specified fields are changed; omitted fields retain their current values.

```javascript
await window.midas.tabs.updateGraphLayer('tab_001', 0, {
  geom: { type: 'line' }
});
```

When the geom is changed and the current position is not allowed by the new geom, position is automatically reset to `identity` and a warning is returned. Explicitly specifying a disallowed position returns an `INVALID_INPUT` error.

Pass `scales: null` to remove layer-specific color scales and fall back to the global scale. Pass `position: null` to reset position to the default (unset).

#### tabs.removeGraphLayer(tabId, layerIndex) {#tabsremovegraphlayertabid-layerindex}

Remove a layer.

```javascript
await window.midas.tabs.removeGraphLayer('tab_001', 0);
```

#### tabs.moveToPane(tabId, toPaneId) {#tabsmovetopanetabid-topaneid}

Move a tab to a different pane. Use a pane ID returned by [`layout.split()`](#layoutsplitconfig).

```javascript
await window.midas.tabs.moveToPane('tab_001', 'pane_002');
```

#### tabs.setDataset(tabId, datasetId) {#tabssetdatasettabid-datasetid}

Switch a tab's dataset. `datasetId` accepts a dataset ID or name (case-insensitive). For a `graph-builder` tab, the new dataset shows in the dataset selector and in [`tabs.getGraphBuilder()`](#tabsgetgraphbuildertabid).

```javascript
await window.midas.tabs.setDataset('tab_001', 'ds_002');
```

#### tabs.configureGraph(tabId, config) {#tabsconfiguregraphtabid-config}

Configure a Graph Builder tab in one call. Select the chart type with `graphType`. `datasetId` accepts a dataset ID or name (case-insensitive); pass an empty string to clear it. Column names are resolved case-insensitively. Properties not recognized by `GraphConfigInput` or `LayerDefInput` are reported in `result.warnings`.

```javascript
await window.midas.tabs.configureGraph('tab_001', {
  graphType: 'custom',
  datasetId: 'ds_001',
  layers: [
    { geom: { type: 'point' }, aes: { x: 'weight', y: 'height', color: 'group' } }
  ],
  aspectRatio: '4:3'
});
```

Use `coordinates` to set the coordinate system. `'flipped'` swaps the X and Y axes (useful for horizontal bar charts with long category labels). `'cartesian'` resets to the default Cartesian coordinates.

```javascript
await window.midas.tabs.configureGraph('tab_001', {
  coordinates: 'flipped'
});
```

Use `scales` to configure axis scales. When calling `configureGraph` with `scales`, only the specified axes are updated; unspecified axes retain their existing settings.

```javascript
await window.midas.tabs.configureGraph('tab_001', {
  scales: { y: { type: 'log', title: 'Log scale' } }
});
```

Each axis (`x`, `y`, `y2`) in `scales` accepts:

| Property | Type | Description |
|---|---|---|
| `type` | `'linear' \| 'log' \| 'sqrt' \| 'time' \| 'categorical'` | Scale type |
| `title` | `string` | Axis title |
| `domain` | `{ min?, max? }` | Range for continuous scales (not applicable for `categorical`) |
| `tickCount` | `number` | Number of ticks (not applicable for `categorical`) |
| `limits` | `string[]` | Category display order (`categorical` only) |
| `breaks` | `string[]` | Subset of categories to display (`categorical` only) |
| `labels` | `Record<string, string>` | Custom display names for categories (`categorical` only) |
| `labelRotation` | `'auto' \| 0 \| 45 \| 90` | Label rotation angle |

Per-layer color scales are set via `layers[].scales`, which accepts `color` and `fill` channels.

```javascript
await window.midas.tabs.configureGraph('tab_001', {
  graphType: 'custom',
  datasetId: 'ds_001',
  layers: [{
    geom: { type: 'tile' },
    aes: { x: 'col_x', y: 'col_y', fill: 'col_value' },
    scales: { fill: { scaleType: 'sequential', paletteId: 'viridis' } }
  }]
});
```

Each `color` / `fill` scale accepts:

| Property | Type | Description |
|---|---|---|
| `scaleType` | `'categorical' \| 'sequential' \| 'diverging' \| 'threshold'` | Color scale type |
| `paletteId` | `string` | Palette ID. See [Custom Graph Reference](custom-graph-reference#palettes) for available values |
| `domain` | `{ min?, max?, center? }` | Domain for continuous scales |
| `legendPosition` | `'right' \| 'left' \| 'top' \| 'bottom' \| 'none'` | Legend position |
| `legendTitle` | `string` | Legend title |
| `thresholds` | `number[]` | Threshold values (`threshold` type) |
| `thresholdColors` | `string[]` | Colors for each region (`threshold` type, length = `thresholds.length + 1`) |
| `thresholdVariable` | `'x' \| 'y'` | Variable used for threshold comparison (`threshold` type, default `'y'`) |
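
For `threshold` scales, the length rule between `thresholds` and `thresholdColors` is easy to get wrong. The sketch below checks it locally. `checkThresholdScale` is a hypothetical helper, and it checks only the length relationship stated in the table.

```javascript
// Hypothetical check (not part of window.midas): n thresholds split the
// domain into n + 1 regions, so n + 1 colors are needed.
function checkThresholdScale(scale) {
  if (scale.scaleType !== 'threshold') return [];
  const thresholds = scale.thresholds ?? [];
  const colors = scale.thresholdColors ?? [];
  const expected = thresholds.length + 1;
  return colors.length === expected
    ? []
    : [`expected ${expected} thresholdColors, got ${colors.length}`];
}
```

For example, two thresholds such as `[10, 20]` define three regions and therefore need three colors.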

Use `facets` to split the graph into panels by one or two categorical variables. Two modes are supported: `wrap` (single variable, auto-arranged panels) and `grid` (row and/or column variables).

```javascript
// Facet wrap: split by one variable
await window.midas.tabs.configureGraph('tab_001', {
  facets: { type: 'wrap', variable: 'species', ncol: 3, scales: 'free_y' }
});

// Facet grid: split by row and/or column variables
await window.midas.tabs.configureGraph('tab_001', {
  facets: { type: 'grid', rows: 'region', cols: 'year' }
});

// Remove facets
await window.midas.tabs.configureGraph('tab_001', { facets: null });
```

Omitting `facets` preserves the current setting.

Facet wrap properties:

| Property | Type | Description |
|---|---|---|
| `type` | `'wrap'` | Facet wrap mode |
| `variable` | `string` | Column name or ID to facet by (required) |
| `ncol` | `number` | Number of columns (auto-calculated if omitted) |
| `nrow` | `number` | Number of rows (auto-calculated if omitted) |
| `complete` | `boolean` | Fill all combinations to show empty panels |
| `scales` | `'fixed' \| 'free_x' \| 'free_y' \| 'free'` | Axis scale sharing across panels (default: `'fixed'`) |

Facet grid properties:

| Property | Type | Description |
|---|---|---|
| `type` | `'grid'` | Facet grid mode |
| `rows` | `string` | Column name or ID for row faceting |
| `cols` | `string` | Column name or ID for column faceting |
| `complete` | `boolean` | Fill all combinations to show empty panels |
| `scales` | `'fixed' \| 'free_x' \| 'free_y' \| 'free'` | Axis scale sharing across panels (default: `'fixed'`) |

At least one of `rows` or `cols` is required for facet grid.
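
The required fields for each facet mode can be verified before calling `configureGraph`. This is a minimal sketch using a hypothetical `checkFacets` helper. It treats `null` (remove facets) and `undefined` (keep current setting) as valid.

```javascript
// Hypothetical check (not part of window.midas) for the `facets` option.
function checkFacets(facets) {
  // null removes facets; undefined preserves the current setting.
  if (facets === null || facets === undefined) return [];
  if (facets.type === 'wrap') {
    return facets.variable ? [] : ['facet wrap requires `variable`'];
  }
  if (facets.type === 'grid') {
    return facets.rows || facets.cols
      ? []
      : ['facet grid requires at least one of `rows` or `cols`'];
  }
  return [`unknown facet type: ${String(facets.type)}`];
}
```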

For the simple graph types (`histogram`, `scatter`, `timeseries`, `bar`, `pairplot`, and `datetime_histogram`), select the type with `graphType` and set the fields specific to that type. `layers`, `globalAes`, `scales`, `coordinates`, and `facets` apply only to `custom`.

```javascript
await window.midas.tabs.configureGraph('tab_001', {
  graphType: 'histogram',
  datasetId: 'ds_001',
  column: 'sepal_length',
  bins: 20
});
```

Column fields such as `column`, `xColumn`, and `categoryColumn` accept either a column name or a column ID. The type-specific fields are:

| Graph type | Fields |
|---|---|
| `histogram` | `column`, `bins`, `showDensity`, `orientation`, `groupByColumn`, `groupMode`, `facetNcol`, `showAnnotations` |
| `scatter` | `xColumn`, `yColumn`, `colorColumn`, `sizeColumn`, `referenceLines`, `xScaleType`, `yScaleType`, density options (`displayMode`, `densityVisualization`, `densityBandwidth`, `contourLevels`, `densityColorScale`) |
| `timeseries` | `xColumn`, `yColumns`, `rangeColumns`, `rangeOpacity` |
| `bar` | `categoryColumn`, `valueColumns`, `aggregations`, `orientation`, `showValues`, `stackMode`, `sortOrder`, `topN` |
| `pairplot` | `columns` |
| `datetime_histogram` | `column`, `interval`, `showTrend` |

For `bar`, `valueColumns` accepts aggregation target columns plus `'$count'` for a row count, and `aggregations` sets the aggregation method per column.

`boxplot` and `heatmap` cannot be configured; passing either returns an `INVALID_GRAPH_TYPE` error.
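
The split between simple and `custom` graph types can be linted before calling `configureGraph`. The sketch below is a hypothetical helper, `checkGraphConfig`, that flags custom-only fields on a simple type and the two graph types that cannot be configured. How the API itself treats custom-only fields on a simple type is not stated here, so the helper only reports them.

```javascript
// Hypothetical lint (not part of window.midas) for a configureGraph config.
const SIMPLE_GRAPH_TYPES = [
  'histogram', 'scatter', 'timeseries', 'bar', 'pairplot', 'datetime_histogram',
];
const CUSTOM_ONLY_FIELDS = ['layers', 'globalAes', 'scales', 'coordinates', 'facets'];
const UNCONFIGURABLE_GRAPH_TYPES = ['boxplot', 'heatmap'];

function checkGraphConfig(config) {
  const problems = [];
  if (UNCONFIGURABLE_GRAPH_TYPES.includes(config.graphType)) {
    problems.push(`${config.graphType} cannot be configured (INVALID_GRAPH_TYPE)`);
  }
  if (SIMPLE_GRAPH_TYPES.includes(config.graphType)) {
    for (const field of CUSTOM_ONLY_FIELDS) {
      if (field in config) {
        problems.push(`${field} applies only to graphType 'custom'`);
      }
    }
  }
  return problems;
}
```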

### models {#models}

#### models.list() {#modelslist}

List fitted models.

```javascript
const result = await window.midas.models.list();
// result.data: [{ id, type, name, datasetId, family }, ...]
```

`type` is one of `'glm'`, `'glmm'`, `'random_forest'`, `'arima'`, `'linear_regression'`, or `'anova'`.

#### models.run(config) {#modelsrunconfig}

Run a model. Set `config.type` to `'glm'`, `'glmm'`, `'random_forest'`, `'arima'`, `'linear_regression'`, or `'anova'`. `datasetId` accepts a dataset ID or name (case-insensitive) and must refer to a primary or derived dataset (ephemeral datasets are rejected). Columns can be specified by name (case-insensitive).

The result is returned directly without opening a tab. To persist the model for later use with [`models.list()`](#modelslist) and [`models.describe()`](#modelsdescribeid), call [`models.save()`](#modelssaverunid-name) with the returned `runId`. Unsaved run results are kept in memory, up to 20. Beyond that limit, the oldest unsaved results are discarded first, and a discarded `runId` can no longer be passed to [`models.save()`](#modelssaverunid-name). All unsaved results are also lost on page reload.
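
An agent that runs many models in a loop can lose track of which `runId` values are still saveable. The sketch below mirrors the 20-result limit locally. `UnsavedRunTracker` is a hypothetical class, and it rests on two assumptions: every run in the session is recorded through it, and a saved run no longer counts toward the unsaved limit.

```javascript
// Hypothetical local mirror (not part of window.midas) of the in-memory
// store of unsaved run results. Oldest entries are dropped first.
const MAX_UNSAVED_RUNS = 20; // documented limit

class UnsavedRunTracker {
  constructor(limit = MAX_UNSAVED_RUNS) {
    this.limit = limit;
    this.runIds = []; // oldest first
  }

  // Record a new runId; returns the runIds that were presumably discarded.
  record(runId) {
    this.runIds.push(runId);
    const evicted = [];
    while (this.runIds.length > this.limit) {
      evicted.push(this.runIds.shift());
    }
    return evicted;
  }

  // Assumption: once saved, a run no longer occupies an unsaved slot.
  markSaved(runId) {
    this.runIds = this.runIds.filter((id) => id !== runId);
  }

  isLikelyAvailable(runId) {
    return this.runIds.includes(runId);
  }
}
```

Call `record(result.data.runId)` after each `models.run()` and save any run worth keeping before it appears in the returned eviction list.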

**GLM** (`type: 'glm'`):

```javascript
const result = await window.midas.models.run({
  type: 'glm',
  datasetId: 'ds_001',
  yColumn: 'sepal_length',
  xColumns: ['sepal_width', 'petal_length'],
  family: 'gaussian'
});
// result.data:
// {
//   type: 'glm',
//   runId: '...',
//   family: 'gaussian',
//   link: 'identity',
//   coefficients: [
//     { variable: '(Intercept)', estimate: 2.25, se: 1.02, ciLower: 0.23, ciUpper: 4.27, expEstimate: null, expCiLower: null, expCiUpper: null },
//     ...
//   ],
//   inference: { distribution: 't', df: 147 },
//   fit: { deviance: 42.3, nullDeviance: 234.7, aic: 183.94, bic: 193.47, iterations: 5, converged: true },
//   diagnosticSummary: { nObservations: 150, nIncomplete: 0, degreesOfFreedom: 147, dispersionParameter: 0.29 },
//   warnings: []
// }
```

Fields in `coefficients`:

| Field | Description |
|---|---|
| `estimate` | Coefficient estimate on the link scale |
| `se` | Standard error of the estimate |
| `ciLower` / `ciUpper` | Lower and upper bounds of the Wald confidence interval. The reference distribution is reported in `inference`; see the family-by-family table below |
| `expEstimate` / `expCiLower` / `expCiUpper` | The `exp()` transformation of the estimate and confidence interval. See below for interpretation and the links for which these are `null` |

`diagnosticSummary.dispersionParameter` is the estimate of the dispersion parameter φ. For `gaussian` it is the deviance divided by the residual degrees of freedom n−p; for `gamma` it is Pearson χ² divided by n−p. For these families it is used to compute the SEs and confidence intervals. For `poisson` and `binomial`, SEs are computed with φ = 1, and this field instead contains deviance/(n−p) as an overdispersion diagnostic; it is `null` when the residual degrees of freedom is 0. For `negative-binomial`, it is 1.0 when θ is estimated automatically and Pearson χ²/(n−p) when θ is fixed.

**GLMM** (`type: 'glmm'`):

Specify the random-intercept grouping variable with `groupColumn`. `family` accepts `'gaussian'`, `'binomial'`, `'poisson'`, or `'gamma'` (default `'gaussian'`). `maxIterations` defaults to 100 and `tolerance` to 1e-6. See [GLMM](glmm) for background on the model.

```javascript
const result = await window.midas.models.run({
  type: 'glmm',
  datasetId: 'ds_001',
  yColumn: 'sepal_length',
  xColumns: ['sepal_width', 'petal_length'],
  groupColumn: 'species',
  family: 'gaussian'
});
// result.data:
// {
//   type: 'glmm',
//   runId: '...',
//   family: 'gaussian',
//   link: 'identity',
//   fixedEffects: [{ variable: '(Intercept)', estimate: 2.35, se: 0.87, ... }, ...],
//   inference: { distribution: 'normal', df: null },
//   randomEffects: { groupColumn: 'species', variance: 0.42, residualVariance: 0.14, icc: 0.75, blup: [...] },
//   fit: { logLikelihood: -72.1, iterations: 8, converged: true },
//   diagnosticSummary: { nObservations: 150, nGroups: 3, nFixedEffects: 3, nIncomplete: 0, groupSizes: [...] },
//   warnings: []
// }
```

`link`, `includeIntercept`, and `confidenceLevel` are also available as optional parameters, with the same meaning as in GLM. The `maxIterations` and `tolerance` defaults (100 and 1e-6) differ from the GLM defaults (25 and 1e-8).

Confidence intervals for GLMM fixed effects are Wald approximations based on the standard normal distribution; `inference` is always `{ distribution: 'normal', df: null }`. When the number of groups is small, intervals based on this approximation can fall below the nominal coverage.

Fields in `randomEffects`:

- `variance` — variance of the random intercepts, σ²_u
- `residualVariance` — for `gaussian`, the residual variance σ²_e; for `gamma`, the estimated dispersion parameter φ. Not returned for `binomial` and `poisson`, where φ is fixed at 1 by theory
- `icc` — intraclass correlation coefficient σ²_u / (σ²_u + σ²_e). σ²_e is the REML-estimated residual variance for `gaussian` + `identity`, π²/3 for `binomial` + `logit`, and 1 for `binomial` + `probit` — the latter two are latent-scale values. For all other family-link combinations, no latent-scale residual variance is defined and `icc` is `null`
- `blup` — predicted random intercepts per group (Best Linear Unbiased Prediction). Each value is the group mean of the residuals from the fixed-effects prediction (y − Xβ̂), shrunk toward 0; groups with fewer observations are shrunk more. `standardError` quantifies the prediction uncertainty: for LMM it is the conditional prediction error standard deviation, and for other families it is an approximation based on the Laplace approximation. `rank` is the rank in descending order of `estimate`
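
The `icc` rule above can be reproduced from the other `randomEffects` fields. The following is a sketch under the definitions in this list. Both function names are hypothetical, and the code only restates the documented formula rather than the internal implementation.

```javascript
// Hypothetical helpers (not part of window.midas).
// Residual variance used in the ICC denominator, per family and link.
function latentResidualVariance(family, link, residualVariance) {
  if (family === 'gaussian' && link === 'identity') return residualVariance;
  if (family === 'binomial' && link === 'logit') return Math.PI ** 2 / 3;
  if (family === 'binomial' && link === 'probit') return 1;
  return null; // no latent-scale residual variance defined
}

// ICC = variance / (variance + residual variance), or null when undefined.
function icc(family, link, variance, residualVariance) {
  const sigma2e = latentResidualVariance(family, link, residualVariance);
  if (sigma2e === null || sigma2e === undefined) return null;
  return variance / (variance + sigma2e);
}
```

With the values from the example response above (`variance: 0.42`, `residualVariance: 0.14`, `gaussian` with `identity`), this gives 0.42 / 0.56, which matches the reported `icc` of 0.75.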

For LMM (`gaussian` + `identity`), `fit.logLikelihood` is the REML log-likelihood, and `fit.aic` and `fit.bic` are based on it. REML-based AIC/BIC cannot be used to compare models with different fixed-effects structures. For other families, the log-likelihood is a Laplace approximation.

**Random Forest** (`type: 'random_forest'`):

Set `taskType` to `'classification'` or `'regression'`. The `metrics` field contains fitting-set (resubstitution) evaluation metrics. For an unbiased estimate of generalization performance, use `oobScore`.

```javascript
const result = await window.midas.models.run({
  type: 'random_forest',
  datasetId: 'ds_001',
  yColumn: 'species',
  xColumns: ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
  taskType: 'classification',
  nEstimators: 100,
  randomState: 42
});
// result.data:
// {
//   type: 'random_forest',
//   runId: '...',
//   taskType: 'classification',
//   tuningParameters: { nEstimators: 100, maxDepth: null, ... },
//   featureImportances: [{ feature: 'petal_length', importance: 0.45 }, ...],
//   permutationImportances: [{ feature: 'petal_length', importance: 0.38 }, ...],
//   metrics: { taskType: 'classification', accuracy: 0.96, precision: 0.96, recall: 0.96, f1Score: 0.96, nClasses: 3 },
//   nSamples: 150,
//   oobScore: 0.95,
//   warnings: []
// }
```

For regression tasks, `metrics` contains `{ taskType: 'regression', mse, rmse, mae, r2 }`. `metrics` values are computed on the fitting data (resubstitution) and are not available from [`models.describe()`](#modelsdescribeid). For an unbiased generalization estimate, use `oobScore` (OOB accuracy for classification, OOB R-squared for regression). `permutationImportances` is the mean decrease in OOB prediction accuracy when each predictor is shuffled. Values can be negative, which means shuffling that predictor did not reduce OOB prediction accuracy.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `nEstimators` | `number` | `100` | Number of trees |
| `maxDepth` | `number \| null` | `null` | Maximum tree depth (`null` for unlimited) |
| `minSamplesSplit` | `number` | `2` | Minimum samples to split a node |
| `minSamplesLeaf` | `number` | `1` | Minimum samples in a leaf node |
| `maxFeatures` | `'sqrt' \| 'log2' \| null \| number` | `'sqrt'` | Number of predictors to consider at each split |
| `randomState` | `number` | `42` | Random seed |

For GLM (`type: 'glm'`) results, `inference` identifies the reference distribution used for `ciLower` and `ciUpper`. `distribution: 't'` indicates a t distribution with `df` degrees of freedom; `distribution: 'normal'` indicates a standard normal distribution, in which case `df` is `null`.

The family-by-family mapping is shown below. The critical value follows t(n−p) exactly only for Gaussian with the identity link. Other dispersion-estimating families apply a t distribution as a small-sample convention, while families marked asymptotic use the standard normal approximation.

| family | link | Reference distribution | Nature |
|--------|------|-------------------------|--------|
| `gaussian` | `identity` | t(n−p) | Exact |
| `gaussian` | non-`identity` | t(n−p) | Small-sample convention |
| `gamma` | any | t(n−p) | Small-sample convention |
| `negative-binomial` | any (fixed θ) | t(n−p) | Small-sample convention |
| `poisson` | any | Standard normal | Asymptotic |
| `binomial` | any | Standard normal | Asymptotic |
| `negative-binomial` | any (estimated θ) | Standard normal | Asymptotic |

`ciLower` and `ciUpper` are the Wald confidence interval `estimate ± criticalValue × se` for each coefficient. The confidence level follows the `confidenceLevel` parameter (default 95). The critical value is the `(1 + confidenceLevel/100) / 2` quantile of the reference distribution reported in `inference`.
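
The interval arithmetic can be checked against a returned coefficient without a statistics library. The sketch below uses hypothetical helper names. It computes the quantile probability, rebuilds an interval from a given critical value, and recovers the critical value implied by a coefficient row.

```javascript
// Hypothetical helpers (not part of window.midas).
// Probability at which the reference distribution's quantile is taken.
function quantileProbability(confidenceLevel = 95) {
  return (1 + confidenceLevel / 100) / 2;
}

// Wald interval: estimate ± criticalValue × se.
function waldInterval(estimate, se, criticalValue) {
  return {
    ciLower: estimate - criticalValue * se,
    ciUpper: estimate + criticalValue * se,
  };
}

// Critical value implied by a returned coefficient row.
function impliedCriticalValue({ se, ciLower, ciUpper }) {
  return (ciUpper - ciLower) / (2 * se);
}
```

For the intercept in the GLM example above (`se: 1.02`, `ciLower: 0.23`, `ciUpper: 4.27`), the implied critical value is 4.04 / 2.04, about 1.98. The displayed values are rounded, so the result is approximate.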

`expEstimate`, `expCiLower`, and `expCiUpper` are the `exp()` transformation of the link-scale estimate and confidence interval. For logit link, these correspond to odds ratios (OR); for Poisson and Negative Binomial with log link, incidence rate ratios (IRR); for Gamma and Gaussian with log link, multiplicative effects. For identity, inverse, and probit links, these fields are `null`.
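
The transformation rule can be written out as a small function. This is a sketch of the rule as described above, not the internal implementation. `expTransform` is a hypothetical name, and it assumes `log` and `logit` are the only links for which the fields are populated.

```javascript
// Hypothetical helper (not part of window.midas).
// Links for which exp() of a coefficient has a ratio interpretation.
const EXP_LINKS = new Set(['log', 'logit']);

function expTransform(coefficient, link) {
  if (!EXP_LINKS.has(link)) {
    // identity, inverse, probit: the fields are null.
    return { expEstimate: null, expCiLower: null, expCiUpper: null };
  }
  return {
    expEstimate: Math.exp(coefficient.estimate),
    expCiLower: Math.exp(coefficient.ciLower),
    expCiUpper: Math.exp(coefficient.ciUpper),
  };
}
```

Because `exp()` is monotonic, the transformed interval keeps the same coverage as the link-scale interval, although it is no longer symmetric around the transformed estimate.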

`fit.aic` and `fit.bic` are `number | null`. They are `null` when the log-likelihood constant is undefined (e.g., non-integer weights in Binomial, or a saturated model).

The response may include two kinds of warnings. Top-level `result.warnings` contains data preparation warnings such as exclusion of rows with missing values. `result.data.warnings` contains model execution warnings such as convergence issues. Rows with missing values in any response or explanatory variable are excluded from analysis. The number of excluded rows is available in `diagnosticSummary.nIncomplete`.

When the model does not converge within the maximum number of iterations, no error is raised; the result is returned with `fit.converged: false`, and `message` also reads "did not converge". When complete or quasi-complete separation is suspected, a warning is included in `data.warnings`. When a result is returned, coefficient `se` values are never `null`. When SEs cannot be computed — for example, a non-positive-definite covariance matrix or a rank-deficient design matrix — no result is returned and a `NUMERICAL_ERROR` error is raised instead. A `NUMERICAL_ERROR` is also raised when the fitted mean falls outside the valid range for the family (for example, an identity link applied to Poisson or Gamma); choose a link that keeps the fitted mean in range, such as log.

For GLM, specify `family` as `'gaussian'` (default), `'binomial'`, `'poisson'`, `'gamma'`, or `'negative-binomial'`. See [GLM](glm) for guidance on choosing a family. Use `link` to set the link function. When omitted, the default link for the family is used.

| family | Default link | Available links |
|--------|-------------|-----------------|
| `gaussian` | `identity` | `identity`, `log` |
| `binomial` | `logit` | `logit`, `probit` |
| `poisson` | `log` | `log`, `identity` |
| `gamma` | `inverse` | `inverse`, `log`, `identity` |
| `negative-binomial` | `log` | `log` |
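
The table above can be encoded as a lookup so that an invalid family and link pair is caught before the model is run. This is a minimal sketch. `resolveLink` is a hypothetical helper that returns the link that would be used.

```javascript
// Hypothetical helper (not part of window.midas): the family/link table as data.
const GLM_LINKS = new Map([
  ['gaussian', { defaultLink: 'identity', available: ['identity', 'log'] }],
  ['binomial', { defaultLink: 'logit', available: ['logit', 'probit'] }],
  ['poisson', { defaultLink: 'log', available: ['log', 'identity'] }],
  ['gamma', { defaultLink: 'inverse', available: ['inverse', 'log', 'identity'] }],
  ['negative-binomial', { defaultLink: 'log', available: ['log'] }],
]);

function resolveLink(family = 'gaussian', link) {
  const entry = GLM_LINKS.get(family);
  if (!entry) throw new RangeError(`unknown family: ${family}`);
  // An omitted link falls back to the family default.
  if (link === undefined) return entry.defaultLink;
  if (!entry.available.includes(link)) {
    throw new RangeError(`link '${link}' is not available for family '${family}'`);
  }
  return link;
}
```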

**Optional parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `includeIntercept` | `boolean` | `true` | Include intercept term |
| `maxIterations` | `number` | `25` | Maximum number of iterations |
| `tolerance` | `number` | `1e-8` | Convergence tolerance |
| `binomialResponse` | `object` | - | Binomial response format. See below |
| `theta` | `number` | - | Overdispersion parameter for negative binomial. When omitted, estimated via profile likelihood |
| `offsetColumn` | `string` | - | Offset column (e.g., exposure for Poisson regression) |
| `confidenceLevel` | `number` | `95` | Confidence level for confidence intervals (50–99.99) |

When `theta` is omitted and estimated automatically, coefficient SEs and confidence intervals are computed treating the estimated θ as fixed. The estimation uncertainty of θ itself is not propagated into the SEs.

**`binomialResponse` specification:**

Specifies the response variable format when `family: 'binomial'`. When `binomialResponse` is omitted, binary format is assumed.

- `{ format: 'binary' }` -- Binary 0/1 data. Specify the response variable with `yColumn`
- `{ format: 'grouped', successesColumn: '...', trialsColumn: '...' }` -- Successes/trials pair. `yColumn` can be omitted in this case

```javascript
// Grouped Binomial example
const result = await window.midas.models.run({
  type: 'glm',
  datasetId: 'ds_001',
  binomialResponse: { format: 'grouped', successesColumn: 'defects', trialsColumn: 'inspected' },
  xColumns: ['temperature', 'pressure'],
  family: 'binomial',
  link: 'logit'
});
```

**ARIMA** (`type: 'arima'`):

Fit an ARIMA(p,d,q) model to a single time-series column. Specify `order` as `[p, d, q]` for manual order selection, or use `autoSelect` to search over a grid of orders and select the best by AIC or BIC.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `seriesColumn` | `string` | (required) | Column name for the time series |
| `order` | `[number, number, number]` | — | `[p, d, q]` — AR order, differencing order, MA order. Omit to use `autoSelect` |
| `autoSelect` | `object` | `{ maxP: 3, maxD: 1, maxQ: 3, criterion: 'aic' }` | Auto order selection. Used when `order` is omitted |
| `autoSelect.maxP` | `number` | `3` | Maximum AR order to search |
| `autoSelect.maxD` | `number` | `1` | Maximum differencing order to search |
| `autoSelect.maxQ` | `number` | `3` | Maximum MA order to search |
| `autoSelect.criterion` | `'aic' \| 'bic'` | `'aic'` | Information criterion for model selection |
| `includeIntercept` | `boolean` | `true` | Include an intercept (drift when d > 0) |
| `confidenceLevel` | `number` | `95` | Confidence level for coefficient CIs |

The response includes `coefficients` (AR/MA/Intercept with CIs), `fit` (AIC, BIC, log-likelihood, σ², convergence), `residualDiagnostics` (ACF and PACF of residuals), and `nObservations`. When `autoSelect` is used, `orderSearch` contains the AIC/BIC for all candidate orders.

When the fit degenerates (too few observations, or zero variance after differencing), `fit.error` holds a message. In that case the coefficients, AIC, σ², and similar fields are meaningless values (zero or infinity), and `residualDiagnostics.acf`/`.pacf` are empty arrays. This is distinct from non-convergence (`converged: false` with no `fit.error`); failing to converge alone does not set `fit.error`.
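
Since a degenerate fit and a non-converged fit call for different handling, it can be useful to classify the `fit` object first. The sketch below follows the distinction described above. `arimaFitStatus` and its three labels are hypothetical.

```javascript
// Hypothetical helper (not part of window.midas): classify an ARIMA `fit`.
function arimaFitStatus(fit) {
  // fit.error marks a degenerate fit; the other fields are then meaningless.
  if (fit.error) return 'degenerate';
  // Non-convergence alone does not set fit.error.
  if (fit.converged === false) return 'not-converged';
  return 'ok';
}
```

Only an `'ok'` or `'not-converged'` result should have its coefficients and information criteria read. A `'degenerate'` result should be reported using the message in `fit.error`.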

Non-finite values (NaN, Infinity, null) in the series are dropped. When observations are dropped, a warning is included in the response.

```javascript
// Manual order
const result = await window.midas.models.run({
  type: 'arima',
  datasetId: 'ds_001',
  seriesColumn: 'temperature',
  order: [1, 1, 1]
});

// Auto order selection
const result2 = await window.midas.models.run({
  type: 'arima',
  datasetId: 'ds_001',
  seriesColumn: 'temperature',
  autoSelect: { maxP: 5, maxD: 2, maxQ: 5, criterion: 'bic' }
});
```

**Linear Regression** (`type: 'linear_regression'`):

Fit a linear regression by ordinary least squares.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `yColumn` | `string` | (required) | Response variable column name |
| `xColumns` | `string[]` | (required) | Explanatory variable column names |
| `includeIntercept` | `boolean` | `true` | Include intercept term |
| `confidenceLevel` | `number` | `95` | Confidence level for confidence intervals |

In the response, `coefficients` has the same shape as for GLM, and `inference` is always a t distribution. `fit` contains the same fields as for GLM plus `rSquared`, `adjustedRSquared`, and `rmse`.

```javascript
const result = await window.midas.models.run({
  type: 'linear_regression',
  datasetId: 'ds_001',
  yColumn: 'sepal_length',
  xColumns: ['sepal_width', 'petal_length']
});
// result.data: { type: 'linear_regression', runId, coefficients, inference,
//   fit: { ..., rSquared: 0.84, adjustedRSquared: 0.84, rmse: 0.33 }, diagnosticSummary, warnings }
```

**ANOVA** (`type: 'anova'`):

Run a one-way or two-way analysis of variance. The number of elements in `factorColumns` determines which.

| Parameter | Type | Default | Description |
|---|---|---|---|
| `responseColumn` | `string` | (required) | Response variable column name |
| `factorColumns` | `string[]` | (required) | Factor columns. One element for one-way, two for two-way |
| `includeInteraction` | `boolean` | `true` | Include the interaction term in two-way ANOVA |
| `ssType` | `'I' \| 'III'` | `'III'` | Sum-of-squares decomposition type for two-way ANOVA |
| `confidenceLevel` | `number` | `95` | Confidence level for Tukey HSD. One of 90, 95, 99 |
| `postHoc` | `boolean` | `true` | Compute Tukey HSD for one-way ANOVA |

The response includes `mode` (`'one-way'` or `'two-way'`), `anovaTable` (sum of squares, degrees of freedom, mean square, and the effect sizes η² and ω² per effect, plus residuals and total), `groupStatistics` (n, mean, standard deviation, minimum, and maximum per group), `nObservations`, and `nExcluded`. For one-way ANOVA with `postHoc` enabled, `tukeyHSD` (pairwise mean differences with SEs and confidence intervals) is also included.

```javascript
const result = await window.midas.models.run({
  type: 'anova',
  datasetId: 'ds_001',
  responseColumn: 'sepal_length',
  factorColumns: ['species']
});
// result.data: { type: 'anova', runId, mode: 'one-way',
//   anovaTable: { ssType, rows: [{ source, ss, df, ms, etaSquared, omegaSquared }, ...], residuals, total },
//   groupStatistics: [{ label, n, mean, std, min, max }, ...],
//   tukeyHSD: { comparisons: [{ group1, group2, meanDiff, se, ciLower, ciUpper }, ...], confidenceLevel },
//   nObservations: 150, nExcluded: 0, warnings: [] }
```
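
To pick out the group pairs that differ at the chosen confidence level, it is enough to keep the Tukey HSD comparisons whose interval excludes zero. A small sketch using only the fields shown in the response above; `pairsExcludingZero` is a hypothetical helper name.

```javascript
// Hypothetical helper: keep Tukey HSD comparisons whose confidence interval excludes zero.
function pairsExcludingZero(tukeyHSD) {
  // tukeyHSD is absent for two-way ANOVA or when postHoc is false.
  if (!tukeyHSD) return [];
  return tukeyHSD.comparisons
    .filter((c) => c.ciLower > 0 || c.ciUpper < 0)
    .map((c) => `${c.group1} vs ${c.group2}: ${c.meanDiff.toFixed(2)}`);
}
```

Calling `pairsExcludingZero(result.data.tukeyHSD)` returns one short label per pair, or an empty array when no Tukey HSD was computed.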

#### models.save(runId, name?) {#modelssaverunid-name}

Save a model run result to the project. After saving, the model is available via [`models.list()`](#modelslist) and [`models.describe()`](#modelsdescribeid).

```javascript
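// Any models.run() config works here; this one repeats the ANOVA example above
const run = await window.midas.models.run({
  type: 'anova', datasetId: 'ds_001', responseColumn: 'sepal_length', factorColumns: ['species']
});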
const saved = await window.midas.models.save(run.data.runId, 'My Model');
// saved.data: { modelId: '...', name: 'My Model' }
```

A diagnostic dataset is created on demand when you open the GLM Diagnostics tab. Key columns include `fitted_values`, `deviance_residuals`, `pearson_residuals`, `standardized_residuals`, `leverage`, and `cooks_distance`. Visualize these with [`reports.addGraph()`](#reportsaddgraphreportid-config) or Graph Builder for residual analysis and diagnostic plots.

#### models.describe(id) {#modelsdescribeid}

Get model details. Supports GLM, GLMM, Random Forest, ARIMA, Linear Regression, and ANOVA. The response structure varies by model type (check `result.data.type`). Models can be created via [`models.run()`](#modelsrunconfig) + [`models.save()`](#modelssaverunid-name) or through the GUI.

**GLM** returns coefficients, fit statistics (AIC, BIC, deviance), diagnostic summary, and metadata.

**GLMM** returns fixed effects (same format as GLM coefficients), random effects (group variable, variance, ICC, BLUP), fit statistics (log-likelihood, iterations, convergence), and diagnostic summary.

**Random Forest** returns the task type (classification/regression), `tuningParameters`, MDI variable importances (if available), and OOB permutation importances (when computed).

**ARIMA** returns order (p, d, q), coefficients (AR/MA/Intercept with CIs), fit statistics (AIC, BIC, log-likelihood, σ², convergence), and `residualDiagnostics` (ACF and PACF of residuals). The residual diagnostics and `fit.error` from `models.run()` are persisted, so they are available without refitting. The `acf` and `pacf` arrays are empty in two cases: the model was saved before residual diagnostics were persisted, or the fit is degenerate (`fit.error` is set). For how to interpret coefficients and fit statistics when `fit.error` is set, see the ARIMA notes under [`models.run()`](#modelsrunconfig).

**Linear Regression** returns coefficients (same shape as GLM; `inference` is always a t distribution), fit statistics (R², adjusted R², RMSE, AIC, BIC), diagnostic summary, and metadata.

**ANOVA** returns the ANOVA table (sum of squares, degrees of freedom, mean square, and the effect sizes η² and ω²), group statistics, Tukey HSD (only when computed), and metadata. The response structure is the same as described for ANOVA under [`models.run()`](#modelsrunconfig).

```javascript
const result = await window.midas.models.describe('model_001');
// GLM example - result.data:
// {
//   type: 'glm',
//   family: 'gaussian',
//   link: 'identity',
//   id: 'model_001',
//   name: 'My Model',
//   metadata: {
//     createdAt: '2025-01-15T10:30:00Z',
//     fittingDatasetId: 'ds_001',
//     predictors: ['sepal_width', 'petal_length'],
//     response: 'sepal_length',
//     sampleSize: 150
//   },
//   coefficients: [
//     { variable: '(Intercept)', estimate: 2.25, se: 1.02, ciLower: 0.23, ciUpper: 4.27, expEstimate: null, expCiLower: null, expCiUpper: null },
//     { variable: 'sepal_width', estimate: 0.60, se: 0.24, ciLower: 0.13, ciUpper: 1.07, expEstimate: null, expCiLower: null, expCiUpper: null },
//     ...
//   ],
//   inference: { distribution: 't', df: 147 },
//   fit: { deviance: 42.3, nullDeviance: 234.7, aic: 183.94, bic: 193.47, iterations: 5, converged: true },
//   diagnosticSummary: { ... }
// }
```

For GLMM, the coefficients under `fixedEffects` have the same shape as GLM `coefficients`, and `inference` is always `{ distribution: 'normal', df: null }`. Random Forest does not include `inference`, `coefficients`, or `fit` fields.

```javascript
// GLMM example - result.data:
// {
//   type: 'glmm',
//   family: 'gaussian',
//   link: 'identity',
//   id: 'model_002',
//   name: 'Mixed Model',
//   metadata: { createdAt: '2025-01-15T10:30:00Z', fittingDatasetId: 'ds_001', predictors: ['x1'], response: 'y', sampleSize: 200 },
//   fixedEffects: [
//     { variable: '(Intercept)', estimate: 3.14, se: 0.85, ciLower: 1.47, ciUpper: 4.81, expEstimate: null, expCiLower: null, expCiUpper: null },
//     { variable: 'x1', estimate: 0.52, se: 0.18, ciLower: 0.17, ciUpper: 0.87, expEstimate: null, expCiLower: null, expCiUpper: null }
//   ],
//   inference: { distribution: 'normal', df: null },
//   randomEffects: {
//     groupColumn: 'school',
//     variance: 1.23,
//     residualVariance: 4.56,
//     icc: 0.212,
//     blup: [{ groupId: 'A', estimate: 0.45, standardError: 0.21, rank: 1 }, { groupId: 'B', estimate: -0.32, standardError: 0.19, rank: 2 }]
//   },
//   fit: { logLikelihood: -447.05, iterations: 12, converged: true },
//   diagnosticSummary: { nObservations: 200, nGroups: 10, nFixedEffects: 2, nIncomplete: 0, groupSizes: [{ groupId: 'A', size: 20 }, { groupId: 'B', size: 15 }, { groupId: 'C', size: 25 }] }
// }
```
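
For an LMM (Gaussian family, identity link), the `icc` in the example above is consistent with the usual ratio of the random-intercept variance to the total variance. The sketch below recomputes it from `randomEffects`; `iccFromVariances` is a hypothetical helper, and whether MIDAS computes the value exactly this way is an assumption.

```javascript
// Hypothetical helper: recompute the ICC of a random-intercept LMM
// from the variance components in randomEffects.
// Assumption: icc = variance / (variance + residualVariance), the standard LMM definition.
function iccFromVariances(randomEffects) {
  const { variance, residualVariance } = randomEffects;
  return variance / (variance + residualVariance);
}
```

With `variance: 1.23` and `residualVariance: 4.56`, the ratio is 1.23 / 5.79, which rounds to the `icc: 0.212` shown in the example. This does not carry over to Binomial models, where the ICC is reported on the latent scale.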

```javascript
// Random Forest example - result.data:
// {
//   type: 'random_forest',
//   id: 'model_003',
//   name: 'RF Classifier',
//   metadata: { createdAt: '2025-01-15T10:30:00Z', fittingDatasetId: 'ds_001', predictors: ['x1', 'x2', 'x3'], response: 'species', sampleSize: 150 },
//   taskType: 'classification',
//   tuningParameters: {
//     nEstimators: 100,
//     maxDepth: null,
//     minSamplesSplit: 2,
//     minSamplesLeaf: 1,
//     maxFeatures: 'sqrt',
//     randomState: 42
//   },
//   featureImportances: [
//     { feature: 'x1', importance: 0.45 },
//     { feature: 'x2', importance: 0.35 },
//     { feature: 'x3', importance: 0.20 }
//   ],
//   permutationImportances: [
//     { feature: 'x1', importance: 0.38 },
//     { feature: 'x2', importance: 0.42 },
//     { feature: 'x3', importance: 0.12 }
//   ]
// }
```

`featureImportances` is MDI (Mean Decrease in Impurity). `permutationImportances` is OOB permutation importance — the mean decrease in OOB prediction accuracy when each predictor is shuffled; it is `undefined` when not computed. Values can be negative, which means shuffling that predictor did not reduce OOB prediction accuracy. Both arrays follow the order of `metadata.predictors`.
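
Because `permutationImportances` may be `undefined`, code that ranks predictors needs a fallback. A minimal sketch that prefers permutation importance and falls back to MDI, mirroring the sort order `reports.addModelSummary()` is described as using; `rankFeatures` is a hypothetical helper.

```javascript
// Hypothetical helper: rank predictors by permutation importance, falling back to MDI.
function rankFeatures(model) {
  const source = model.permutationImportances ?? model.featureImportances ?? [];
  // Copy before sorting so the arrays on the model object keep their original order.
  return [...source]
    .sort((a, b) => b.importance - a.importance)
    .map((entry) => entry.feature);
}
```

For the example response above, this yields `x2`, `x1`, `x3`; without `permutationImportances` the MDI order `x1`, `x2`, `x3` is returned instead.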

#### models.remove(id) {#modelsremoveid}

Remove a model from the project. Closes any open tabs that reference the model, then cascade-deletes associated derived datasets (diagnostic, ANOVA, etc.).

```javascript
await window.midas.models.remove('model_001');
```

#### models.configure(tabId, config) {#modelsconfiguratabid-config}

Configure a GLM tab. Set family, link function, response variable, and explanatory variables. Column names are resolved case-insensitively.

```javascript
await window.midas.models.configure('glm_001', {
  family: 'binomial',
  link: 'logit',
  yColumn: 'outcome',
  xColumns: ['age', 'treatment'],
});
```

### reports {#reports}

Report text content can be modified with two methods: `addContent()` appends Markdown to the end of the existing content; `setContent()` replaces the entire content. Both methods preserve the report's `elements` — only the text is affected.

#### reports.create(name, description?) {#reportscreatename-description}

Create a new report.

```javascript
const result = await window.midas.reports.create('Analysis Report');
// result.data: { reportId: 'report_...', name: 'Analysis Report' }
```

#### reports.list() {#reportslist}

List reports.

```javascript
const result = await window.midas.reports.list();
// result.data: [{ id, name, elementCount }, ...]
```

#### reports.getContent(reportId) {#reportsgetcontentreportid}

Get report content.

```javascript
const result = await window.midas.reports.getContent('report_001');
// result.data: { content: '## Analysis Results\n...', elements: [{ id, type, title, renderStatus, renderStatusMessage? }, ...] }
```

`renderStatus` is included for every element. `'ok'` means no issue preventing rendering was detected, `'empty'` means data rows exist but no renderable points remain, and `'error'` means the element cannot render due to a missing dataset, unloaded data, or aesthetic misconfiguration. The point-level pre-render check applies only to Custom Graph elements (`graphConfig.type === 'custom'`). Non-custom graph elements are only checked for dataset existence and load state; their `'ok'` does not mean the rendered output was verified.
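
An agent can scan the `elements` array for anything that is not `'ok'` before presenting a report. A small sketch over the documented fields; `elementsNeedingAttention` is a hypothetical helper, and the `'(no message)'` placeholder is an illustration only.

```javascript
// Hypothetical helper: list report elements that are not expected to render cleanly.
function elementsNeedingAttention(contentData) {
  return contentData.elements
    .filter((el) => el.renderStatus !== 'ok')
    .map((el) => ({
      id: el.id,
      type: el.type,
      status: el.renderStatus,
      // renderStatusMessage is optional, so substitute a placeholder when it is missing.
      reason: el.renderStatusMessage ?? '(no message)',
    }));
}
```

Pass `result.data` from `reports.getContent()`. An empty array means every element reported `'ok'`, which for non-custom graphs still only covers dataset existence and load state.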

#### reports.setContent(reportId, content) {#reportssetcontentreportid-content}

Replace all text content of the report. Text added by `addContent()` or `addModelSummary()` is also replaced. If the content contains `{{type:id}}` element references whose IDs are not registered in the report's `elements`, those are reported via `result.warnings`.

Elements whose `{{type:id}}` references are removed from the content are **not** automatically deleted. They remain in the report's `elements` and can be re-referenced by inserting `{{type:id}}` back into the content. To permanently remove an element, use `reports.removeElement()`.

```javascript
const result = await window.midas.reports.setContent('report_001', '## Updated Results\n...');
// result.data: { contentLength: 42 }
// result.warnings: ['Element reference {{data_table:xxx}} not found in report elements']  // when unregistered references exist
```
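
Before calling `setContent()`, the new text can be compared against the element list from `reports.getContent()`. The sketch below is a client-side check, not the validation MIDAS performs: `auditReferences` is a hypothetical helper, and the pattern assumes the type part of `{{type:id}}` consists of word characters.

```javascript
// Hypothetical helper: compare {{type:id}} references in the content with registered elements.
function auditReferences(content, elements) {
  const referenced = new Set();
  // Assumption: type is word characters; id is anything up to the closing braces.
  for (const match of content.matchAll(/\{\{(\w+):([^{}]+)\}\}/g)) {
    referenced.add(match[2].trim());
  }
  const registered = new Set(elements.map((el) => el.id));
  return {
    // Referenced in the text but not registered: setContent() would warn about these.
    unregistered: [...referenced].filter((id) => !registered.has(id)),
    // Registered but no longer referenced: still stored until removeElement() is called.
    unreferenced: [...registered].filter((id) => !referenced.has(id)),
  };
}
```

The `unreferenced` list is useful for deciding which leftover elements to pass to `reports.removeElement()`.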

#### reports.addContent(reportId, markdown) {#reportsaddcontentreportid-markdown}

Append Markdown text to the end of a report.

```javascript
await window.midas.reports.addContent('report_001', '## Analysis Results\n\nThe model shows...');
```

#### reports.addDataTable(reportId, datasetId, options?) {#reportsadddatatablereportid-datasetid-options}

Add a dataset to a report as a data table element. `datasetId` accepts a dataset ID or name (case-insensitive). The element is registered as `type: 'data_table'` under `report.elements`, and a `{{data_table:elementId}}` reference is appended to the content.

```javascript
const result = await window.midas.reports.addDataTable('report_001', 'Species Averages', {
  columns: ['species', 'avg_sl'],
  maxRows: 10
});
// result.data: { elementId, reportId, renderStatus, renderStatusMessage? }
```

Options:

- `columns` (`string[]`) — Columns to display, by column name or column ID. When omitted, all columns are displayed
- `maxRows` (`number`) — Maximum number of rows to display. A positive integer, capped at 1000. When omitted, all rows are displayed (up to 1000)

Returns a `NO_DATA` error for datasets whose data is not materialized. `renderStatus` is `'empty'` when the dataset has no rows, and `'ok'` otherwise.
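
The `maxRows` rule (positive integer, capped at 1000) can be enforced on the caller's side before the request is made. This is a sketch under assumptions: `dataTableOptions` is a hypothetical helper, and rejecting invalid values with a `RangeError` is a choice made here, since how MIDAS itself responds to them is not stated above.

```javascript
// Hypothetical helper: build addDataTable options with a maxRows that respects the documented cap.
function dataTableOptions(columns, maxRows) {
  const options = {};
  if (columns && columns.length > 0) options.columns = columns;
  if (maxRows !== undefined) {
    if (!Number.isInteger(maxRows) || maxRows <= 0) {
      throw new RangeError('maxRows must be a positive integer');
    }
    options.maxRows = Math.min(maxRows, 1000); // documented cap of 1000 rows
  }
  return options;
}
```

The returned object can be passed as the third argument of `reports.addDataTable()`.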

#### reports.addModelSummary(reportId, modelId) {#reportsaddmodelsummaryreportid-modelid}

Add a model summary to a report. Supports GLM, GLMM, Linear Regression, Random Forest, ARIMA, and ANOVA.

All model types use the element-reference scheme (the same as [`addGraph`](#reportsaddgraphreportid-config)). For GLM, GLMM, and Linear Regression, the coefficient table is added as a `type: 'data_table'` element under `report.elements`, and a `{{data_table:elementId}}` reference is inserted into the report content. For Random Forest, when `featureImportances` is available, Feature Importance is added as a `type: 'data_table'` element (Feature and MDI columns, plus a Permutation column when permutation importances are available; sorted by Permutation descending when available, otherwise by MDI descending). The underlying derived dataset is registered in `project.datasets`, so it also appears in the Data tab listing.

Deleting a model automatically removes the associated coefficient datasets and prunes any report element that references the deleted datasets or model. This applies to `data_table`, `model_stats`, `graph_builder`, `crosstab`, `statistics_summary`, and `anova` elements.

The `modelId` argument must refer to a saved model; otherwise the call fails with `APIResult.success === false`.

For GLM, GLMM, and Linear Regression, the Model Fit / Random Effects / OLS Fit summary is registered as a `type: 'model_stats'` element under `report.elements`, and a `{{model_stats:elementId}}` reference is inserted into the report content. The element stores only the model id and resolves values from `project.models[modelId]` at render time, so when the same model id is refitted and `project.models` is overwritten, existing reports automatically reflect the new values. If the model is deleted after the report is created, the `model_stats` element renders a "Model not found" placeholder.

GLM and GLMM coefficient tables share the same columns: Variable, Estimate, Std. Error, Lower N%, Upper N%. For logit and log links, exp-transformed columns are appended: OR / IRR / exp(Est.), exp(Lower N%), exp(Upper N%). N is the confidence level saved with the model (`confidenceLevel`, default 95). Linear Regression coefficient tables additionally include Std. Coef. and VIF. Confidence intervals are Wald-type (`estimate ± criticalValue × SE`). For GLM, the critical value depends on family and link: families that estimate the dispersion parameter $\phi$ from data use $t_{1-\alpha/2,\, n-p}$, while families with $\phi = 1$ use $z_{1-\alpha/2}$ (see the table under [`models.run()`](#modelsrunconfig)). For GLMM fixed effects, MIDAS always uses the asymptotic standard normal distribution as an implementation choice. Linear Regression always uses the t distribution.
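
The Wald formula above is simple enough to reproduce by hand when checking a table. A minimal sketch with hypothetical helper names (`waldInterval`, `expInterval`); the critical value is supplied by the caller, because computing t or normal quantiles is outside the scope of this example.

```javascript
// Hypothetical helper: Wald-type interval, estimate ± criticalValue × SE.
// criticalValue comes from the caller (for example, about 1.96 for a 95% normal interval).
function waldInterval(estimate, se, criticalValue) {
  const halfWidth = criticalValue * se;
  return { lower: estimate - halfWidth, upper: estimate + halfWidth };
}

// For logit and log links, the exp-transformed columns are the exponentiated bounds.
function expInterval(interval) {
  return { lower: Math.exp(interval.lower), upper: Math.exp(interval.upper) };
}
```

With the GLMM intercept from the `models.describe()` example (estimate 3.14, SE 0.85) and a critical value of 1.96, `waldInterval` gives roughly (1.47, 4.81), in line with the `ciLower` and `ciUpper` shown there.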

What each model type renders:

- **GLM**: The `model_stats` element renders a Model Fit section with AIC, BIC, Deviance, Null Deviance, Converged, and iterations.
- **GLMM**: When BLUP data exists, a BLUP table (Group, Random Intercept, Std. Error, Rank — sorted by estimate descending) is added as a `data_table` element, with a `### Random Effects (BLUP)` heading and `{{data_table:blupElementId}}` reference in the content (`result.data.blupElementId` returns the id). The `model_stats` element renders a Random Effects section with Group Variable, Number of Groups, Random Intercept Variance, Residual Variance (LMM only), and ICC; and a Model Fit section with Log-Likelihood, AIC, BIC, Converged, and iterations. For LMM (Gaussian + identity link), labels read REML Log-Likelihood and ICC; for Binomial + logit/probit, they read Log-Likelihood (Laplace) and ICC (latent scale). For other family+link combinations (Poisson, Gamma, etc.), ICC is not displayed (null) because no theoretically grounded latent-scale residual variance exists.
- **Linear Regression**: Five elements are registered — coefficient table, ANOVA Type I, ANOVA Type III, Prediction Intervals (per-observation prediction / confidence intervals), and a `model_stats` element. The `model_stats` element renders an OLS Fit section with R², Adjusted R², RMSE, and N observations, plus an Information Criteria section with AIC and BIC. Converged / iterations are not shown because they are trivial for OLS.
- **Random Forest**: A Feature Importance `data_table` element (Feature and MDI columns, plus a Permutation column when permutation importances are available; sorted by Permutation descending when available, otherwise by MDI descending) when `featureImportances` is available, and a `model_stats` element rendering a Model Configuration section (Task Type, Number of Trees, Max Depth, Min Samples Split, Min Samples Leaf, Max Features) and OOB Accuracy (classification) or OOB R² (regression).
- **ANOVA**: ANOVA Table and Group Statistics are registered as `data_table` elements. For one-way ANOVA with a computed Tukey HSD, a Tukey HSD table is also added. No `model_stats` element is registered.

```javascript
const result = await window.midas.reports.addModelSummary('report_001', 'model_001');
// result.data: { reportId, addedText, elementId?, statsElementId?, anovaTypeIElementId?, anovaTypeIIIElementId?, groupStatisticsElementId?, tukeyHSDElementId?, predictionIntervalsElementId?, blupElementId? }
// - elementId: id of the coefficient / Feature Importance / ANOVA Table data_table element (RF: only when featureImportances is non-empty)
// - statsElementId: id of the model_stats element that renders Model Fit / Random Effects / OLS Fit / Model Configuration (not returned for ANOVA)
// - blupElementId: id of the BLUP data_table element for GLMM (only when BLUP data exists)
// - anovaTypeIElementId / anovaTypeIIIElementId / predictionIntervalsElementId: Linear Regression only
// - groupStatisticsElementId: id of the Group Statistics table for ANOVA
// - tukeyHSDElementId: id of the Tukey HSD table for ANOVA (only when Tukey HSD is computed)
```
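
Since the set of returned ids depends on the model type, it can help to collect whichever ones are present into a single list. A small sketch over the keys listed above; `summaryElementIds` and `SUMMARY_ID_KEYS` are hypothetical names.

```javascript
// Keys of result.data that may hold an element id, as listed in the response description.
const SUMMARY_ID_KEYS = [
  'elementId', 'statsElementId', 'blupElementId',
  'anovaTypeIElementId', 'anovaTypeIIIElementId', 'predictionIntervalsElementId',
  'groupStatisticsElementId', 'tukeyHSDElementId',
];

// Hypothetical helper: gather every element id returned by reports.addModelSummary().
function summaryElementIds(data) {
  return SUMMARY_ID_KEYS
    .map((key) => data[key])
    .filter((id) => typeof id === 'string');
}
```

Each id in the list can later be passed to `reports.removeElement()`, for example to undo a summary that was added by mistake.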

When called multiple times for the same model, the coefficients dataset is reused when an existing derived dataset has both the same name and an identical operation definition. Calling this method after refitting the same model with different fit conditions returns `APIResult.success === false` with a `Dataset with name "X" already exists` error, because the derived dataset would have the same name but a different operation definition. To record summaries for multiple fit configurations, either delete the previous model or save the new model under a different name before calling this method. Report elements and text are added anew each time.

#### reports.addGraph(reportId, config) {#reportsaddgraphreportid-config}

Add a graph to a report as a report element. Creates a Custom Graph without opening a tab. `datasetId` accepts a dataset ID or name (case-insensitive). Column names are also resolved case-insensitively. Properties not recognized by `AddGraphInput` or `LayerDefInput` are reported in `result.warnings`. Axis scales (`scales`), per-layer color scales (`layers[].scales`), and facets (`facets`) can be specified. See [configureGraph](#tabsconfiguregraphtabid-config) for the `facets` property reference.

```javascript
const result = await window.midas.reports.addGraph('report_001', {
  datasetId: 'ds_001',
  layers: [
    { geom: { type: 'point' }, aes: { x: 'weight', y: 'height' } }
  ],
  title: 'Weight vs Height',
  aspectRatio: 'custom',
  height: 500,
});
// result.data: { elementId, reportId, renderStatus, renderStatusMessage? }
```

`renderStatus` indicates the rendering outcome. `'ok'` means data points exist and all specified aes properties are applied. `'partial'` means data points exist and will be drawn, but some aes properties were not supported by the geom and were ignored (the graph may not match the intended visualization). `'empty'` means data rows exist but no renderable points remain after type filtering. `'error'` means the graph cannot render due to aesthetic misconfiguration. `renderStatusMessage` provides a reason when the status is not `'ok'`.

The graph is stored as a report element and a `{{graph_builder:elementId}}` reference is appended to the report content.

See [Custom Graph Reference](custom-graph-reference) for the list of geom, stat, and position types. Each Statistic's [`params`](custom-graph-reference#stat-params) are documented there with their accepted values and defaults. For facets, coordinates, and other graph-level options, see [Custom Graph](custom-graph).

`coordinates` accepts `'flipped'` or `'cartesian'`. When `'flipped'` is specified, the X and Y axes are swapped (e.g., vertical bars become horizontal bars). Defaults to `'cartesian'`.

`aspectRatio` accepts `'16:9'`, `'4:3'`, `'1:1'`, `'3:4'`, `'9:16'`, or `'custom'`. Defaults to `'16:9'`. When a preset value other than `'custom'` is specified, the aspect ratio determines the displayed height and `height` is not used for rendering. When `'custom'` is specified, `height` sets the height. `height` defaults to 400, minimum 200, maximum 5000.
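
The interaction between `aspectRatio` and `height` is easy to get wrong, so a pre-flight check on the config can be useful. This is a client-side sketch only: `checkGraphSize` is a hypothetical helper, and whether MIDAS clamps or rejects an out-of-range `height` is not stated above, so the helper merely reports it.

```javascript
// Accepted aspectRatio values as documented for reports.addGraph().
const ASPECT_RATIOS = ['16:9', '4:3', '1:1', '3:4', '9:16', 'custom'];

// Hypothetical pre-flight check for the aspectRatio / height rules.
function checkGraphSize(config) {
  const aspectRatio = config.aspectRatio ?? '16:9'; // documented default
  const height = config.height ?? 400;              // documented default
  const notes = [];
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    notes.push(`unknown aspectRatio: ${aspectRatio}`);
  }
  if (height < 200 || height > 5000) {
    notes.push(`height ${height} is outside 200-5000`);
  }
  if (aspectRatio !== 'custom' && config.height !== undefined) {
    notes.push('height is ignored for rendering unless aspectRatio is "custom"');
  }
  return notes;
}
```

An empty array means the size settings follow the rules described above; otherwise each entry names one issue to review before calling `addGraph()`.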

#### reports.updateElement(reportId, elementId, config) {#reportsupdateelementreportid-elementid-config}

Replace the configuration of an existing `graph_builder` element (full replacement). The element retains its ID and position in the report content. `config` has the same structure as `addGraph` (`datasetId` accepts a dataset ID or name).

```javascript
const result = await window.midas.reports.updateElement('report_001', 'graph-xxx', {
  datasetId: 'ds_001',
  layers: [
    { geom: { type: 'bar' }, aes: { x: 'category', y: 'count' } }
  ],
  title: 'Updated Chart',
});
// result.data: { elementId, reportId, renderStatus, renderStatusMessage? }
```

Passing the ID of an element whose type is not `graph_builder` (e.g. `data_table`, `model_stats`) returns an error.

#### reports.removeElement(reportId, elementId) {#reportsremoveelementreportid-elementid}

Remove an element from a report and its `{{type:elementId}}` reference from the content. Supports all element types (`graph_builder`, `data_table`, `model_stats`, `crosstab`, `statistics_summary`, `anova`). Associated resources (derived datasets, models) are not deleted.

```javascript
const result = await window.midas.reports.removeElement('report_001', 'graph-xxx');
// result.data: { elementId, reportId }
```

#### reports.remove(reportId) {#reportsremovereportid}

Remove a report from the project. Closes any open tabs that reference the report.

```javascript
await window.midas.reports.remove('report_001');
```

### layout {#layout}

#### layout.split(config) {#layoutsplitconfig}

Split a pane to create a new area.

```javascript
const result = await window.midas.layout.split({
  tabId: 'tab_001',
  direction: 'horizontal'  // 'horizontal' or 'vertical'
});
// result.data: { newPaneId: 'pane_...', originalPaneId: 'pane_...' }
```

Use the returned `newPaneId` with [`tabs.moveToPane()`](#tabsmovetopanetabid-topaneid) to place tabs in the new pane.

## Error Codes {#error-codes}

| Code | Description |
|------|-------------|
| `ERROR` | General error |
| `NO_PROJECT` | No project is loaded |
| `NOT_FOUND` | Specified resource not found |
| `DATASET_NOT_FOUND` | No dataset matching the table name |
| `COLUMN_NOT_FOUND` | Column not found |
| `INVALID_TAB_TYPE` | Invalid tab type |
| `INVALID_TAB_TYPE_FOR_OPERATION` | Tab type does not support this operation |
| `INVALID_GRAPH_TYPE` | Layer operation attempted on non-custom graph |
| `INVALID_INPUT` | Invalid input parameter |
| `INDEX_OUT_OF_RANGE` | Layer index out of range |
| `DATASET_ALREADY_EXISTS` | Dataset with the same name (case-insensitive) already exists (when overwrite is false) |
| `SELF_REFERENCE` | The dataset being overwritten is a dependency (ancestor) of the operation |
| `NAME_CONFLICT` | Output name of a derived method collides with an existing primary dataset |
| `OPERATION_TYPE_MISMATCH` | The existing derived dataset was created by a different method. Each derived method (`derive`, `addColumns`, `setColumnSchema`, `addOrthogonalPolynomials`) writes its own operation type. Overwriting across types is rejected — e.g., calling `derive()` on a dataset originally created by `addColumns()`, or calling `addColumns()` on a dataset created by `derive()` |
| `AMBIGUOUS_TABLE_NAME` | Multiple datasets match the table name case-insensitively |
| `EXECUTION_ERROR` | SQL execution error |
| `UNSUPPORTED_MODEL_TYPE` | Unsupported model type |
| `MODEL_EXECUTION_ERROR` | Model execution error |
| `NUMERICAL_ERROR` | Numerical computation error (e.g., matrix singularity) |
| `INSUFFICIENT_DATA` | Insufficient valid observations |
| `NO_DATA` | Dataset has no data loaded |
| `NO_CONTAINER` | No active container |
| `NO_TARGET` | No table reference in SQL |
| `NO_CONFIG` | No Graph Builder configuration |
| `SPLIT_FAILED` | Pane split failed |
| `SANDBOX_MODE` | Cannot save in sandbox mode |
| `ENUM_ALREADY_EXISTS` | Enum definition already exists |
| `ENUM_NOT_FOUND` | Enum definition not found |
| `ENUM_IN_USE` | Enum definition is referenced by columns |
| `ENUM_VALUE_MISMATCH` | Data contains values outside the enum definition |
| `FETCH_ERROR` | URL fetch failure (network error, timeout, HTTP error) |
| `USER_CANCELLED` | User cancelled the operation (e.g. declined unsaved changes confirmation) |

## Reference {#reference}

- Live reference: Run `window.midas.help()` in the project screen
