Data Preparation and Import

Loading a File

Click the Open File button on the launcher screen and select a file. To use sample data, choose from the "Sample Data" section on the launcher screen. See Getting Started for detailed steps.

Supported File Formats

MIDAS supports four file formats: CSV, TSV, MDS, and ZIP.

CSV (Comma-Separated Values) - The most common data format. Columns are separated by commas (,). File extension is typically .csv.

TSV (Tab-Separated Values) - A file format where columns are separated by tab characters. File extension is .tsv or .txt.

MDS (MIDAS Project File) - MIDAS's native project file format. Contains datasets, analysis settings, and reports. See Project File (MDS) for details.

ZIP (Multiple CSV/TSV Files) - A ZIP archive containing CSV or TSV files. Each file is imported as a separate dataset.
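If you have several CSV or TSV files to import together, one way to build the archive is with Python's standard zipfile module. This is a minimal sketch: the function name bundle_for_import is hypothetical, and placing every file at the top level of the archive is an assumption, since the layout MIDAS expects inside the ZIP is not specified here.

```python
import zipfile
from pathlib import Path


def bundle_for_import(paths, archive_path):
    """Write the given CSV/TSV files into one ZIP archive and return its path."""
    allowed = {".csv", ".tsv", ".txt"}  # extensions listed for CSV and TSV above
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, paths):
            if path.suffix.lower() not in allowed:
                raise ValueError(f"not a CSV/TSV file: {path.name}")
            # Assumption: store entries flat, without their directory part.
            archive.write(path, arcname=path.name)
    return archive_path
```

For example, bundle_for_import(["sales.csv", "customers.tsv"], "import.zip") produces an archive with two entries, which would then be imported as two datasets.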

Excel files (.xlsx) cannot be loaded directly. Save your spreadsheet as CSV from Excel's "Save As" menu.

Character Encoding

UTF-8, Shift-JIS, and EUC-JP encodings are supported. Encoding is auto-detected. When saving CSV from Excel, UTF-8 is recommended: select "CSV UTF-8 (Comma delimited)" format.
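If you would rather normalize a file to UTF-8 yourself before importing it, a short script can do the re-encoding. The sketch below assumes you already know the source encoding; convert_to_utf8 is a hypothetical helper, and "shift_jis" and "euc_jp" are Python's codec names for Shift-JIS and EUC-JP.

```python
def convert_to_utf8(src_path, dst_path, src_encoding="shift_jis"):
    """Re-encode a text file as UTF-8 and return the number of characters copied."""
    # newline="" leaves the file's original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
    return len(text)
```

Decoding raises UnicodeDecodeError if the file is not actually in the stated encoding, which is a useful signal that the wrong source encoding was chosen.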

File Structure

MIDAS treats the first row as a header row. The values in the first row become column names, and subsequent rows become data. If your CSV does not have a header row, uncheck the "First row is header" checkbox in the Import Data dialog preview. MIDAS then generates column names automatically (Column1, Column2, ...) and treats the first row as data.

If the header row is empty or contains only blank cells, or if any row has a different number of columns than the header, MIDAS rejects the file with an error instead of silently dropping data. Fix the file in a text editor and retry the import.

Example:

Name,Age,Country
Alice,25,USA
Bob,30,Japan
Charlie,28,UK
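Because a blank header or a row with the wrong number of columns causes the import to be rejected, it can help to check a file before loading it. The following is a minimal sketch using Python's csv module; check_structure is a hypothetical helper that mirrors the two rules described above, not MIDAS's own validation code, and its messages are illustrative.

```python
import csv


def check_structure(path, delimiter=",", encoding="utf-8"):
    """Return a list of structural problems found in a CSV/TSV file (empty if none)."""
    with open(path, newline="", encoding=encoding) as handle:
        rows = list(csv.reader(handle, delimiter=delimiter))
    # An absent first row, or one made only of blank cells, counts as an empty header.
    if not rows or all(cell.strip() == "" for cell in rows[0]):
        return ["header row is empty or contains only blank cells"]
    expected = len(rows[0])
    problems = []
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != expected:
            problems.append(f"row {number}: {len(row)} columns, expected {expected}")
    return problems
```

Run against the example file above, check_structure returns an empty list. Pass delimiter="\t" for TSV files.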

Missing Values

Empty cells in CSV files are loaded as missing values (null). Strings such as "NA" or "-" are not treated as missing values; they are loaded as text. Missing values are excluded from statistical calculations and graph rendering. Rows containing missing values are not removed from the dataset.
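Since placeholders such as "NA" or "-" are loaded as text, a file that uses them for missing data can be cleaned before import by replacing those cells with empty ones. This is a minimal sketch with a hypothetical helper, blank_out_placeholders; the placeholder list is an assumption you should adjust to your own data, and the header row is copied unchanged.

```python
import csv


def blank_out_placeholders(src_path, dst_path, placeholders=("NA", "-"), delimiter=","):
    """Copy a CSV file, emptying placeholder cells; return how many cells were emptied."""
    replaced = 0
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter)
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)  # column names are never rewritten
        for row in reader:
            cleaned = []
            for cell in row:
                if cell.strip() in placeholders:
                    cleaned.append("")  # an empty cell is loaded as a missing value
                    replaced += 1
                else:
                    cleaned.append(cell)
            writer.writerow(cleaned)
    return replaced
```

Writing to a separate destination file keeps the original intact, so you can compare the two if a legitimate "-" value was emptied by mistake.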

Data Types

MIDAS automatically determines data types when loading.

boolean - Boolean values represented by true/false, 1/0, yes/no, y/n, etc.

int64 (integer) - Numbers without decimal points (e.g., 1, 42, -10).

float64 (floating point) - Numbers with decimal points (e.g., 3.14, 0.5, -2.71).

date - Date data (e.g., 2025-11-17, 2025/11/17).

datetime - Data including both date and time (e.g., 2025-11-17 14:30:00).

string - Text data that does not match any of the above types.

enum - Categorical data with a fixed set of ordered values. Create enum definitions and convert string columns to enum type. See Enum Definitions for details.

The data type is displayed below each column name (for example, int64). If a type is detected incorrectly, use Column Type Conversion to fix it. Conversion does not modify the original dataset; the result is created as a new dataset.
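To get a rough idea of which type a column is likely to receive, you can approximate the detection outside MIDAS. The sketch below is illustrative only and is not MIDAS's actual detection logic: the function infer_type, the order in which types are tried, the accepted spellings, and the handling of all-empty columns are all assumptions based on the examples listed above. Empty cells are skipped, and enum is left out because it is assigned through conversion.

```python
import re
from datetime import datetime

BOOLEAN_WORDS = {"true", "false", "1", "0", "yes", "no", "y", "n"}
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)")
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d")
DATETIME_FORMATS = ("%Y-%m-%d %H:%M:%S",)


def _parses_as(value, formats):
    """Return True if the value matches at least one strptime format exactly."""
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_type(values):
    """Guess a type name for a column from its non-empty cells (assumed rules)."""
    cells = [v.strip() for v in values if v.strip() != ""]
    if not cells:
        return "string"  # assumption: an all-empty column falls back to string
    checks = [
        ("boolean", lambda v: v.lower() in BOOLEAN_WORDS),
        ("int64", lambda v: INT_RE.fullmatch(v) is not None),
        ("float64", lambda v: FLOAT_RE.fullmatch(v) is not None),
        ("date", lambda v: _parses_as(v, DATE_FORMATS)),
        ("datetime", lambda v: _parses_as(v, DATETIME_FORMATS)),
    ]
    # The first type that every non-empty cell satisfies wins.
    for name, matches in checks:
        if all(matches(v) for v in cells):
            return name
    return "string"
```

Under these assumed rules, a column containing only 1 and 0 is reported as boolean rather than int64, and a column mixing 1 and 2.5 is reported as float64. If the type MIDAS shows differs from what you expect, Column Type Conversion is the way to correct it.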

Measurement Scales

MIDAS automatically assigns a measurement scale (Nominal, Ordinal, Interval, Ratio) to each column. Measurement scales affect the available graph types and statistical methods. Right-click a column in the Data Table to change its scale.

See Data Types and Measurement Scales for what each scale means and how it affects analysis.

Next steps

Data Table - View, filter, and sort your loaded data
Creating Graphs - Visualize your data