Data Types and Measurement Scales
MIDAS automatically infers data types and measurement scales when loading data. Data type and measurement scale are independent concepts: data type describes how the value is represented (number, date, text, etc.), while measurement scale describes its analytical properties (nominal, ordinal, interval, ratio). Measurement scales filter which items appear in Basic Statistics — statistics that do not fit the scale are not displayed — so verify the scale is correct after loading.
See Data Preparation and Import for instructions on loading data and changing types.
Measurement Scales
Measurement scales classify which operations are meaningful for a given kind of data. They are based on Stevens' (1946) four levels of measurement. The scales form a hierarchy (nominal < ordinal < interval < ratio): each higher scale supports every operation available to the scales below it, plus additional ones.
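The hierarchy can be sketched in a few lines of Python. This is an illustration of the concept only, not MIDAS code: the names `Scale`, `OPERATIONS_ADDED`, and `meaningful_operations` are hypothetical, and the operation labels are shorthand for the descriptions in the sections that follow.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Stevens' four levels, numbered so that comparison reflects the hierarchy."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# The operation each scale adds on top of the scales below it.
OPERATIONS_ADDED = {
    Scale.NOMINAL: "equality",
    Scale.ORDINAL: "ordering",
    Scale.INTERVAL: "difference",
    Scale.RATIO: "ratio",
}


def meaningful_operations(scale: Scale) -> list[str]:
    """Return every operation meaningful at `scale`, including inherited ones."""
    return [op for level, op in OPERATIONS_ADDED.items() if level <= scale]


print(meaningful_operations(Scale.ORDINAL))
print(meaningful_operations(Scale.RATIO))
```

Because each level inherits from the ones below it, assigning a scale that is too high makes meaningless operations appear valid, while a scale that is too low hides operations that would be valid.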
Nominal Scale
Data representing categories with no meaningful order. Only equality comparisons (=, ≠) are meaningful.
Examples: Gender (male/female), colors (red/blue/green), country names
Ordinal Scale
Categories with a meaningful order, but no defined interval between values. Order comparisons (<, >) are meaningful.
Examples: Satisfaction (low/medium/high), grade level (1st/2nd/3rd year), grades (A/B/C/D)
Interval Scale
Equally spaced numeric data where differences are meaningful, but ratios are not. The zero point is arbitrary.
Examples: Temperature (Celsius), year (AD)
- The difference between 20°C and 10°C is a meaningful 10°C
- However, 20°C is not "twice as warm" as 10°C
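
One way to see why ratios fail on an interval scale is to express the same two temperatures in a different unit. The sketch below uses Fahrenheit, which is not mentioned above and is chosen here only as a second scale with a different arbitrary zero. The helper name `celsius_to_fahrenheit` is illustrative.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32


warm_c, cool_c = 20.0, 10.0
warm_f, cool_f = celsius_to_fahrenheit(warm_c), celsius_to_fahrenheit(cool_c)

# Differences convert consistently: a 10-degree gap in Celsius
# is an 18-degree gap in Fahrenheit, whatever the starting point.
diff_c = warm_c - cool_c   # 10.0
diff_f = warm_f - cool_f   # 18.0

# Ratios do not: the same pair of temperatures gives a different
# ratio in each unit, because each unit places zero somewhere else.
ratio_c = warm_c / cool_c  # 2.0
ratio_f = warm_f / cool_f  # 1.36

print(diff_c, diff_f, ratio_c, ratio_f)
```

The ratio changes with the unit, so it says nothing about the temperatures themselves. On a ratio scale such as weight, the ratio is the same in kilograms and in pounds.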
Ratio Scale
Equally spaced numeric data with a true zero point. Both differences and ratios are meaningful.
Examples: Height, weight, price, age
- The difference between 20kg and 10kg is a meaningful 10kg
- Furthermore, 20kg is "twice as heavy" as 10kg
Ratio scale is not assigned by auto-inference. Whether zero represents a true origin depends on the meaning of the data and cannot be determined from values alone. For data with a true zero point such as height or weight, right-click the column in the Data Table and set ratio scale from Edit Scale of Measurement.
Auto-Inference from Data Types to Measurement Scales
MIDAS determines data types on import and automatically assigns a measurement scale to each column. The inferred scale is only a starting point: after loading, data type and measurement scale can be changed independently.
| Data Type | Inferred Scale | Reason |
|---|---|---|
| boolean | Nominal | true/false are unordered categories |
| int64 | Interval | Whether zero represents a true origin depends on the data |
| float64 | Interval | Same as integers |
| date | Interval | Date differences are meaningful, but ratios typically are not |
| datetime | Interval | Same as dates |
| string | Nominal | Text is treated as categories |
| enum | Nominal | Can be changed to ordinal when order is defined |
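
For reference, the table above can be written as a plain lookup. This is a restatement of the table in Python, not MIDAS's implementation. The names `INFERRED_SCALE` and `infer_scale` are hypothetical.

```python
# Data type -> scale assigned by auto-inference, copied from the table above.
INFERRED_SCALE = {
    "boolean": "nominal",
    "int64": "interval",
    "float64": "interval",
    "date": "interval",
    "datetime": "interval",
    "string": "nominal",
    "enum": "nominal",
}


def infer_scale(data_type: str) -> str:
    """Look up the initial scale for a data type; raises KeyError if unknown."""
    return INFERRED_SCALE[data_type]


# Scales that auto-inference never produces and that must be set by hand.
never_inferred = {"nominal", "ordinal", "interval", "ratio"} - set(INFERRED_SCALE.values())
print(sorted(never_inferred))
```

Written this way, it is easy to see that only nominal and interval are ever assigned automatically. Ordinal and ratio columns always need a manual change after loading.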
Auto-inference may not match the actual meaning of the data. For example, postal codes and ID columns are loaded as numeric and assigned interval scale, but they are semantically nominal. For these columns, right-click the column in the Data Table and select Edit Scale of Measurement to change to the appropriate scale. Note that numeric strings with leading zeros (0060001, 001, etc.) are automatically loaded as string, so leading zeros are preserved.
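The reason leading zeros matter can be shown with a short Python sketch. It does not reproduce how MIDAS detects such columns. It only demonstrates what would be lost if values like these were parsed as integers. The function name `survives_integer_round_trip` is hypothetical.

```python
def survives_integer_round_trip(text: str) -> bool:
    """True if converting `text` to int and back reproduces it exactly."""
    return str(int(text)) == text


codes = ["0060001", "001", "60001"]
for code in codes:
    # int() silently drops leading zeros, so "0060001" becomes 60001.
    print(code, int(code), survives_integer_round_trip(code))
```

A value that fails the round trip, such as a postal code, is an identifier rather than a quantity, which is also why nominal is the appropriate scale for it.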
For ordered categories stored as text, create an Enum definition and convert the column to Enum type. Defining the order in an Enum does not automatically make the column ordinal, so after conversion, change to ordinal from Edit Scale of Measurement.
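The following Python sketch illustrates why ordered categories stored as plain text need an explicit order definition. It is a conceptual example and does not use any MIDAS API. The names `LEVELS`, `RANK`, and `responses` are made up for illustration.

```python
# Explicit category order, playing the role of an Enum definition.
LEVELS = ["low", "medium", "high"]
RANK = {label: position for position, label in enumerate(LEVELS)}

responses = ["medium", "high", "low", "high"]

# Plain text sorts alphabetically, which puts "high" before "low".
alphabetical = sorted(responses)

# Sorting by the defined rank restores the intended order.
by_rank = sorted(responses, key=RANK.__getitem__)

print(alphabetical)
print(by_rank)
```

Without the explicit order, any order-based operation would follow the alphabet rather than the meaning of the categories.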
References
- Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2684), 677-680. https://www.jstor.org/stable/1671815
See also
- Data Preparation and Import - Data type list and type conversion
- Basic Statistics - Statistics displayed by measurement scale