Data Types and Measurement Scales
MIDAS automatically infers data types and measurement scales when loading data. Measurement scales directly affect the available graph types and statistical methods, so verify they are set correctly.
See Data Preparation and Import for instructions on loading data and changing types.
Measurement Scales
Measurement scales classify data by which operations are meaningful for it. The classification follows Stevens' (1946) four levels of measurement.
Nominal Scale
Data representing categories with no meaningful order. Only equality (=, ≠) is meaningful.
Examples: Gender (male/female), colors (red/blue/green), country names
In MIDAS:
- Bar charts and other category-based summaries
- Cross tabulation (chi-square test)
- Group separation by Color/Fill in Graph Builder
Ordinal Scale
Categories with a meaningful order, but no defined interval between values. Comparisons (<, >) are meaningful.
Examples: Satisfaction (low/medium/high), grade level (1st/2nd/3rd year), grades (A/B/C/D)
In MIDAS:
- All nominal operations, plus order-aware graph display
- Defining an order on an enum type makes graph axes follow that order
Interval Scale
Equally spaced numeric data where differences are meaningful, but ratios are not. The zero point is arbitrary.
Examples: Temperature (Celsius), year (AD)
- The difference between 20°C and 10°C is a meaningful 10°C
- However, 20°C is not "twice as warm" as 10°C
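The two points above can be checked with a short Python sketch. It re-expresses the same temperatures in Kelvin, which only moves the zero point: the difference survives the change, the ratio does not. The helper name `celsius_to_kelvin` is illustrative and not part of MIDAS.

```python
import math

def celsius_to_kelvin(celsius: float) -> float:
    """Shift the zero point from the freezing point of water to absolute zero."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences are unaffected by where zero sits (up to floating-point rounding).
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k
print(math.isclose(diff_c, diff_k))

# Ratios depend on the zero point: 2.0 in Celsius, only slightly above 1 in Kelvin.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k
print(ratio_c, round(ratio_k, 4))
```

Because the Celsius ratio is an artifact of an arbitrary zero, statistics built on ratios are reserved for ratio-scale data.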
In MIDAS:
- Histograms, scatter plots, and other continuous value graphs
- Mean, standard deviation, correlation coefficients
- Hypothesis testing (t-test)
Ratio Scale
Equally spaced numeric data with a true zero point. Both differences and ratios are meaningful.
Examples: Height, weight, price, age
- The difference between 20kg and 10kg is a meaningful 10kg
- Furthermore, 20kg is "twice as heavy" as 10kg
In MIDAS:
- All interval scale operations
- Coefficient of variation (CV) and geometric mean calculation
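To see why these two statistics call for a true zero, here is a minimal Python sketch using the standard `statistics` module. It assumes CV is the sample standard deviation divided by the mean; how MIDAS defines CV internally is not stated here, so treat that choice as an assumption.

```python
import math
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values)

weights_kg = [10.0, 20.0, 40.0]
weights_g = [w * 1000 for w in weights_kg]  # unit change: zero stays at zero
shifted = [w + 100 for w in weights_kg]     # zero point moved: no longer ratio data

cv_kg = coefficient_of_variation(weights_kg)
cv_g = coefficient_of_variation(weights_g)
cv_shifted = coefficient_of_variation(shifted)

# CV is unchanged by a change of unit, but changes when the zero point moves.
print(math.isclose(cv_kg, cv_g), cv_shifted < cv_kg)

# The geometric mean of 10, 20, 40 is the cube root of 8000, i.e. close to 20.
gm_kg = statistics.geometric_mean(weights_kg)
print(gm_kg)
```

The same reasoning explains why CV computed on Celsius temperatures would be misleading: shifting the zero changes the mean but not the spread.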
Scales and Analysis Methods
| Analysis Method | Required Scale | MIDAS Feature |
|---|---|---|
| Frequency counts | Nominal or above | Crosstab, Statistics |
| Median, quartiles | Ordinal or above | Statistics |
| Mean, standard deviation | Interval or above | Statistics |
| Correlation coefficient | Interval or above | Statistics (select 2 columns) |
| t-test | Interval or above | Two-Sample Test, Paired Test |
| Mann-Whitney U, Wilcoxon signed-rank | Ordinal or above | Two-Sample Test, Paired Test |
| Regression analysis | Interval or above | Linear Regression, GLM |
| Histogram | Interval or above | Graph Builder |
| Bar chart | Nominal or ordinal | Graph Builder |
| Coefficient of variation, geometric mean | Ratio | Statistics (Comparison) |
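The "or above" rows of this table amount to an ordering of the four scales plus a minimum scale per method. A rough Python sketch of that lookup follows. The names `Scale`, `MIN_SCALE`, and `is_meaningful` are hypothetical and do not reflect MIDAS's internal API. The bar chart row is left out because it is listed for nominal and ordinal data only, not as a minimum.

```python
from enum import IntEnum

class Scale(IntEnum):
    """Stevens' four levels, ordered from least to most structure."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Minimum scale per method, transcribed from the table above.
MIN_SCALE = {
    "frequency counts": Scale.NOMINAL,
    "median": Scale.ORDINAL,
    "mann-whitney u": Scale.ORDINAL,
    "mean": Scale.INTERVAL,
    "correlation coefficient": Scale.INTERVAL,
    "t-test": Scale.INTERVAL,
    "regression analysis": Scale.INTERVAL,
    "histogram": Scale.INTERVAL,
    "coefficient of variation": Scale.RATIO,
}

def is_meaningful(method: str, scale: Scale) -> bool:
    """Return True when the column's scale meets the method's minimum."""
    return scale >= MIN_SCALE[method]

print(is_meaningful("median", Scale.ORDINAL))                 # True
print(is_meaningful("mean", Scale.ORDINAL))                   # False
print(is_meaningful("coefficient of variation", Scale.RATIO)) # True
```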
Auto-Inference from Data Types to Measurement Scales
MIDAS determines data types on import and automatically assigns measurement scales.
| Data Type | Inferred Scale | Reason |
|---|---|---|
| boolean | Nominal | true/false are unordered categories |
| int64 | Ratio | Integers typically have a natural zero point |
| float64 | Ratio | Same as integers |
| date | Interval | Date differences are meaningful, but ratios typically are not |
| datetime | Interval | Same as dates |
| string | Nominal | Text is treated as categories |
| enum | Nominal | Can be changed to ordinal when order is defined |
Auto-inference may not match the actual meaning of the data. For example, postal codes are loaded as numbers but are semantically nominal. Similarly, 5-point Likert items are loaded as integers and should be treated as ordinal rather than ratio. Right-click the column in the Data Table and select Edit Scale to change the measurement scale.
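The default mapping and the manual correction step can be pictured with the following Python sketch. It only illustrates the logic of the table above; the function `infer_scales`, the `overrides` argument, and the column names are hypothetical, and in MIDAS itself the correction is made through Edit Scale, not through code.

```python
# Default scale per data type, transcribed from the table above.
DEFAULT_SCALE = {
    "boolean": "nominal",
    "int64": "ratio",
    "float64": "ratio",
    "date": "interval",
    "datetime": "interval",
    "string": "nominal",
    "enum": "nominal",
}

def infer_scales(column_types, overrides=None):
    """Assign the default scale per column, then apply manual corrections."""
    scales = {name: DEFAULT_SCALE[dtype] for name, dtype in column_types.items()}
    scales.update(overrides or {})
    return scales

# Hypothetical columns: two integer columns whose default of "ratio" is wrong.
columns = {
    "postal_code": "int64",
    "satisfaction": "int64",
    "height_cm": "float64",
    "signup_date": "date",
}

inferred = infer_scales(columns)
corrected = infer_scales(
    columns,
    overrides={"postal_code": "nominal", "satisfaction": "ordinal"},
)
print(inferred["postal_code"], "->", corrected["postal_code"])
print(inferred["satisfaction"], "->", corrected["satisfaction"])
```

Columns without an override, such as `height_cm` and `signup_date`, keep their inferred scale.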
References
- Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2684), 677-680.
See also
- Data Preparation and Import - Data type list and type conversion
- Basic Statistics - Statistics displayed by measurement scale