Feature Tags
The Feature Analysis panel automatically tags each column to highlight important characteristics of your data. Each tag is colour-coded to indicate its level of significance.
The tags are organised into three main categories:
Tag Category | Tag Name | Description |
---|---|---|
Distribution | Small number of outliers | Column contains few statistical outliers |
Distribution | Medium number of outliers | Column contains a moderate number of statistical outliers |
Distribution | Significant number of outliers | Column contains many statistical outliers |
Distribution | High-skewness | Data shows significant asymmetry |
Distribution | Low-variance | Data shows minimal variation |
Distribution | High-variance | Data shows significant spread |
Distribution | Zeros-ratio | High proportion of zero values |
Pattern | Chronological-order | Values in ascending time sequence |
Pattern | Reverse-chronological-order | Values in descending time sequence |
Pattern | All-unique-value | Every value is unique |
Pattern | Unary | Only one unique value |
Pattern | High-cardinality | Large number of unique values |
Balance | Multi-imbalance | Uneven distribution across categories |
Balance | Multi-balance | Even distribution across categories |
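As a rough illustration of how tags like these could be derived from a single numeric column, here is a minimal sketch using only the Python standard library. It is not the panel's implementation: the helper names (`outlier_fraction`, `suggest_tags`), the use of the 1.5 × IQR rule for outliers, and every threshold are assumptions chosen for the example.

```python
from statistics import quantiles

# All thresholds are illustrative assumptions, not the panel's real cut-offs.
ZERO_RATIO_THRESHOLD = 0.5       # share of zeros that triggers "Zeros-ratio"
HIGH_CARDINALITY_RATIO = 0.9     # distinct/total ratio for "High-cardinality"
OUTLIER_BANDS = [
    (0.01, "Small number of outliers"),
    (0.05, "Medium number of outliers"),
]


def outlier_fraction(values):
    """Fraction of values outside the 1.5 * IQR fences (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return sum(1 for v in values if v < low or v > high) / len(values)


def suggest_tags(values):
    """Return tag names for one numeric column (hypothetical helper)."""
    n = len(values)
    if n == 0:
        return []
    tags = []
    distinct = len(set(values))

    # Pattern tags based on the number of distinct values.
    if distinct == 1:
        tags.append("Unary")
    elif distinct == n:
        tags.append("All-unique-value")
    elif distinct / n >= HIGH_CARDINALITY_RATIO:
        tags.append("High-cardinality")

    # Distribution tag for a high proportion of zeros.
    if sum(1 for v in values if v == 0) / n >= ZERO_RATIO_THRESHOLD:
        tags.append("Zeros-ratio")

    # Outlier tags: pick the first band the outlier fraction fits into.
    if distinct > 1:
        frac = outlier_fraction(values)
        if frac > 0:
            label = "Significant number of outliers"
            for limit, name in OUTLIER_BANDS:
                if frac <= limit:
                    label = name
                    break
            tags.append(label)
    return tags
```

For example, `suggest_tags([0, 0, 0, 0, 0])` yields both `Unary` and `Zeros-ratio`, which shows that a column can carry several tags at once.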
Feature Tags use a traffic-light colour system to indicate different levels of significance or potential data quality concerns, e.g. imbalanced datasets, high- or low-variance features, or the presence of outliers.
Colour | Significance | Example Tags |
---|---|---|
Green | Good/Normal condition | Multi-balance, Normal distribution |
Yellow | Warning/Moderate concern | Small number of outliers, Minor skewness |
Red | Critical/Significant concern | Significant number of outliers, Multi-imbalance |
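If you export tags and want to summarise them per column, the colour table above can be expressed as a simple lookup. The sketch below is only an illustration: the mapping contains just the example tags listed in the table, and the names `TAG_COLOURS`, `SEVERITY` and `worst_colour` are hypothetical, not part of the product.

```python
# Ordering of the traffic-light colours from least to most severe.
SEVERITY = {"Green": 0, "Yellow": 1, "Red": 2}

# Only the example tags from the colour table; other tags are left unmapped
# because their colours are not stated there.
TAG_COLOURS = {
    "Multi-balance": "Green",
    "Normal distribution": "Green",
    "Small number of outliers": "Yellow",
    "Minor skewness": "Yellow",
    "Significant number of outliers": "Red",
    "Multi-imbalance": "Red",
}


def worst_colour(tags):
    """Return the most severe colour among the known tags, or None."""
    colours = [TAG_COLOURS[t] for t in tags if t in TAG_COLOURS]
    if not colours:
        return None
    return max(colours, key=lambda c: SEVERITY[c])
```

For instance, a column tagged `Multi-balance` and `Small number of outliers` would be summarised as `Yellow`, since the more severe colour wins.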