Feature Tags

The Feature Analysis panel uses a comprehensive tagging system to highlight important characteristics of your data. These tags are automatically generated and use a colour-coded system to indicate different levels of significance.

The tags are organised into three main categories:

| Tag Category | Tag Name | Description |
| --- | --- | --- |
| Distribution | Small number of outliers | Column contains few statistical outliers |
| Distribution | Medium number of outliers | Column contains moderate outliers |
| Distribution | Significant number of outliers | Column contains many outliers |
| Distribution | High-skewness | Data shows significant asymmetry |
| Distribution | Low-variance | Data shows minimal variation |
| Distribution | High-variance | Data shows significant spread |
| Distribution | Zeros-ratio | High proportion of zero values |
| Pattern | Chronological-order | Values in ascending time sequence |
| Pattern | Reverse-chronological-order | Values in descending time sequence |
| Pattern | All-unique-value | Every value is unique |
| Pattern | Unary | Only one unique value |
| Pattern | High-cardinality | Large number of unique values |
| Balance | Multi-imbalance | Uneven distribution across categories |
| Balance | Multi-balance | Even distribution across categories |
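To make the tag descriptions more concrete, the sketch below shows one way a few of them could be computed for a single column. It is not the panel's actual implementation: the function names, the 1.5 × IQR outlier rule, the strict ordering check, and every threshold (`medium_share`, `significant_share`, `zero_ratio`, `skew_limit`) are assumptions chosen for illustration only.

```python
from statistics import mean, pstdev, quantiles


def order_tag(values):
    """Pattern tag for a sequence of comparable timestamps, or None.

    Strict ordering (no repeated timestamps) is an assumption.
    """
    pairs = list(zip(values, values[1:]))
    if pairs and all(a < b for a, b in pairs):
        return "Chronological-order"
    if pairs and all(a > b for a, b in pairs):
        return "Reverse-chronological-order"
    return None


def outlier_tag(values, medium_share=0.01, significant_share=0.05):
    """Outlier tag based on the 1.5 * IQR rule; both cut-offs are assumed."""
    if len(values) < 4:
        return None
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    outliers = sum(1 for v in values if v < q1 - spread or v > q3 + spread)
    if outliers == 0:
        return None
    share = outliers / len(values)
    if share >= significant_share:
        return "Significant number of outliers"
    if share >= medium_share:
        return "Medium number of outliers"
    return "Small number of outliers"


def numeric_tags(values, zero_ratio=0.5, skew_limit=1.0):
    """Illustrative tags for a numeric column; thresholds are assumptions."""
    n = len(values)
    if n == 0:
        return []
    tags = []
    distinct = len(set(values))
    if distinct == 1:
        tags.append("Unary")
    elif distinct == n:
        tags.append("All-unique-value")
    if values.count(0) / n >= zero_ratio:
        tags.append("Zeros-ratio")
    sd = pstdev(values)
    if sd > 0:
        # Population skewness: mean cubed deviation divided by sd cubed.
        m = mean(values)
        skew = sum((v - m) ** 3 for v in values) / (n * sd ** 3)
        if abs(skew) >= skew_limit:
            tags.append("High-skewness")
    return tags
```

For example, under these assumed thresholds `numeric_tags([7, 7, 7])` yields only the Unary tag, and a column of mostly zeros with one large value picks up both Zeros-ratio and High-skewness.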

Feature tags use a traffic-light colour system to indicate the level of significance of a potential data quality concern, e.g. imbalanced datasets, high- or low-variance features, or the presence of outliers.

| Colour | Significance | Example Tags |
| --- | --- | --- |
| Green | Good/Normal condition | Multi-balance, Normal distribution |
| Yellow | Warning/Moderate concern | Small number of outliers, Minor skewness |
| Red | Critical/Significant concern | Significant number of outliers, Multi-imbalance |
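The colour scheme can be read as a simple lookup from tag to colour. The sketch below covers only the example tags listed above. The names `TAG_COLOURS`, `SEVERITY`, `colour_for`, and `most_severe` are hypothetical, and treating Red as more severe than Yellow, and Yellow as more severe than Green, is an assumption drawn from the traffic-light analogy rather than documented behaviour.

```python
# Example tags only; colours for the remaining tags are not documented here.
TAG_COLOURS = {
    "Multi-balance": "Green",
    "Normal distribution": "Green",
    "Small number of outliers": "Yellow",
    "Minor skewness": "Yellow",
    "Significant number of outliers": "Red",
    "Multi-imbalance": "Red",
}

# Assumed ordering from least to most severe.
SEVERITY = ["Green", "Yellow", "Red"]


def colour_for(tag):
    """Return the colour for a known example tag, or None if unlisted."""
    return TAG_COLOURS.get(tag)


def most_severe(tags):
    """Return the most severe colour among the given tags, or None."""
    colours = [c for c in map(colour_for, tags) if c is not None]
    return max(colours, key=SEVERITY.index, default=None)
```

A column tagged with both Multi-balance and Small number of outliers would, under this assumed ordering, be summarised as Yellow.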