Correlation ratio
Definition
The Correlation Ratio is a measure of the relationship between a numerical feature and a categorical feature. It captures how much the numerical feature's distribution varies across the categories of the categorical feature.
Range of score: 0 to 1
A score of 0 indicates no association, meaning that knowing the category of the categorical feature provides no information about the numerical feature.
A score of 1, on the other hand, indicates a perfect association, meaning that the category fully determines the value of the numerical feature: all values within each category are identical.
How it works
The Correlation Ratio is calculated by comparing the variance within each category to the total variance of the dataset. Concretely, the squared ratio equals the between-category variance (the size-weighted variance of the category means) divided by the total variance, which is the same as 1 minus the share of total variance found within categories; the Correlation Ratio itself is its square root. If the variance within each category is small relative to the total variance, the categorical feature explains most of the variation in the numerical feature, and the Correlation Ratio will be close to 1. Conversely, if the variance within each category is nearly as large as the total variance, the categorical feature has little effect on the numerical feature, and the Correlation Ratio will be close to 0.
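The following is a minimal sketch of this calculation in Python, assuming NumPy is available; the function name correlation_ratio and the example arrays are illustrative, not part of any particular library.

```python
import numpy as np

def correlation_ratio(categories, values):
    """Correlation ratio between a categorical and a numerical array.

    Sketch: the squared ratio is the between-category variation divided
    by the total variation of the numerical feature.
    """
    categories = np.asarray(categories)
    values = np.asarray(values, dtype=float)

    overall_mean = values.mean()

    # Between-category variation: squared deviation of each category mean
    # from the overall mean, weighted by category size.
    between = 0.0
    for cat in np.unique(categories):
        group = values[categories == cat]
        between += len(group) * (group.mean() - overall_mean) ** 2

    # Total variation of the numerical feature.
    total = ((values - overall_mean) ** 2).sum()

    return np.sqrt(between / total) if total > 0 else 0.0

# Hypothetical example: the category separates the values almost completely,
# so the score is close to 1; identical distributions across categories
# would give a score near 0.
prices = [10, 11, 12, 20, 21, 22]
fuel = ["gas", "gas", "gas", "diesel", "diesel", "diesel"]
print(correlation_ratio(fuel, prices))  # ~0.99
```

In this sketch the within-category variance never has to be computed explicitly, since the between-category and within-category variations sum to the total variation.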