Model Validation

Model validation assesses how well a machine learning model generalises to unseen data by measuring its performance on validation sets that are kept separate from the data used for training.

Selecting a Model Validation Method in evoML

Create a New Trial
Under Splitting Options, select a Model Validation method. The available approaches are:
- Holdout (all tasks)
- K-fold cross validation (classification & regression)
- Sliding window (timeseries)
- Expanding window (timeseries)

Model Validation Options

1. Holdout (all tasks)

The dataset is split into two subsets: one for training and one for validation.

Setting	Details
Size	The fraction of the training dataset to include in the validation subset.
Keep order	Whether to keep the original row order when splitting. When disabled, the data is shuffled first.
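
As an illustration of what these two settings control, here is a minimal sketch of a holdout split in plain Python. It is not evoML's implementation: the function name `holdout_split`, its parameters, and the `seed` argument are hypothetical. It assumes that Keep order means the rows are not shuffled and that the validation subset is taken from the end of the data.

```python
import random


def holdout_split(n_rows, size=0.2, keep_order=True, seed=0):
    """Return (train_indices, validation_indices) for a holdout split.

    Assumption: `size` is the validation fraction, and the validation
    rows are the last ones after optional shuffling.
    """
    if not 0 < size < 1:
        raise ValueError("size must be a fraction between 0 and 1")
    indices = list(range(n_rows))
    if not keep_order:
        # Shuffle reproducibly when the original order need not be kept.
        random.Random(seed).shuffle(indices)
    n_validation = int(round(n_rows * size))
    cut = n_rows - n_validation
    return indices[:cut], indices[cut:]


train_idx, val_idx = holdout_split(10, size=0.2, keep_order=True)
print(train_idx)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(val_idx)    # [8, 9]
```

Keeping the order matters whenever neighbouring rows are related (for example, rows sorted by time), because shuffling would then leak information from the validation subset into training.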

2. K-fold Cross Validation (Classification & Regression)

The dataset is divided into K subsets (folds). The model is trained on K-1 subsets and validated on the remaining one. This is repeated K times so that each subset serves as the validation set exactly once.

Setting	Details
K	Number of subsets into which to divide the training data.
Keep order	Whether to keep the original row order when splitting. When disabled, the data is shuffled first.
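
The rotation of folds can be sketched as follows. This is an illustrative plain-Python version, not evoML's implementation: `k_fold_splits` and its parameters are hypothetical names, and the handling of rows that do not divide evenly by K (the first folds receive one extra row) is an assumption.

```python
import random


def k_fold_splits(n_rows, k=5, keep_order=True, seed=0):
    """Yield (train_indices, validation_indices) once per fold."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    if not keep_order:
        random.Random(seed).shuffle(indices)

    # Build K contiguous folds; the first `extra` folds get one more row.
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        stop = start + base + (1 if i < extra else 0)
        folds.append(indices[start:stop])
        start = stop

    # Each fold is the validation set once; the rest form the training set.
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


for train_idx, val_idx in k_fold_splits(6, k=3):
    print(train_idx, val_idx)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Every row appears in a validation set exactly once, so the K validation scores together cover the whole training dataset.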

3. Sliding Window (Timeseries)

The model is trained on a fixed-length training window and validated on the forecast window that follows it. Both windows move forward in time by a defined slide length between rounds. An optional gap can be added between the training and forecast windows.

Setting	Details
Evaluation Window	Number of time steps in the forecast window (typically 15% of the dataset).
Train Window	Number of time steps in the training window (typically 45% of the dataset).
Slide	Number of time steps by which both windows move forward between rounds (typically 15% of the dataset).
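
To make the window arithmetic concrete, the sketch below enumerates the rounds for a series of 100 time steps using the typical proportions from the table. It is an illustration, not evoML's implementation: `sliding_window_splits` and its parameter names are hypothetical, windows are expressed as half-open `(start, end)` index ranges, and stopping once the forecast window no longer fits is an assumption.

```python
def sliding_window_splits(n_steps, train_window, eval_window, slide, gap=0):
    """Yield ((train_start, train_end), (eval_start, eval_end)) per round.

    Ranges are half-open: `end` is one past the last included time step.
    """
    if slide <= 0:
        raise ValueError("slide must be a positive number of time steps")
    start = 0
    # Continue while the whole forecast window still fits in the series.
    while start + train_window + gap + eval_window <= n_steps:
        train_end = start + train_window
        eval_start = train_end + gap
        yield (start, train_end), (eval_start, eval_start + eval_window)
        start += slide  # both windows move forward together


for train, evaluation in sliding_window_splits(100, 45, 15, 15):
    print(train, evaluation)
# (0, 45) (45, 60)
# (15, 60) (60, 75)
# (30, 75) (75, 90)
```

The training window always spans 45 steps, so older observations drop out as the window slides forward.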

4. Expanding Window (Timeseries)

The model is trained on an initial training window that grows by a defined expansion length in each round. The forecast window moves forward by the same length. An optional gap can be added between the training and forecast windows.

Setting	Details
Evaluation Window	Number of time steps in the forecast window (typically 15% of the dataset).
Initial Train Window	Number of time steps in the first round of training (typically 45% of the dataset).
Expansion Length	Number of time steps by which the training window grows in each subsequent round (typically 15% of the dataset).
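
The following sketch enumerates the rounds for a series of 100 time steps using the typical proportions from the table. It is an illustration, not evoML's implementation: `expanding_window_splits` and its parameter names are hypothetical, windows are half-open `(start, end)` index ranges, and it assumes the training window always starts at the first time step and that rounds stop once the forecast window no longer fits.

```python
def expanding_window_splits(n_steps, initial_train_window, eval_window,
                            expansion, gap=0):
    """Yield ((train_start, train_end), (eval_start, eval_end)) per round.

    Ranges are half-open: `end` is one past the last included time step.
    """
    if expansion <= 0:
        raise ValueError("expansion must be a positive number of time steps")
    train_end = initial_train_window
    # Continue while the whole forecast window still fits in the series.
    while train_end + gap + eval_window <= n_steps:
        eval_start = train_end + gap
        yield (0, train_end), (eval_start, eval_start + eval_window)
        train_end += expansion  # training window grows; its start stays at 0


for train, evaluation in expanding_window_splits(100, 45, 15, 15):
    print(train, evaluation)
# (0, 45) (45, 60)
# (0, 60) (60, 75)
# (0, 75) (75, 90)
```

Unlike the sliding window, no earlier observations are discarded: each round trains on everything up to the start of the gap or forecast window.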
