Task

The EvoML platform automatically detects the machine learning task from the selected target column. It supports the following tasks:

  1. Classification
  2. Regression
  3. Timeseries Classification
  4. Timeseries Regression
  5. NLP Classification
  6. NLP Regression
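
How EvoML actually infers the task is not described here. The following is only a minimal sketch of the kind of rule a target-based detector might apply. The function name `guess_task`, its two flags, and the `max_classes` threshold are hypothetical and are not part of EvoML.

```python
from numbers import Real


def guess_task(target_values, has_time_index=False, has_text_features=False,
               max_classes=20):
    """Guess a task name from the target column (illustrative heuristic only)."""
    values = [v for v in target_values if v is not None]
    distinct = set(values)
    # bool is a subclass of int, so treat booleans as labels, not numbers.
    numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    # A numeric target with many distinct values looks continuous.
    is_regression = numeric and len(distinct) > max_classes
    kind = "Regression" if is_regression else "Classification"
    if has_text_features:
        return f"NLP {kind}"
    if has_time_index:
        return f"Timeseries {kind}"
    return kind
```

Under these assumptions, a target of string labels maps to Classification, while a numeric target with more than `max_classes` distinct values maps to Regression.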

1. Classification

Use Cases:

  • Customer segmentation: Classifying customers into different groups based on behavior, preferences, or demographics.
  • Fraud detection: Identifying fraudulent transactions in a financial dataset.

Features:

  • Multi-class support: The target can have more than two classes (e.g., assigning customers to one of several segments).
  • Imbalanced learning: Handles datasets in which some classes are much more frequent than others.

Workflow:

  • Input: Dataset with labeled categories (e.g., 0, 1, 2 for classes).
  • Target: A categorical column that represents class labels.
  • Model output: The model will predict the class label for each sample.
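
One common way to compensate for class imbalance is to weight each class inversely to its frequency. The sketch below shows that general technique in plain Python. Whether EvoML uses this particular scheme is an assumption, and `balanced_class_weights` is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Made-up target column: class 0 is frequent, class 2 is rare.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
weights = balanced_class_weights(labels)
```

Rare classes receive larger weights, so errors on them count for more during training.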

2. Regression

Use Cases:

  • Price prediction: Predicting house prices, stock prices, etc.
  • Demand forecasting: Estimating future demand for a product or service based on historical data.

Features:

  • Linear and non-linear models: Captures both linear and complex non-linear relationships between the features and the target variable.

Workflow:

  • Input: Dataset whose features are used to predict a numerical value (e.g., house features).
  • Target: A continuous numerical value (e.g., house price, demand).
  • Model output: The model predicts a continuous numerical value.
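
To make the input, target, and output concrete, here is a minimal sketch of the simplest linear case: an ordinary least squares fit with one feature. The data is made up for illustration, and `fit_line` is a hypothetical helper, not an EvoML function.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: floor area in square metres vs. price in thousands.
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]
slope, intercept = fit_line(areas, prices)
predicted = slope * 100 + intercept  # predicted price for a 100 m² house
```

The feature column plays the role of the input, `prices` is the continuous target, and `predicted` is the continuous model output.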

3. Timeseries Classification

Use Cases:

  • Anomaly detection: Identifying unusual patterns or outliers in time-based data.
  • Pattern recognition: Classifying different patterns in time series data (e.g., sensor readings or financial data).

Features:

  • Sequence classification: Recognizing patterns in sequences, such as time-dependent events.
  • Temporal feature extraction: Extracting features related to time, such as trends, seasonality, etc.

Workflow:

  • Input: Sequence data (e.g., a time-series dataset with sequential observations).
  • Target: A categorical label (e.g., anomaly vs normal).
  • Model output: Classifies the sequence into predefined categories.
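
Temporal feature extraction can be illustrated by summarising each window of a series with a few statistics that a classifier could then use. This is a sketch under assumptions: the window size, the chosen statistics, and the helper names are illustrative and are not taken from EvoML.

```python
from statistics import mean, pstdev


def window_features(window):
    """Summarise one window (at least 2 values) as (mean, std, slope)."""
    n = len(window)
    t_mean = (n - 1) / 2
    w_mean = mean(window)
    # Least-squares slope of the values against their time index.
    num = sum((t - t_mean) * (v - w_mean) for t, v in enumerate(window))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return w_mean, pstdev(window), num / den


def sliding_windows(series, size):
    """Split a series into consecutive, non-overlapping windows."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, size)]


# Made-up sensor readings: a flat stretch followed by a steep rise.
readings = [1.0, 1.1, 0.9, 1.0, 1.0, 5.0, 9.0, 13.0]
features = [window_features(w) for w in sliding_windows(readings, 4)]
```

The second window has a much larger mean and slope than the first, which is the kind of signal that separates an anomalous sequence from a normal one.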

4. Timeseries Regression

Use Cases:

  • Stock prediction: Forecasting future stock prices based on past trends.
  • Demand forecasting: Predicting future demand based on historical data.

Features:

  • Advanced forecasting: Uses techniques such as XGBoost to forecast future values.

Workflow:

  • Input: Sequence data (e.g., historical stock prices or product demand).
  • Target: A continuous numerical value (e.g., stock price at time t+1).
  • Model output: Predicts a future continuous value (e.g., price or demand for the next time step).
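
Predicting the value at the next time step is often framed as a supervised problem by turning past values into lag features. The sketch below shows that framing in plain Python. `make_supervised` and `n_lags` are hypothetical names, and whether EvoML builds its features this way is an assumption.

```python
def make_supervised(series, n_lags):
    """Turn a series into (lag-window, next-value) training pairs."""
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        rows.append(series[t - n_lags:t])  # values at t-n_lags .. t-1
        targets.append(series[t])          # value to predict at time t
    return rows, targets


# Made-up historical demand.
demand = [10, 12, 13, 15, 18]
X, y = make_supervised(demand, n_lags=2)
```

Each row of `X` holds the two most recent observations and the matching entry of `y` is the value that followed them, which is the tabular shape a learner such as XGBoost can be trained on.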

5. NLP Classification

Use Cases:

  • Sentiment Analysis: Classifying text (such as customer reviews or social media posts) into categories like positive, neutral, or negative sentiment.
  • Spam Detection: Classifying emails or messages as either spam or not spam based on the content.

Features:

  • Sequence Classification: EvoML can classify text sequences (e.g., sentences or paragraphs) into multiple categories.
  • Multi-class Support: You can classify text into more than two categories (e.g., detecting various types of customer complaints or topics).

Workflow:

  • Input: Text data (e.g., sentences, documents, tweets, reviews).
  • Target: A categorical column representing class labels (e.g., spam or not spam for email classification).
  • Model Output: The model predicts the class label for each text sample (e.g., whether an email is spam, or whether a tweet is positive or negative).
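
As an illustration of this workflow, the sketch below trains a small multinomial naive Bayes classifier on bag-of-words counts. Naive Bayes is swapped in here only because it is easy to show in a few lines. The models EvoML actually uses are not stated in this section, and the example texts are made up.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(texts, labels):
    """Collect per-class word counts, class frequencies, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


texts = ["win a free prize now", "free cash offer",
         "meeting at noon", "lunch at noon tomorrow"]
labels = ["spam", "spam", "not spam", "not spam"]
model = train_naive_bayes(texts, labels)
```

Calling `predict(model, "free prize offer")` returns the label whose training texts share the most vocabulary with the new message.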

6. NLP Regression

Use Cases:

  • Sentiment Scoring: Predicting the sentiment score (e.g., a rating from 1 to 10) of a piece of text based on its content, such as movie reviews or product ratings.
  • Text-based Predictive Models: Using text to predict continuous values like sales volume or customer engagement metrics based on user feedback.

Features:

  • Sequence Regression: Models can be trained to predict continuous numerical values from sequences of text (e.g., predicting a product's expected sales from customer feedback).

Workflow:

  • Input: Text data (e.g., sentences, reviews, articles).
  • Target: A continuous numerical value (e.g., sentiment score, readability score, product sales).
  • Model Output: The model predicts a continuous value (e.g., a sentiment score of 4.5 out of 5, or expected sales estimated from reviews).
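
The same workflow with a numerical target can be sketched with a deliberately simple baseline: each word is scored by the mean rating of the training texts that contain it, and a new text is scored by averaging its known words. This is a toy stand-in for a real text regressor, not EvoML's method. The helper names and the data are hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_word_scores(texts, targets):
    """Map each word to the mean target of the training texts containing it."""
    per_word = defaultdict(list)
    for text, target in zip(texts, targets):
        for word in set(tokenize(text)):
            per_word[word].append(target)
    word_scores = {w: mean(vals) for w, vals in per_word.items()}
    return word_scores, mean(targets)


def predict_score(model, text):
    """Average the scores of known words; fall back to the global mean."""
    word_scores, global_mean = model
    known = [word_scores[w] for w in tokenize(text) if w in word_scores]
    return mean(known) if known else global_mean


# Made-up reviews with ratings on a 1 to 5 scale.
reviews = ["great film", "great acting but slow plot", "slow and boring"]
ratings = [5.0, 3.0, 1.0]
model = fit_word_scores(reviews, ratings)
```

With this data, `predict_score(model, "great plot")` averages the scores of "great" (4.0) and "plot" (3.0), and a text with no known words falls back to the global mean rating.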