Task
The EvoML platform detects the machine learning task from the selected target column. It supports the following tasks:
- Classification
- Regression
- Timeseries Classification
- Timeseries Regression
- NLP Classification
- NLP Regression
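As a rough illustration of how a task could be inferred from a target column, the sketch below applies a simple heuristic: non-numeric targets, or integer targets with few distinct values, are treated as classification, and everything else as regression. This is not EvoML's actual detection logic; the function name `infer_task` and the `max_classes` threshold are hypothetical. It only separates classification from regression, since telling the time-series and NLP variants apart would presumably also depend on the feature columns.

```python
from numbers import Real


def infer_task(target_values, max_classes=20):
    """Guess a task type from target values (illustrative heuristic only).

    max_classes is a hypothetical cut-off: integer targets with at most
    this many distinct values are treated as class labels.
    """
    values = [v for v in target_values if v is not None]
    if not values:
        raise ValueError("target column has no values")

    # bool is a subclass of int, so exclude it explicitly: booleans are labels.
    numeric = all(
        isinstance(v, Real) and not isinstance(v, bool) for v in values
    )
    if not numeric:
        return "classification"

    distinct = set(values)
    all_integral = all(float(v).is_integer() for v in values)
    if all_integral and len(distinct) <= max_classes:
        return "classification"
    return "regression"


print(infer_task(["spam", "not spam", "spam"]))
print(infer_task([0, 1, 2, 1, 0]))
print(infer_task([250000.5, 310250.0, 199999.9]))
```

A heuristic like this can misfire (for example, a count-valued target with few distinct values), so the detected task is worth checking before training.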
1. Classification
Use Cases:
- Customer segmentation: Classifying customers into different groups based on behavior, preferences, or demographics.
- Fraud detection: Identifying fraudulent transactions in a financial dataset.
Features:
- Multi-class support: The target can have two or more classes (e.g., spam vs. not spam, or several customer segments).
- Imbalanced learning: Handles datasets in which some classes are far more frequent than others.
Workflow:
- Input: Dataset with labeled categories (e.g., 0, 1, 2 for classes).
- Target: A categorical column that represents class labels.
- Model output: The model will predict the class label for each sample.
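One common way to deal with class imbalance is to weight each class inversely to its frequency, so that errors on rare classes count for more during training. The sketch below computes such weights with the formula `n_samples / (n_classes * class_count)`. Whether EvoML uses this particular technique is an assumption; the function name `balanced_class_weights` and the label data are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count) for cls, count in counts.items()
    }


# Hypothetical target column: 8 samples of class 0, 2 samples of class 1.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)  # the rare class 1 receives the larger weight
```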
2. Regression
Use Cases:
- Price prediction: Predicting house prices, stock prices, etc.
- Demand forecasting: Estimating future demand for a product or service based on historical data.
Features:
- Linear/Non-linear optimization: Can handle both linear and complex non-linear relationships between features and the target variable.
Workflow:
- Input: Dataset of features describing each sample (e.g., house features).
- Target: A continuous numerical value (e.g., house price, demand).
- Model output: The model predicts a continuous numerical value.
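To make the input, target, and output of a regression task concrete, here is a minimal sketch that fits a straight line by ordinary least squares and predicts a continuous value. It covers only the simplest linear case and is not how EvoML builds its models; the function name `fit_line` and the area and price figures are invented for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: floor area as the input feature, price as the target.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

slope, intercept = fit_line(areas, prices)
predicted = slope * 90.0 + intercept  # prediction for an unseen area
print(slope, intercept, predicted)
```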
3. Timeseries Classification
Use Cases:
- Anomaly detection: Identifying unusual patterns or outliers in time-based data.
- Pattern recognition: Classifying different patterns in time series data (e.g., sensor readings or financial data).
Features:
- Sequence classification: Recognizing patterns in sequences, such as time-dependent events.
- Temporal feature extraction: Extracting features related to time, such as trends, seasonality, etc.
Workflow:
- Input: Sequence data (e.g., a time-series dataset with sequential observations).
- Target: A categorical label (e.g., anomaly vs normal).
- Model output: Classifies the sequence into predefined categories.
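Temporal feature extraction can be pictured as summarising each window of a sequence with a few numbers that a classifier can then use. The sketch below computes the mean, range, and least-squares trend (slope) for non-overlapping windows. The choice of features, the function name `window_features`, and the readings are assumptions for illustration, not EvoML's feature set.

```python
def window_features(series, window):
    """Summarise each non-overlapping window by mean, range, and slope."""
    if window < 2:
        raise ValueError("window must contain at least two points")
    # Position statistics are the same for every window, so compute them once.
    t_mean = (window - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(window))

    features = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        mean = sum(chunk) / window
        # Least-squares slope of value against position inside the window.
        slope = sum(
            (t - t_mean) * (v - mean) for t, v in enumerate(chunk)
        ) / denom
        features.append(
            {"mean": mean, "range": max(chunk) - min(chunk), "slope": slope}
        )
    return features


# Made-up sensor readings: a rising segment followed by a flat one.
readings = [1.0, 2.0, 3.0, 4.0, 10.0, 10.0, 10.0, 10.0]
feats = window_features(readings, window=4)
print(feats)
```

Each feature dictionary would then be paired with the categorical label of its window (e.g., anomaly vs normal) to train a classifier.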
4. Timeseries Regression
Use Cases:
- Stock prediction: Forecasting future stock prices based on past trends.
- Demand forecasting: Predicting future demand based on historical data.
Features:
- Advanced forecasting: Uses techniques such as XGBoost to forecast future values.
Workflow:
- Input: Sequence data (e.g., historical stock prices or product demand).
- Target: A continuous numerical value (e.g., stock price at time t+1).
- Model output: Predicts a future continuous value (e.g., price or demand for the next time step).
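A common way to turn a sequence into a regression problem with a target at time t+1 is to use the previous few observations as features. The sketch below builds such a lagged table; a tabular learner such as a gradient-boosted tree model could then be trained on it. The function name `make_lagged_dataset` and the sample series are hypothetical, and this is not necessarily how EvoML prepares time-series data.

```python
def make_lagged_dataset(series, n_lags):
    """Build (features, target) pairs where the target is the next value."""
    if n_lags < 1 or len(series) <= n_lags:
        raise ValueError("series must be longer than n_lags, and n_lags >= 1")
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        rows.append(list(series[t - n_lags:t]))  # the n_lags previous values
        targets.append(series[t])                # the value to predict
    return rows, targets


# Made-up price history.
history = [10, 11, 12, 13, 14]
X, y = make_lagged_dataset(history, n_lags=2)
for features, target in zip(X, y):
    print(features, "->", target)
```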
5. NLP Classification
Use Cases:
- Sentiment Analysis: Classifying text (such as customer reviews or social media posts) into categories like positive, neutral, or negative sentiment.
- Spam Detection: Classifying emails or messages as either spam or not spam based on the content.
Features:
- Sequence Classification: EvoML can classify text sequences (e.g., sentences or paragraphs) into multiple categories.
- Multi-class Support: You can classify text into more than two categories (e.g., detecting various types of customer complaints or topics).
Workflow:
- Input: Text data (e.g., sentences, documents, tweets, reviews).
- Target: A categorical column representing class labels (e.g., spam or not spam for email classification).
- Model Output: The model predicts the class label for each text sample (e.g., spam or not spam for an email, positive or negative for a tweet).
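The text-in, label-out workflow above can be sketched with a tiny multinomial naive Bayes classifier over word counts, using add-one smoothing. This is a stand-in technique chosen because it fits in a few lines; it is not a description of the models EvoML trains, and the class name, tokenizer, and messages are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training messages.
texts = [
    "win a free prize now",
    "free money claim now",
    "meeting moved to monday",
    "lunch with the team monday",
]
labels = ["spam", "spam", "not spam", "not spam"]

clf = NaiveBayesTextClassifier().fit(texts, labels)
print(clf.predict("claim your free prize"))
print(clf.predict("team meeting monday"))
```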
6. NLP Regression
Use Cases:
- Sentiment Scoring: Predicting a sentiment score (e.g., a rating from 1 to 10) for a piece of text, such as a movie or product review.
- Text-based Predictive Models: Using text to predict continuous values like sales volume or customer engagement metrics based on user feedback.
Features:
- Sequence Regression: Models can be trained to predict continuous numerical values based on sequences of text (e.g., predicting the likelihood of a product's price change based on customer feedback).
Workflow:
- Input: Text data (e.g., sentences, reviews, articles).
- Target: A continuous numerical value (e.g., sentiment score, readability score, product sales).
- Model Output: The model predicts a continuous value (e.g., a sentiment score of 4.5 out of 5, or expected sales estimated from reviews).
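To show what a text-to-number mapping can look like, the toy sketch below learns the average rating of the reviews each word appears in, then scores new text as the mean of its known words' averages. It is a deliberately simple baseline, not a model EvoML is known to use; the function names `fit_word_scores` and `predict_score` and the reviews are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def fit_word_scores(texts, scores):
    """Learn, for each word, the mean score of the texts containing it."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for text, score in zip(texts, scores):
        for tok in set(tokenize(text)):  # count each word once per text
            totals[tok] += score
            counts[tok] += 1
    word_means = {tok: totals[tok] / counts[tok] for tok in totals}
    global_mean = sum(scores) / len(scores)
    return word_means, global_mean


def predict_score(text, word_means, global_mean):
    """Average the learned scores of known words; fall back to the mean."""
    known = [word_means[t] for t in tokenize(text) if t in word_means]
    return sum(known) / len(known) if known else global_mean


# Made-up reviews with ratings out of 5.
reviews = ["great film", "terrible film", "great acting"]
ratings = [5.0, 1.0, 4.0]

word_means, global_mean = fit_word_scores(reviews, ratings)
score = predict_score("great acting", word_means, global_mean)
print(score)
```

Text made up entirely of unseen words falls back to the overall mean rating, which keeps the prediction defined for any input.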