EvoML Core Features
EvoML provides a comprehensive suite of tools for end-to-end machine learning workflows. Below is a breakdown of its core capabilities:
1. Data Upload and Processing
Data Sources:
- Supports structured formats (CSV, Excel, JSON).
- Handles unstructured data formats (text, images).
- Connects to databases (SQL, NoSQL).
- Integrates with cloud storage (S3, Azure Blob).
- Supports FTP connections.
Automated Data Preparation:
- Automated data cleaning and preprocessing.
- Missing value detection and handling.
- Outlier detection.
- Data type inference and conversion.
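As an illustration of what missing value handling and outlier detection involve, the sketch below fills gaps with the median and flags outliers with the common 1.5 x IQR rule. It is plain Python and not EvoML's implementation; the function names and sample data are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical column with two missing entries and one extreme value.
raw = [10.0, 12.0, None, 11.0, 13.0, 95.0, 12.0, None, 11.0]
filled = impute_median(raw)
outlier_idx = iqr_outliers(filled)
print(filled)
print(outlier_idx)
```

Median imputation is used here because it is less sensitive to extreme values than the mean; other strategies may suit other data.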
Feature Analysis:
- Statistical insights (mean, median, variance).
- Pattern detection and analysis.
- Visualization tools (histograms, density plots, box plots, correlation matrices).
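For reference, a correlation matrix holds the pairwise Pearson correlation of every column with every other column. The sketch below computes one in plain Python; the column names and values are made up for illustration and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(columns):
    """Map each pair of column names to its Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical dataset: price rises with size, age falls with size.
columns = {
    "size": [1, 2, 3, 4, 5],
    "price": [2, 4, 6, 8, 10],
    "age": [5, 4, 3, 2, 1],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```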
2. Model Development
Hyperparameter Tuning:
- Tree-structured Parzen Estimator (TPE).
- Non-dominated Sorting Genetic Algorithm II (NSGA-II).
- Random search.
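Of the three strategies, random search is the simplest to show: sample each hyperparameter independently, evaluate, and keep the best trial. The sketch below assumes a hypothetical `validation_loss` function standing in for model training and an invented two-parameter search space; it does not reflect EvoML's API. TPE and NSGA-II differ mainly in how the next candidate is proposed (from a density model of past trials and from an evolving population, respectively).

```python
import random


def validation_loss(params):
    """Hypothetical stand-in for training a model and scoring it."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["max_depth"] - 6) ** 2 / 100


def random_search(objective, n_trials, seed=0):
    """Sample hyperparameters at random and return the best trial."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        params = {
            # Log-uniform sample in [0.001, 1].
            "learning_rate": 10 ** rng.uniform(-3, 0),
            # Uniform integer sample in [2, 12].
            "max_depth": rng.randint(2, 12),
        }
        trials.append((objective(params), params))
    best_loss, best_params = min(trials, key=lambda trial: trial[0])
    return best_loss, best_params, trials


best_loss, best_params, trials = random_search(validation_loss, n_trials=50)
print(best_loss, best_params)
```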
Model Generation:
- Manual setting of model parameters.
- Automated or manual model selection.
- Cross-validation strategies.
- Automated reporting.
3. Feature Engineering
Imputation, Encoding, and Scaling:
- Multiple imputation strategies.
- Categorical encoding methods.
- Numeric scaling techniques.
- Standardization and normalization.
- Text embedding generation.
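To make the encoding and scaling steps concrete, the sketch below shows one-hot encoding, standardization (zero mean, unit variance), and min-max normalization in plain Python. The helper names and sample values are hypothetical and the code is not taken from EvoML.

```python
import statistics


def one_hot(values):
    """One-hot encode a categorical column; categories are sorted for stability."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values):
    """Rescale linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


colors = ["red", "green", "red", "blue"]
encoded, categories = one_hot(colors)

heights = [150.0, 160.0, 170.0, 180.0, 190.0]
z_scores = standardize(heights)
scaled = min_max(heights)
print(categories, encoded)
print(z_scores)
print(scaled)
```

Both scalers assume the column is not constant; a constant column would need a guard against division by zero.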
Feature Generation:
- Automated feature creation.
- Interaction term generation.
- Polynomial feature expansion.
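Interaction terms and polynomial expansion can be illustrated together: a degree-2 expansion of features x1 and x2 adds the squares x1*x1 and x2*x2 plus the interaction x1*x2. The sketch below is a generic plain-Python version with a hypothetical function name, not EvoML's feature generator.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(row, names, degree=2):
    """Return all monomials of the inputs up to the given degree (no bias term)."""
    features = {}
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            label = "*".join(names[i] for i in combo)
            features[label] = prod(row[i] for i in combo)
    return features


features = polynomial_features([2.0, 3.0], ["x1", "x2"])
print(features)
```

The number of generated features grows quickly with the degree and the number of inputs, which is one reason feature generation is usually paired with feature selection.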
Feature Selection:
- Minimum redundancy maximum relevance (mRMR).
- Feature importance ranking.
- Correlation-based selection.
- Dimensionality reduction techniques.
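The idea behind minimum redundancy maximum relevance can be sketched as a greedy loop: at each step, pick the feature that is most related to the target and least related to the features already chosen. The version below is a simplification that uses absolute Pearson correlation for both terms (mutual information is the more usual choice) and the difference form of the score. The function names and data are hypothetical and this is not EvoML's implementation.

```python
import math


def abs_corr(x, y):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return abs(cov / (norm_x * norm_y))


def mrmr_select(features, target, k):
    """Greedily pick k features by relevance minus mean redundancy."""
    relevance = {name: abs_corr(col, target) for name, col in features.items()}
    selected = []
    remaining = list(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            abs_corr(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: "a_noisy" is nearly a copy of "a"; "b" carries different signal.
target = [1, 2, 3, 4, 5, 6]
features = {
    "a": [1, 2, 3, 4, 5, 7],
    "a_noisy": [1, 2, 3, 4, 6, 6],
    "b": [2, 1, 3, 2, 4, 3],
}
selected = mrmr_select(features, target, k=2)
print(selected)
```

In this example "a_noisy" is individually more relevant than "b", but it is skipped in the second round because it adds little beyond "a".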
4. Time Series Handling
Model Validation:
- Time-based cross-validation.
- Sliding window.
- Expanding window.
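The difference between the two window schemes is easiest to see in the index ranges they produce. In the sketch below, a sliding window keeps the training set at a fixed length and moves it forward, while an expanding window always starts at the first observation and grows. The function and parameter names are hypothetical; in both cases every training index precedes every test index, which is what makes the validation time-based.

```python
def time_series_splits(n_samples, train_size, test_size, expanding=False):
    """Return (train_indices, test_indices) pairs in chronological order."""
    splits = []
    end = train_size  # first index after the training window
    while end + test_size <= n_samples:
        train_start = 0 if expanding else end - train_size
        train = list(range(train_start, end))
        test = list(range(end, end + test_size))
        splits.append((train, test))
        end += test_size
    return splits


sliding = time_series_splits(10, train_size=4, test_size=2)
expanding = time_series_splits(10, train_size=4, test_size=2, expanding=True)

for train, test in sliding:
    print("sliding  ", train, test)
for train, test in expanding:
    print("expanding", train, test)
```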
Feature Engineering:
- Time-based feature generation.
- Lag generation.
- Rolling windows.
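Lag and rolling-window features can be sketched in a few lines. A lag feature shifts the series so that each row sees an earlier value, and a rolling feature summarizes the most recent values. The helper names and the sample series below are hypothetical; positions without enough history are filled with None.

```python
def lag(series, k):
    """Shift the series forward by k steps, padding the start with None."""
    return [None] * k + series[: len(series) - k]


def rolling_mean(series, window):
    """Mean of the last `window` values, ending at the current position."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = series[i - window + 1 : i + 1]
            out.append(sum(chunk) / window)
    return out


sales = [10, 12, 14, 16, 18]
lag_1 = lag(sales, 1)
roll_3 = rolling_mean(sales, 3)
# Shift the rolling mean by one step so that the feature for time t
# only uses values up to t - 1 and does not leak the value being predicted.
roll_3_past = lag(roll_3, 1)
print(lag_1)
print(roll_3)
print(roll_3_past)
```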
5. Multi-objective Optimization
- User-defined objective function.
- Up to three simultaneous optimization criteria.
- Pareto optimization.
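Pareto optimization keeps every candidate that is not dominated, meaning no other candidate is at least as good on all objectives and strictly better on at least one. The sketch below filters a set of hypothetical models scored on three objectives that are all minimized (error, latency, size); the names and numbers are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in candidates.items()
            if other != name
        )
    }


# (error, latency in ms, size in MB); lower is better for all three.
candidates = {
    "model_a": (0.10, 120.0, 5.0),
    "model_b": (0.12, 40.0, 2.0),
    "model_c": (0.15, 130.0, 6.0),  # worse than model_a on every objective
    "model_d": (0.08, 300.0, 20.0),
}
front = pareto_front(candidates)
print(sorted(front))
```

The front contains trade-offs rather than a single winner; choosing among them still depends on which objective matters most for the use case.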
6. Evaluation
Model Evaluation:
- Classification and regression metrics.
- Custom evaluation criteria.
- Visualizations.
Explainability:
- Feature importance analysis.
- SHAP values.
- Partial dependence plots.
- Model-agnostic explanations.
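One widely used model-agnostic technique is permutation importance: shuffle one feature at a time and measure how much the model's error increases. The sketch below is a generic plain-Python version and does not claim to be how EvoML computes feature importance; the `model` function is a hypothetical fitted model that only uses the first feature.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between predictions and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Mean increase in error when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1 :] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def model(row):
    """Hypothetical fitted model that depends only on feature 0."""
    return 3.0 * row[0]


rows = [[float(i), float((i * 7) % 5)] for i in range(20)]
targets = [3.0 * r[0] for r in rows]
importances = permutation_importance(model, rows, targets)
print(importances)
```

Because the model ignores the second feature, shuffling it leaves the error unchanged and its importance is zero, while shuffling the first feature raises the error.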
7. EvoML Client
Code Interface:
- Python client library.
- RESTful API integration.
- Batch processing support.
- Real-time prediction capabilities.
- Model retraining.
- Extensible code interface.
8. Deployment
Pipeline Generation:
- Production-ready code generation.
- Environment management.
- Dependency handling.
Model Deployment:
- Automated API generation.
- Docker containerization.
- Model monitoring capabilities.
- Integration with MLOps tools.