Sourcify
Effortlessly find and manage open-source dependencies for your projects.

Transparent Automated Machine Learning for Python with rich documentation and model explanation.

MLJAR is an Automated Machine Learning (AutoML) framework for tabular data, built for the Python ecosystem. Unlike 'black-box' AutoML solutions, MLJAR's architecture emphasizes transparency and explainability (XAI). It trains a range of machine learning algorithms (XGBoost, LightGBM, CatBoost, Random Forest, etc.), tunes their hyperparameters, and builds ensembles from the results. Its distinguishing feature is 'Automatic Documentation': for every model trained it generates Markdown and HTML reports that include learning curves, performance metrics, and feature importance. The framework offers four modes ('Explain', 'Perform', 'Compete', and 'Optuna') that let engineers trade training speed against predictive accuracy. MLJAR covers both rapid prototyping and production-grade modeling with out-of-the-box support for cross-validation, feature engineering (Golden Features), and SHAP-based model explanations, which makes it a good fit for industries that require audit-ready AI, such as finance and healthcare.
Automatically searches for high-impact feature transformations using algebraic operations on existing columns.
Generates a complete documentation suite for every model in Markdown format, including validation metrics.
Enables deep hyperparameter tuning, ensembling, and stacking targeted at high-performance leaderboard scenarios.
Optimized for speed and interpretability, providing immediate insights into data through simple models and SHAP values.
Preprocesses data before training: handles missing values, removes constant columns, and encodes categorical features.
Combines the predictions of multiple base models using a meta-model to minimize variance and bias.
Directly integrates Shapley Additive Explanations for local and global model interpretability.
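The feature-transformation search mentioned above (algebraic operations on existing columns) can be illustrated with a small standard-library sketch. This is not MLJAR's implementation: the function names are hypothetical, only differences and ratios of column pairs are tried, and candidates are ranked by absolute Pearson correlation with the target, which is a simplifying assumption made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; callers must avoid constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def candidate_features(columns):
    """Yield (name, values) for the difference and ratio of each column pair."""
    for a, b in combinations(sorted(columns), 2):
        xa, xb = columns[a], columns[b]
        yield f"{a}_minus_{b}", [p - q for p, q in zip(xa, xb)]
        if all(q != 0 for q in xb):  # skip ratios that would divide by zero
            yield f"{a}_div_{b}", [p / q for p, q in zip(xa, xb)]


def top_candidates(columns, target, k=1):
    """Rank candidates by |correlation with target| (assumed scoring rule)."""
    scored = []
    for name, values in candidate_features(columns):
        if len(set(values)) > 1:  # correlation is undefined for constants
            scored.append((abs(pearson(values, target)), name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy data in which the target is exactly height / width.
columns = {"width": [1.0, 2.0, 3.0, 4.0], "height": [2.0, 1.0, 4.0, 2.0]}
target = [2.0, 0.5, 4.0 / 3.0, 0.5]
best = top_candidates(columns, target)
```

On this toy data the ratio feature ranks first because it reproduces the target exactly; a real search would score candidates on held-out data rather than on the training rows.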
Verify Python 3.8+ environment is active.
Install the package via pip: pip install mljar-supervised.
Import the AutoML class from supervised.automl.
Prepare training data as a Pandas DataFrame (X) and target array (y).
Initialize AutoML with a specific mode (e.g., mode='Compete' for high accuracy).
Configure results_path to specify where reports and models will be saved.
Execute the .fit(X, y) method to begin the automated search process.
Navigate to the results_path directory to view the auto-generated README.md and HTML reports.
Evaluate model performance using SHAP plots and feature importance charts included in the reports.
Deploy the model using the .predict() method or export to ONNX for production environments.
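The steps above can be sketched roughly as follows. The helper names automl_settings and train_and_predict are hypothetical and exist only for this example; the sketch assumes mljar-supervised is installed and that X_train, y_train, and X_new are supplied by the caller. The import is kept inside the function so the settings helper can be used without the package.

```python
ALLOWED_MODES = {"Explain", "Perform", "Compete", "Optuna"}


def automl_settings(mode="Explain", results_path="AutoML_demo"):
    """Build constructor arguments; 'AutoML_demo' is an arbitrary example path."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return {"mode": mode, "results_path": results_path}


def train_and_predict(X_train, y_train, X_new, **overrides):
    """Fit AutoML on a DataFrame and target, then predict on new rows."""
    # Requires: pip install mljar-supervised
    from supervised.automl import AutoML

    automl = AutoML(**automl_settings(**overrides))
    automl.fit(X_train, y_train)  # reports are written under results_path
    return automl.predict(X_new)
```

A call such as train_and_predict(X, y, X_test, mode="Compete") would favor accuracy over speed; the default 'Explain' mode is the quicker choice for a first look at a dataset.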
Verified feedback from other users.
"Highly praised for its transparency and the quality of its automated reports. Users find it significantly more usable than H2O or TPOT for quick insights."