
The open-source framework for full-lifecycle ML observability and LLM evaluation.

Evidently AI is an open-source framework that data scientists and ML engineers use to evaluate, test, and monitor machine learning models from development to production. It provides tooling for detecting data drift, prediction drift, and degradation in regression and classification performance, and it is widely used for LLM observability. Its architecture centers on 'Reports' and 'Test Suites', which support both interactive visual analysis and automated pipeline integration. The platform has expanded beyond traditional tabular data to unstructured text and embeddings, which makes it useful for RAG (Retrieval-Augmented Generation) evaluation and hallucination detection. Evidently is available as an open-source Python library for local execution and as a managed Cloud platform for team collaboration, centralized dashboarding, and persistent monitoring. Its modular design integrates with existing data stacks such as Airflow, MLflow, and Grafana, so observability can be added without restructuring the surrounding pipeline.
Uses statistical methods like MMD (Maximum Mean Discrepancy) to detect shifts in vector space embeddings.
Specific metrics for measuring context precision, recall, and faithfulness in Retrieval-Augmented Generation.
A centralized platform to visualize performance over time across multiple environments.
Extensible Python class structure allowing users to define domain-specific business logic as monitorable metrics.
Stores model and data profiles as compressed JSON snapshots rather than raw data.
Configurable triggers based on statistical test failures rather than simple threshold crossing.
Automated checks for missing values, duplicates, and feature range violations during ingestion.
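To make the embedding drift feature more concrete, here is a minimal, library-free sketch of an MMD (Maximum Mean Discrepancy) check between two sets of embedding vectors. It is not Evidently's implementation. The RBF kernel, the `gamma` value, the permutation test, and all function names (`mmd_squared`, `permutation_p_value`, `make_embeddings`) are assumptions chosen for illustration, and the embeddings are synthetic Gaussian vectors.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=0.5):
    """Biased estimate of squared MMD between two samples of vectors."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, gamma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def permutation_p_value(xs, ys, gamma=0.5, n_permutations=100, seed=0):
    """Share of random re-splits of the pooled data whose MMD is at least the observed one."""
    rng = random.Random(seed)
    observed = mmd_squared(xs, ys, gamma)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if mmd_squared(pooled[:n], pooled[n:], gamma) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_permutations + 1)


def make_embeddings(n, dim, mean, seed):
    """Synthetic stand-in for real embeddings (hypothetical data)."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


reference = make_embeddings(30, 4, mean=0.0, seed=1)
same_dist = make_embeddings(30, 4, mean=0.0, seed=2)
shifted = make_embeddings(30, 4, mean=1.5, seed=3)

p_same = permutation_p_value(reference, same_dist)
p_shifted = permutation_p_value(reference, shifted)
print(f"same distribution: p = {p_same:.3f}")
print(f"shifted embeddings: p = {p_shifted:.3f}")
```

A small p-value for the shifted sample indicates that the two embedding sets are unlikely to come from the same distribution. In practice the kernel bandwidth and the number of permutations would need tuning for the embedding dimension and sample size.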
Install via pip: 'pip install evidently'
Import the library and required components (Reports, Metrics, Tests).
Load your reference (training) and current (production) datasets as Pandas DataFrames.
Configure a Report or Test Suite by selecting specific metrics (e.g., DataDriftPreset).
Execute the report: 'report.run(reference_data=ref, current_data=curr)'.
Visualize results locally using '.show()' in a Jupyter notebook.
Export metrics to JSON or HTML for integration into automated pipelines.
Connect to Evidently Cloud by creating a Workspace and Project.
Push data snapshots to the cloud for persistent monitoring: 'workspace.add_snapshot(project_id, report)'.
Configure alerts and thresholds within the Cloud UI for real-time monitoring.
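The steps above depend on the Evidently library itself. As a rough, standard-library-only illustration of what a drift check between a reference and a current dataset computes, the sketch below runs a two-sample Kolmogorov-Smirnov test per column and exports the result as JSON. It does not use or reproduce Evidently's API: `ks_statistic`, `ks_p_value`, `drift_report`, the column names, and the `alpha` cutoff are all hypothetical, the data is synthetic, and the p-value uses the standard asymptotic approximation.

```python
import json
import math
import random


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref_sorted, cur_sorted = sorted(reference), sorted(current)
    n, m = len(ref_sorted), len(cur_sorted)
    i = j = 0
    max_gap = 0.0
    while i < n and j < m:
        value = min(ref_sorted[i], cur_sorted[j])
        while i < n and ref_sorted[i] <= value:
            i += 1
        while j < m and cur_sorted[j] <= value:
            j += 1
        max_gap = max(max_gap, abs(i / n - j / m))
    return max_gap


def ks_p_value(statistic, n, m):
    """Asymptotic p-value from the Kolmogorov distribution (an approximation)."""
    root = math.sqrt(n * m / (n + m))
    lam = (root + 0.12 + 0.11 / root) * statistic
    if lam < 0.3:
        # The series converges slowly here and the p-value is effectively 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def drift_report(reference_rows, current_rows, columns, alpha=0.05):
    """Per-column drift summary as a plain dict (hypothetical structure)."""
    results = {}
    for col in columns:
        ref = [row[col] for row in reference_rows]
        cur = [row[col] for row in current_rows]
        stat = ks_statistic(ref, cur)
        p_value = ks_p_value(stat, len(ref), len(cur))
        results[col] = {
            "ks_statistic": round(stat, 4),
            "p_value": round(p_value, 4),
            "drift_detected": p_value < alpha,
        }
    drifted = sum(1 for r in results.values() if r["drift_detected"])
    return {"columns": results, "share_drifted": drifted / len(columns)}


# Synthetic data: "income" is shifted in the current batch, "age" is not.
rng = random.Random(42)
reference_rows = [{"age": rng.gauss(40, 10), "income": rng.gauss(50, 5)}
                  for _ in range(300)]
current_rows = [{"age": rng.gauss(40, 10), "income": rng.gauss(58, 5)}
                for _ in range(300)]

report = drift_report(reference_rows, current_rows, ["age", "income"])
report_json = json.dumps(report, indent=2)  # export for a pipeline step
print(report_json)
```

Flagging drift on a test's p-value rather than on a raw difference in means is one way to read the "statistical test failures rather than simple threshold crossing" idea from the feature list. The JSON output stands in for the export step in the workflow.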
Verified feedback from other users.
"Users praise the library for its ease of integration and high-quality visualizations, though some find the cloud pricing jump significant."
