

The DataHub metadata platform gives context for AI to safely manage and use data.

DataHub is an open-source metadata platform designed for modern data ecosystems. It provides a centralized system for managing and discovering metadata across diverse data sources, including datasets, dashboards, ML models, and pipelines. DataHub enables data discovery, governance, and quality control through features such as search, lineage tracking, PII tagging, and data contracts. Its architecture supports both UI-based and programmatic ingestion via APIs and SDKs, catering to different user preferences. DataHub Cloud is an enterprise-ready SaaS offering built on DataHub Core, providing additional features for scaling data management, accelerating AI initiatives, and implementing data governance practices. Key use cases include empowering self-serve workflows, improving data quality, and simplifying data discovery for AI platforms.
DataHub is listed under four specializations: data governance, metadata management, data quality monitoring, and data lineage tracking.
Automatically captures and visualizes data lineage across various data systems, including databases, ETL pipelines, and BI tools.
Allows users to define and run data quality checks and assertions on datasets, ensuring data accuracy and reliability.
Automatically detects and tags personally identifiable information (PII) within datasets, ensuring compliance with data privacy regulations.
Enables the creation and enforcement of data contracts between data producers and consumers, ensuring data consistency and reliability.
Simplifies metadata ingestion from various data sources through an intuitive user interface, reducing the need for complex configurations.
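To make the lineage feature described above more concrete, the sketch below models lineage as a plain directed graph and walks it to find everything a dashboard depends on. It is only an illustration of the concept, not DataHub's implementation or API: the asset names, the `LINEAGE` mapping, and the `all_upstreams` helper are all hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to its direct upstream sources.
LINEAGE = {
    "dashboard.sales_overview": ["warehouse.fct_orders"],
    "warehouse.fct_orders": ["staging.orders", "staging.customers"],
    "staging.orders": ["raw.orders"],
    "staging.customers": ["raw.customers"],
}


def all_upstreams(asset, lineage):
    """Return every direct and transitive upstream of `asset`, breadth-first."""
    seen = []
    queue = deque(lineage.get(asset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            # Already visited via another path; skip to avoid duplicates and cycles.
            continue
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen


if __name__ == "__main__":
    for upstream in all_upstreams("dashboard.sales_overview", LINEAGE):
        print(upstream)
```

A traversal like this is what makes impact analysis possible: starting from any asset, the same walk over reversed edges would list the downstream dashboards affected by a change.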
Install DataHub CLI: python3 -m pip install --upgrade acryl-datahub (then run datahub version to confirm the install)
Start DataHub with Docker: datahub docker quickstart
Configure metadata ingestion sources (e.g., databases, data warehouses)
Use the UI-based ingestion to set up integrations in minutes
Define ownership and PII tags for data assets for governance
Implement data quality tests and assertions
Explore data lineage and dependencies using the DataHub UI
Leverage APIs and SDKs for programmatic control and automation
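As a rough illustration of the ingestion step in the list above, the sketch below assembles an ingestion recipe as a Python dictionary using only the standard library. The `source` / `sink` layout and the `datahub-rest` sink type follow the recipe format DataHub uses; the `build_recipe` helper, the MySQL connection values, and the `http://localhost:8080` server address (the usual default for a local quickstart) are assumptions made for the example.

```python
import json


def build_recipe(source_type, source_config, server="http://localhost:8080"):
    """Assemble an ingestion recipe: where metadata comes from and where it goes.

    `build_recipe` is a hypothetical helper, not part of the DataHub SDK.
    """
    return {
        "source": {"type": source_type, "config": source_config},
        # The REST sink sends metadata to a running DataHub server.
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


# Placeholder connection details; replace with values for a real database.
recipe = build_recipe(
    "mysql",
    {"host_port": "localhost:3306", "database": "shop"},
)

if __name__ == "__main__":
    print(json.dumps(recipe, indent=2))
```

Written out as YAML, the same structure is what `datahub ingest -c <recipe file>` reads. Keeping the recipe as data makes it easy to generate one per source system from a script.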
Verified feedback from other users.
"DataHub is praised for its comprehensive metadata management capabilities and ease of use, though some users mention a learning curve."
