

The open-source Python framework for reproducible, maintainable, and modular data science code.

Kedro is an open-source Python framework designed to help data scientists and engineers create production-ready data pipelines. Originally developed by McKinsey's QuantumBlack and now part of the LF AI & Data Foundation, Kedro addresses the 'notebook-to-production' gap by enforcing software engineering best practices, such as modularity, separation of concerns, and versioning, within the data science workflow. Its architecture centers on a Data Catalog, which abstracts data access, and Pipelines composed of Nodes (pure Python functions). This decoupling allows teams to swap data sources or execution environments without rewriting core logic. Kedro is well suited to enterprise-grade data workflows where governance, auditability, and team collaboration are paramount. It integrates with modern stack components such as MLflow, Great Expectations, and Airflow, and its standardized project structure (based on Cookiecutter) enables rapid onboarding and scale-out across distributed computing environments such as Apache Spark or Dask.
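The decoupling described above can be sketched in plain Python. This is a toy illustration, not the Kedro API: nodes are pure functions, a "catalog" maps dataset names to storage (here just a dict), and a "pipeline" wires node outputs to node inputs by name.

```python
def clean(raw_rows):
    # A node: a pure function from inputs to outputs.
    return [r.strip().lower() for r in raw_rows]

def count(clean_rows):
    return len(clean_rows)

# The catalog abstracts where data lives; swapping it for S3 or a
# database would not change the node functions at all.
catalog = {"raw_rows": ["  Alpha", "Beta  "]}

# A pipeline here is an ordered list of (function, inputs, output) triples.
pipeline = [
    (clean, ["raw_rows"], "clean_rows"),
    (count, ["clean_rows"], "row_count"),
]

def run(pipeline, catalog):
    # Resolve each node's inputs from the catalog and store its output
    # back, so nodes never touch storage details directly.
    for func, inputs, output in pipeline:
        catalog[output] = func(*(catalog[name] for name in inputs))
    return catalog

result = run(pipeline, catalog)
print(result["clean_rows"], result["row_count"])  # ['alpha', 'beta'] 2
```

Real Kedro additionally resolves execution order from the dependency graph rather than relying on list order, which is what makes pipelines composable.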
Key features:
- Data Catalog: separates code logic from data storage details using a YAML-based registry for datasets.
- Kedro-Viz: an interactive tool for visualizing the data pipeline, showing the relationships between nodes and datasets.
- Transcoding: the ability to save and load the same dataset in multiple formats, or using different libraries, automatically.
- Hooks: an extensible system allowing users to inject custom behavior into the Kedro lifecycle (e.g., after a node runs).
- Modular pipelines: namespace-isolated pipelines that can be reused across different projects or within different parts of the same project.
- Micro-packaging: the capability to package specific pipelines as standalone Python packages for distribution.
- Deployment plugins: templates for deploying Kedro pipelines to Airflow, Kubeflow, or AWS Batch.
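The Hooks idea in the feature list can be approximated in plain Python. This is a toy sketch only; Kedro's real hook system is built on pluggy and `@hook_impl`, and all names below are illustrative.

```python
class HookManager:
    """Toy lifecycle-hook registry (illustrative, not Kedro's API)."""

    def __init__(self):
        self._after_node_run = []

    def register_after_node_run(self, callback):
        # Callbacks receive the node name and its output.
        self._after_node_run.append(callback)

    def run_node(self, name, func, *args):
        result = func(*args)
        # Fire every registered callback once the node completes,
        # e.g. for logging, metrics, or data validation.
        for cb in self._after_node_run:
            cb(name, result)
        return result

hooks = HookManager()
log = []
hooks.register_after_node_run(lambda name, out: log.append((name, out)))

doubled = hooks.run_node("double", lambda x: x * 2, 21)
print(doubled, log)  # 42 [('double', 42)]
```

The point of hooks is that cross-cutting concerns (auditing, validation) stay out of the node functions themselves.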
Getting started:
1. Install Kedro via pip or conda, e.g. 'pip install kedro'.
2. Initialize a new project using 'kedro new' and follow the interactive prompts.
3. Define your data sources in 'conf/base/catalog.yml' to abstract data I/O.
4. Create Python functions (nodes) in 'src/project_name/nodes.py' for specific logic.
5. Assemble nodes into a pipeline in 'src/project_name/pipeline.py'.
6. Configure parameters in 'conf/base/parameters.yml'; environment-specific overrides belong in 'conf/local/' or a custom environment.
7. Execute the pipeline locally using 'kedro run' to verify the logic.
8. Visualize the pipeline structure using 'kedro viz' to inspect dependencies.
9. Add unit tests for your nodes in the 'tests/' directory.
10. Package the project for deployment using 'kedro package', or Dockerize it.
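Steps 4 and 9 hinge on nodes being pure functions: they can be unit-tested without any Kedro machinery. A minimal sketch, where the file paths, function, and field names are all illustrative:

```python
# src/project_name/nodes.py (illustrative): a node is an ordinary function.
def split_full_names(records):
    """Split each record's 'full_name' into 'first' and 'last' fields."""
    out = []
    for rec in records:
        first, _, last = rec["full_name"].partition(" ")
        out.append({**rec, "first": first, "last": last})
    return out

# tests/test_nodes.py (illustrative): a plain pytest-style test,
# no Kedro imports or running pipeline required.
def test_split_full_names():
    rows = [{"full_name": "Ada Lovelace"}]
    result = split_full_names(rows)
    assert result[0]["first"] == "Ada"
    assert result[0]["last"] == "Lovelace"

test_split_full_names()
```

For step 3, a catalog entry for a CSV source is a small YAML mapping with a dataset `type` (e.g. a pandas CSV dataset class; exact class names vary by Kedro version) and a `filepath` key, keeping all storage details out of the node code above.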
Verified user feedback:
"Highly praised for bringing software engineering discipline to data science. Users love the visualization and catalog features but note a learning curve for those unfamiliar with software design patterns."
