
# Tools

1. [Debezium](https://debezium.io/) - an open source distributed platform for change data capture
2. [Hudi](https://hudi.apache.org/) - "Hudi is a rich platform to build streaming data lakes with incremental data pipelines on a self-managing database layer, while being optimized for lake engines and regular batch processing."
3. [Upsolver](https://www.upsolver.com/) - "Continuous SQL Pipelines for Cloud Data Lakes. No custom coding. No orchestration. No infrastructure maintenance."
4. [dbt](https://www.getdbt.com/) - "dbt helps data teams work like software engineers—to ship trusted data, faster. Collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation. Now anyone who knows SQL can build production-grade data pipelines."
   1. [Intro](https://www.youtube.com/watch?v=R2nr1uZ8ffc)
   2. [In-depth intro](https://www.youtube.com/watch?v=MHnJDqEKyUY)
   3. [dbt in one hour](https://www.youtube.com/watch?v=na6eu9WSXGY)
   4. [CI/CD with dbt](https://www.youtube.com/watch?v=snp2hxxWgqk)
   5. [Snowflake, Terraform, and dbt](https://www.youtube.com/watch?v=r4tAyiTgwRw)
   6. [HubSpot, Snowflake, and dbt](https://www.youtube.com/watch?v=qDHgknWW_oo)
5. [Metorikku](https://github.com/YotpoLtd/metorikku) - A simplified, lightweight ETL Framework based on Apache Spark
6. BI tools that directly connect to a DB.
   1. [Redash](https://redash.io/) - Connect and query your data sources, build dashboards to visualize data and share them with your company.
   2. [Metabase](https://www.metabase.com/) - "is an easy-to-use, open source business intelligence tool that lets you analyze data from a variety of data destinations and sources. It also follows a simple and fast setup process. Its data visualization capabilities are exceptional and can be showcased in a user-friendly way, without using SQL. With Metabase, you can easily share live dashboards, automated reports, and questions with the rest of your team." - by [growthfullstack.com](https://growthfullstack.com/analyse/metabase-bi-tool/)
   3. [Superset](https://superset.apache.org/) - Apache Superset is a modern data exploration and visualization platform
7. [Stitch](https://www.stitchdata.com/) - Stitch rapidly moves data from 130+ sources into a data warehouse so you can get to answers faster, no coding required.
8. [Snowplow](https://snowplowanalytics.com/) - Generate [complete, accurate and well-structured event data](https://snowplowanalytics.com/web-and-mobile-data/) across all platforms and channels in a common format, with the Snowplow Behavioral Data Platform.
9. [Workato](https://www.workato.com/) - A single platform for integration and workflow automation across your organization.
10. [AWS Deequ](https://aws.amazon.com/blogs/big-data/test-data-quality-at-scale-with-deequ/) - Test data quality at scale

