# Tools

1. [Debezium](https://debezium.io/) - an open source distributed platform for change data capture
2. [Hudi](https://hudi.apache.org/) - "Hudi is a rich platform to build streaming data lakes with incremental data pipelines on a self-managing database layer, while being optimized for lake engines and regular batch processing."
3. [Upsolver](https://www.upsolver.com/) - "Continuous SQL Pipelines for Cloud Data Lakes. No custom coding. No orchestration. No infrastructure maintenance."
4. [dbt](https://www.getdbt.com/) - "dbt helps data teams work like software engineers—to ship trusted data, faster. Collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation. Now anyone who knows SQL can build production-grade data pipelines."
   1. [Intro](https://www.youtube.com/watch?v=R2nr1uZ8ffc)
   2. [In-depth intro](https://www.youtube.com/watch?v=MHnJDqEKyUY)
   3. [dbt in one hour](https://www.youtube.com/watch?v=na6eu9WSXGY)
   4. [CI/CD with dbt](https://www.youtube.com/watch?v=snp2hxxWgqk)
   5. [Snowflake, Terraform, and dbt](https://www.youtube.com/watch?v=r4tAyiTgwRw)
   6. [HubSpot, Snowflake, and dbt](https://www.youtube.com/watch?v=qDHgknWW_oo)
5. [Metorikku](https://github.com/YotpoLtd/metorikku) - A simplified, lightweight ETL Framework based on Apache Spark
6. BI tools that directly connect to a DB.
   1. [Redash](https://redash.io/) - Connect and query your data sources, build dashboards to visualize data and share them with your company.
   2. [Metabase](https://www.metabase.com/) - "An easy-to-use, open source business intelligence tool that lets you analyze data from a variety of data destinations and sources. It also follows a simple and fast setup process. Its data visualization capabilities are exceptional and can be showcased in a user-friendly way, without using SQL. With Metabase, you can easily share live dashboards, automated reports, and questions with the rest of your team." (quoted from [growthfullstack.com](https://growthfullstack.com/analyse/metabase-bi-tool/))
   3. [Superset](https://superset.apache.org/) - Apache Superset is a modern data exploration and visualization platform
7. [Stitch](https://www.stitchdata.com/) - Stitch rapidly moves data from 130+ sources into a data warehouse so you can get to answers faster, no coding required.
8. [Snowplow](https://snowplowanalytics.com/) - Generate [complete, accurate and well-structured event data](https://snowplowanalytics.com/web-and-mobile-data/) across all platforms and channels in a common format, with the Snowplow Behavioral Data Platform.
9. [Workato](https://www.workato.com/) - A single platform for integration and workflow automation across your organization.
10. [AWS Deequ](https://aws.amazon.com/blogs/big-data/test-data-quality-at-scale-with-deequ/) - Test data quality at scale

