
Tools

  1. Debezium - an open source distributed platform for change data capture

  2. Hudi - "Hudi is a rich platform to build streaming data lakes with incremental data pipelines on a self-managing database layer, while being optimized for lake engines and regular batch processing."

  3. Upsolver - "Continuous SQL Pipelines for Cloud Data Lakes. No custom coding. No orchestration. No infrastructure maintenance."

  4. dbt - "dbt helps data teams work like software engineers—to ship trusted data, faster. Collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation. Now anyone who knows SQL can build production-grade data pipelines."

  5. Metorikku - A simplified, lightweight ETL framework based on Apache Spark

  6. Stitch - Stitch rapidly moves data from 130+ sources into a data warehouse so you can get to answers faster, no coding required.

  7. Snowplow - Generate complete, accurate and well-structured event data across all platforms and channels in a common format, with the Snowplow Behavioral Data Platform.

  8. Workato - A single platform for integration & workflow automation across your organization

  9. AWS Deequ - Test data quality at scale
