The Ops Compendium

Databases

Last updated 1 year ago

Lakes & Warehouses

  1. Firebolt comparison with Snowflake vs Databricks.

    1. Delta Lake is a data lake that can store raw unstructured, semi-structured, and structured data. When combined with Delta Engine, it becomes a data lakehouse.

  2. Snowflake decouples storage from compute, so organizations with high storage demands but little need for CPU cycles, or vice versa, don’t have to pay for an integrated bundle that covers both. Users can scale up or down as needed and pay only for the resources they use.

  3. Data mart

    1. Three types (dependent, independent, hybrid)

    2. (good) The three types above, plus structures (star, snowflake, denormalized) and comparisons

  4. Data Lake

    1. Using Great Expectations and Spark
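The data mart structures mentioned above (star vs. denormalized) can be made concrete with a small sketch. This is only an illustration, assuming a hypothetical sales mart: the table names (`fact_sales`, `dim_product`, `dim_store`, `sales_flat`), columns, and rows are all invented, and SQLite stands in for a real warehouse engine.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star schema: one fact table that references small dimension tables.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'laptop', 'electronics'), (2, 'desk', 'furniture');
    INSERT INTO dim_store VALUES (1, 'Berlin'), (2, 'Paris');
    INSERT INTO fact_sales VALUES (1, 1, 1, 1200.0), (2, 2, 1, 300.0), (3, 1, 2, 1100.0);

    -- Denormalized structure: the same data flattened into one wide table.
    CREATE TABLE sales_flat AS
    SELECT f.sale_id, p.name AS product, p.category, s.city, f.amount
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_store s ON s.store_id = f.store_id;
""")


def revenue_by_category_star(connection):
    """Aggregate over the star schema; needs a join to the product dimension."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        GROUP BY p.category
    """).fetchall()
    return dict(rows)


def revenue_by_category_flat(connection):
    """Same aggregate over the denormalized table; no join required."""
    rows = connection.execute(
        "SELECT category, SUM(amount) FROM sales_flat GROUP BY category"
    ).fetchall()
    return dict(rows)


star_result = revenue_by_category_star(conn)
flat_result = revenue_by_category_flat(conn)
print(star_result, flat_result)
```

Both queries return the same totals. The star layout avoids repeating dimension attributes on every row, while the flat table trades that redundancy for join-free reads. A snowflake structure would go one step further and split `category` out of `dim_product` into its own table.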

Comparisons

  1. "Databricks Delta Lake and Delta Engine is a lakehouse. You choose it as a data lake, and for data lakehouse-based workloads including ELT for data warehouses, data science and machine learning, even static reporting and dashboards if you don’t mind the performance difference and don’t have a data warehouse.

    Most companies still choose a data warehouse like Snowflake, BigQuery, Redshift or Firebolt for general-purpose analytics over a data lakehouse like Delta Lake and Delta Engine because they need performance.

    But it doesn’t matter. You need more than one engine. Don’t fight it. You will end up with multiple engines for very good reasons. It’s just a matter of when."

  2. Snowflake

Use Cases

Snowflake

ClickHouse

Feature engineering

Data lake Table Formats

Apache Iceberg

Databricks Delta Lake

Vector Databases

Airflow, Snowflake, Snowpipe, Flink, RocksDB, cluster optimization during ingestion, monitoring metrics, cost.

- SQL or procedures, schedules, B-tree tasks.

- think of it as an open-source Snowflake

Gartner -

Chroma - AI-native open-source embedding database

Hunters on their architecture
(good) Guides
getting started with SF tasks
cost
open source database for real time apps and analytics
Feature engineering in snowflake
A short Intro
A Primer
Benchmarking Delta vs Iceberg vs Hudi
How we migrated our production data lake to iceberg
How we reduced our cost by 90%
Top 5 Features
integrating delta lake into other platforms
Innovation Insight: Vector Databases
github
What is a DWH? a comprehensive guide
DataLakeHouse
What is SnowFlake
2
get started with SF
talend on data marts
netsuite on data marts
basic intro
monitoring health status at scale
Data Lake vs Data Warehouse
Top 5 differences between DL & DWH
Amazon on DL vs DWH
Snowflake vs Delta Lake vs Fire Bolt
Snowflake vs Amazon Redshift
Intro and demo
The three pillars - snowflake
Snowflake vs redshift on medium
SF vs RS
RS vs BQ
SF vs BQ