The Ops Compendium

Data Platforms

Databricks

  1. Databricks is ACID (Delta Lake tables support ACID transactions)

  1. Databricks Learning Library

    1. Free courses

    2. Docs: Optimization recommendations

    3. Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads

    4. Vector Search

    5. Databricks for ML

    6. Databricks for Data Engineering

  2. (good) Introduction & Tutorial - cluster / notebook / table / SQL / DataFrame / connections

  3. 7 must-know concepts

  1. RDD vs DataFrame vs Dataset

    1. Official blog post (2016)

    2. LinkedIn blog post

    3. Comparison on YouTube

    4. RDDs vs. Dataframes vs. Datasets – What is the Difference and Why Should Data Engineers Care?

  2. Optimizations

    1. Optimization recommendations on Databricks

    2. Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads

    3. How I Use Caching in Databricks to Increase Performance and Save Costs

    4. Why and How: Partitioning in Databricks

  3. Best Practices

    1. Official docs


Last updated 1 year ago
