Data, pipelines and architecture forged for scale.

PySpark, Delta Lake, Databricks, Medallion. Real pipelines, real failures, real fixes.

Azure Databricks·Delta Lake·Unity Catalog·ADF

BRONZE·raw data

Raw ingestion from multiple sources, no transformation

SILVER·transform

Cleansing, standardization and data enrichment

GOLD·analytics

Ready for BI, ML and decision-making
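To make the three layers concrete, here is a minimal sketch in plain Python rather than PySpark or Delta Lake. The sample records, the field names (`order_id`, `country`, `amount`) and the helpers `to_silver` and `to_gold` are all hypothetical, chosen only to illustrate the flow; a real pipeline would read from and write to Delta tables.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as they arrived (hypothetical sample data).
bronze = [
    {"order_id": "1", "country": " br ", "amount": "10.50"},
    {"order_id": "1", "country": " br ", "amount": "10.50"},  # duplicate delivery
    {"order_id": "2", "country": "US", "amount": "7.00"},
    {"order_id": "3", "country": "us", "amount": None},       # missing amount
]


def to_silver(rows):
    """Cleanse and standardize: drop incomplete rows, cast types, deduplicate."""
    seen, silver = set(), []
    for row in rows:
        if row["amount"] is None or row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        silver.append({
            "order_id": int(row["order_id"]),
            "country": row["country"].strip().upper(),
            "amount": float(row["amount"]),
        })
    return silver


def to_gold(rows):
    """Aggregate into a consumption-ready shape: revenue per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'BR': 10.5, 'US': 7.0}
```

Bronze is never modified here: Silver and Gold are derived from it, so either layer can be rebuilt if the cleansing or aggregation rules change.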

what I build

Architecture

Scalable, strategic design for modern data platforms.

Pipelines

Reliable, parameterized data flows built for production workloads.

Processing

Spark, Delta, and Databricks applied to real production scenarios.

Modeling

Turn raw data into trusted, decision-ready assets.

recent articles

Databricks·2026-05-04

FinOps for Data Engineering: a Databricks cost optimization case

How we reduced Databricks spend by 60% without touching SLAs — the exact playbook: cluster policies, autoscaling, spot instances, and System Tables.

Career·2026-04-06

Data Engineering Roadmap: what you actually need to learn to break into the field

A no-fluff roadmap for breaking into data engineering — built from real hiring experience, not job descriptions.

Architecture·2026-04-02

How to set up your Bronze layer in Databricks

A practical guide to building a reliable Bronze layer in Databricks using AutoLoader, Delta Lake, and Unity Catalog — with real production patterns.

Architecture·2026-04-02

How to set up your Gold layer in Databricks

Build a Gold layer that BI tools and analysts actually trust — using Star Schema, avoiding OBT traps, and modeling data for consumption, not just storage.

Databricks·2026-04-02

Databricks Observability: monitoring your Lakehouse with native tools

Complete guide to observability in Databricks using System Tables, Spark UI, Lakehouse Monitoring, audit logs and query profiling — without external tools.

Architecture·2026-04-02

How to set up your Silver layer in Databricks

Build a production-grade Silver layer in Databricks — deduplication, MERGE patterns, schema enforcement, and data quality checks with Delta Lake.

ADF·2026-04-01

Azure Data Factory in practice: CDC ingestion pipelines from zero to prod

How to build robust ADF pipelines with a watermark strategy, parameterization and error handling for ingestion at scale.

Architecture·2026-03-29

Delta Live Tables: declarative pipelines that actually simplify Lakehouse engineering

How DLT changes the way you think about pipeline development, data quality, and orchestration in Databricks.

Databricks·2026-03-28

Databricks for Data Engineers: clusters, jobs and notebooks in production

Everything you need to operate Databricks in production: cluster types, orchestration with Workflows and cost best practices.

Delta Lake·2026-03-24

Delta Lake beyond the basics: time travel, OPTIMIZE and data quality

Exploring the advanced Delta Lake features that make a real difference in production: time travel, Z-ORDER, OPTIMIZE and schema enforcement.

PySpark·2026-03-22

SCD Type 2 in Delta Lake: tracking history without losing your mind

A complete implementation of Slowly Changing Dimensions Type 2 using Delta Lake MERGE, with real examples and production considerations.

Unity Catalog·2026-03-19

Unity Catalog: centralized data governance in Databricks

How to structure Unity Catalog for real production governance — hierarchy, granular permissions, lineage and external locations.

ADLS Gen2·2026-03-14

ADLS Gen2: container structure and organization for Lakehouse

How to organize your Azure Data Lake with container hierarchy, layer-based folder strategy and access control with RBAC and ACLs.

Career·2026-03-14·popular

From Data Analyst to Senior Data Engineer: what I learned and what I'd do differently

The skills, mindset shifts, and mistakes that shaped my path from writing SQL reports to leading a Lakehouse team.

Architecture·2026-03-07

Data Contracts: how to guarantee quality at the source before it reaches Bronze

How to define, implement, and enforce data contracts between producers and consumers at the ingestion layer.

Career·2026-02-27

CI/CD for Databricks with Asset Bundles: deploying pipelines like real software

How to use Databricks Asset Bundles to version, test, and deploy your notebooks and jobs across environments.

Architecture·2026-02-19·popular

Medallion Architecture in practice: Bronze, Silver and Gold with Delta Lake

How we structured the layers in the Lakehouse and why this decision changed our CDC ingestion.

Architecture·2026-02-11

Partitioning strategies in Delta Lake: how to avoid small files and hot spots

The partitioning decisions that make or break query performance in Delta Lake — and how to fix a table that was partitioned wrong.

CDC·2026-02-03

CDC with Azure Data Factory: ingestion strategies that really work at scale

Watermark, soft delete and how to avoid classic incremental ingestion errors.

PySpark·2026-01-24·popular

Incremental MERGE with UPSERT in Databricks — from zero to production job

A complete guide to PySpark notebooks with MERGE logic and a full-refresh option, ready for scheduling.

Unity Catalog·2026-01-16

Data governance with Unity Catalog: what nobody tells you before implementing

Real lessons from someone who went through the pain of configuring permissions, lineage and data contracts.

Architecture·2026-01-07

Schema evolution in Delta Lake: handling breaking changes without breaking pipelines

How Delta Lake handles schema changes, what mergeSchema and overwriteSchema actually do, and how to build pipelines that survive source schema drift.