A. ŠARAS
2024 to 2025 · KPMG client work · built 2025

A data platform for oil & gas.

A large oil & gas client, a team of developers, and one lakehouse to bring the data together.

What it is

Together with a team of developers, I built and managed a data platform for a large oil & gas client on Microsoft Fabric. The architecture followed the medallion pattern (bronze, silver, gold) on a lakehouse foundation, fed by metadata-driven data pipelines with change data capture and Dataflow Gen2, and transformed with PySpark and Spark SQL.
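To make "metadata-driven" concrete, here is a minimal sketch in plain Python, not the client's Fabric code. The table names, the `TableConfig` fields, and the `plan_steps` helper are all hypothetical. The point is only that one generic routine reads configuration rows and derives the bronze, silver, and gold steps from them, so onboarding a source means adding a row rather than writing a new pipeline.

```python
from dataclasses import dataclass

# Hypothetical metadata row; a real platform would likely keep these in a control table.
@dataclass(frozen=True)
class TableConfig:
    source: str      # source table name (made up for this sketch)
    key_column: str  # business key used when merging changes
    load_type: str   # "cdc" for incremental changes, "full" for full reloads

LAYERS = ("bronze", "silver", "gold")  # medallion layers, in processing order

def plan_steps(configs):
    """Expand each metadata row into one step per medallion layer."""
    steps = []
    for cfg in configs:
        for layer in LAYERS:
            steps.append({
                "table": f"{layer}.{cfg.source}",
                "key": cfg.key_column,
                # Only the bronze ingest depends on the load type;
                # silver and gold are modelled here as plain transforms.
                "action": f"ingest_{cfg.load_type}" if layer == "bronze" else "transform",
            })
    return steps

configs = [
    TableConfig("wells", "well_id", "cdc"),
    TableConfig("production", "reading_id", "full"),
]
steps = plan_steps(configs)
for step in steps:
    print(step["table"], step["action"])
```

In this sketch the load type only changes the bronze step, which is one common way to keep change data capture confined to ingestion; the real platform may have drawn that line differently.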

My personal contribution was re-architecting how the pipelines run. The platform originally ran sequentially; I introduced a parallel execution pattern that cut processing time by a factor of more than five. I also slimmed down bloated tables, reducing row counts for faster, cheaper ingestion.
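The shift from sequential to parallel scheduling can be sketched in a few lines of standard-library Python. This is an illustration of the general idea, not the pattern used on the platform: `load_table` is a hypothetical stand-in that sleeps to mimic an I/O-bound load, the table names are invented, and the sketch assumes the loads are independent of one another.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def load_table(name):
    """Hypothetical stand-in for one table load; sleeps to simulate I/O-bound work."""
    time.sleep(0.1)
    return name

def run_sequential(tables):
    # One table at a time: total time is roughly the sum of all loads.
    return [load_table(t) for t in tables]

def run_parallel(tables, max_workers=8):
    # Threads suit loads that mostly wait on storage or a remote engine.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_table, tables))  # results keep input order

tables = [f"table_{i}" for i in range(8)]

start = time.perf_counter()
seq_result = run_sequential(tables)
seq_seconds = time.perf_counter() - start

start = time.perf_counter()
par_result = run_parallel(tables)
par_seconds = time.perf_counter() - start

print(f"sequential: {seq_seconds:.2f}s, parallel: {par_seconds:.2f}s")
```

The timings from this toy run say nothing about the project's actual numbers. They only show why scheduling matters: the work per table is unchanged, yet the wall-clock time drops because independent loads no longer wait for each other.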

The work

Concepts

Microsoft Fabric · Lakehouse · Medallion architecture · Metadata-driven · CDC · Dataflow Gen2 · PySpark · Spark SQL

What it taught me

Sequential pipelines are a default, not a law. The biggest performance wins came from questioning how the work was scheduled, not how it was written.