Professional Insurance

Automated Data Pipeline Integration and Warehousing Solution

Data Engineer, 2 years 1 month

Led the greenfield development of a Python-based ingestion and data integration solution for Azure analytics workflows, turning complex source data into reliable input for the downstream pipeline.

Highlights

  • Designed and built a Python ingestion system that validated, preprocessed, transformed, and routed source files into Azure storage for downstream loading into Synapse.
  • Implemented an extensible adapter-based architecture that made it easier to support new file formats and handle inconsistent real-world input data without major rework.
  • Co-architected the underlying Azure infrastructure from scratch and helped align application design with security and provisioning requirements.
  • Coordinated with data science, infrastructure, and external data providers to align ingestion requirements, clarify source-data expectations, and keep the overall solution operable.
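
The adapter-based architecture mentioned above can be illustrated with a minimal sketch. This is not the project's actual code: the names SourceAdapter, CsvAdapter, JsonLinesAdapter, ADAPTERS, and ingest are hypothetical, the two formats are chosen only for illustration, and the sketch uses the Python standard library rather than the Azure SDK. It assumes each adapter claims files by extension and returns rows as dictionaries.

```python
from __future__ import annotations

import csv
import io
import json
from abc import ABC, abstractmethod


class SourceAdapter(ABC):
    """Hypothetical base class: one adapter per source file format."""

    # File extensions this adapter claims (illustrative convention).
    extensions: tuple[str, ...] = ()

    def can_handle(self, filename: str) -> bool:
        # Case-insensitive match on the file extension.
        return filename.lower().endswith(self.extensions)

    @abstractmethod
    def read(self, raw: bytes) -> list[dict]:
        """Parse raw file bytes into a list of row dictionaries."""


class CsvAdapter(SourceAdapter):
    extensions = (".csv",)

    def read(self, raw: bytes) -> list[dict]:
        # "utf-8-sig" tolerates a leading byte-order mark in the file.
        text = raw.decode("utf-8-sig")
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]


class JsonLinesAdapter(SourceAdapter):
    extensions = (".jsonl",)

    def read(self, raw: bytes) -> list[dict]:
        # Skip blank lines, which are common in hand-exported files.
        lines = raw.decode("utf-8").splitlines()
        return [json.loads(line) for line in lines if line.strip()]


# Supporting a new format means adding one class and one entry here.
ADAPTERS: list[SourceAdapter] = [CsvAdapter(), JsonLinesAdapter()]


def ingest(filename: str, raw: bytes) -> list[dict]:
    """Route a file to the first adapter that claims it."""
    for adapter in ADAPTERS:
        if adapter.can_handle(filename):
            return adapter.read(raw)
    raise ValueError(f"no adapter registered for {filename!r}")
```

In a design like this, format-specific quirks stay inside each adapter, so the routing code does not change when a new source format appears.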

Technology Stack

Python: azure-sdk, pyarrow, fastparquet, pandas, pytest
SQL, Terraform
Azure: Azure Function Apps, Azure Synapse, Azure Networking, Azure Storage
CI/CD Pipelines: GitHub Actions

Tags

#professional #cloud #data-engineering