Experience
CSHARK (May 2023 - Present)
Senior Data Engineer (August 2024 - Present)
- Source integration:
  - Onboard data from various APIs using Spark's Python Data Source API.
  - Develop a custom solution to incrementally load data from SQL Server to Delta Lake using CDC.
- CI/CD and development:
  - Set up automated Databricks resource deployment with Asset Bundles and GitLab CI.
  - Design and implement mock services for Python Data Source API integration testing.
  - Modernize the development environment with a unified, automated setup, dependency management, and development containers.
- Performance optimization:
  - Optimize Spark job performance for improved cost efficiency.
  - Build reusable Python libraries to be shared across projects.
Data Engineer (May 2023 - August 2024)
- Creation of models to support transactional processing as well as BI analytics.
- Data Processing Solutions:
  - Development and integration of custom libraries and Spark jobs for Synapse.
  - Creation of parameterized data processing pipelines (ADF).
- Data Testing Strategy: Development and implementation of comprehensive data testing strategies.
- Automated Testing: Execution of unit, integration, and data validation tests.
- CI/CD Pipeline Development: Leading GitLab CI/CD pipeline creation for Azure Synapse resource deployment.
- Team Onboarding: Knowledge transfer and training of new team members.
Capgemini (Oct 2021 - Apr 2023)
- Provisioning of Azure Services
- Creation and maintenance of Azure DevOps Repository and Pipelines
- Refactoring of ETL pipeline into medallion architecture
- Writing PySpark and SparkSQL notebooks in Databricks
- Spark performance tuning
- Creation and maintenance of Azure Data Factory Pipelines
- Conducting technical workshops
Nokia (Jul 2018 - Sep 2021)
- Leading a 3-person HW regression testing project
- Performing fault analysis and RCA preparation
- Design and implementation of a data visualization tool (Bokeh and Pandas)
WBP Drosystem (Jun 2014 - Jun 2018)
- Road drainage dimensioning in Python
- Data scraping in Python and Excel scripting for mail automation