About me

My name is Jakub Domerecki and I am a Python Software Engineer.

I have been working with Python for 13 years. I specialize in Data Engineering, with an emphasis on PySpark and Microsoft Azure services.

Contact me on LinkedIn

Experience

CSHARK (May 2023 - Present)

Senior Data Engineer (August 2024 - Present)

  • Source integration:
    • Onboard data from various APIs using Spark's Python Data Source API.
    • Develop a custom solution to incrementally load data from SQL Server to Delta Lake using CDC.
  • CI/CD and development:
    • Set up automated Databricks resource deployment with Asset Bundles and GitLab CI.
    • Design and implement mock services for Python Data Source API integration testing.
    • Modernize the development environment with a unified, automated setup, dependency management, and development containers.
  • Performance optimization:
    • Optimize Spark job performance for improved cost efficiency.
    • Build reusable Python libraries to be shared across projects.

Data Engineer (May 2023 - August 2024)

  • Creation of models to support transactional processing as well as BI analytics.
  • Data Processing Solutions:
    • Development and integration of custom libraries and Spark Jobs for Synapse.
    • Creation of parametrized data processing pipelines (ADF).
  • Data Testing Strategy: Development and implementation of comprehensive data testing strategies.
  • Automated Testing: Execution of unit, integration, and data validation tests.
  • CI/CD Pipeline Development: Leading GitLab CI/CD pipeline creation for Azure Synapse resource deployment.
  • Team Onboarding: Knowledge transfer and training of new team members.

Capgemini (Oct 2021 - Apr 2023)

  • Provisioning of Azure Services
  • Creation and maintenance of Azure DevOps Repository and Pipelines
  • Refactoring of ETL pipeline into medallion architecture
  • Writing PySpark and SparkSQL notebooks in Databricks
  • Spark performance tuning
  • Creation and maintenance of Azure Data Factory Pipelines
  • Conducting technical workshops

Nokia (Jul 2018 - Sep 2021)

  • Leading a 3-person HW regression testing project
  • Performing fault analysis and RCA preparation
  • Design and implementation of a data visualization tool (Bokeh and Pandas)

WBP Drosystem (Jun 2014 - Jun 2018)

  • Road drainage dimensioning in Python
  • Data scraping in Python and Excel scripting for mail automation