About me

My name is Jakub Domerecki and I am a Python Software Engineer.

I have been working with Python for 12 years. I specialize in Data Engineering, with an emphasis on PySpark and Microsoft Azure services.

Contact me on LinkedIn

Experience

CSHARK (May 2023 - Present)

  • Creation of models to support transactional processing as well as BI analytics.
  • Data Processing Solutions:
    • Development and integration of custom libraries and Spark Jobs for Synapse.
    • Creation of parameterized data processing pipelines in Azure Data Factory (ADF).
  • Data Testing Strategy: Development and implementation of comprehensive data testing strategies.
  • Automated Testing: Execution of unit, integration, and data validation tests.
  • CI/CD Pipeline Development: Leading GitLab CI/CD pipeline creation for Azure Synapse resource deployment.
  • Team Onboarding: Knowledge transfer and training of new team members.

Capgemini (Oct 2021 - Apr 2023)

  • Provisioning Azure Services
  • Creating and maintaining Azure DevOps repositories and pipelines
  • Refactoring an ETL pipeline into a medallion architecture
  • Writing PySpark and SparkSQL notebooks in Databricks
  • Spark performance tuning
  • Creating and maintaining Azure Data Factory Pipelines
  • Creating and conducting technical workshops

Nokia (Jun 2018 - Sep 2021)

  • Leading a 3-person HW regression testing project
  • Performing fault analysis and preparing root cause analyses (RCA)
  • Designing and implementing a data visualization tool for efficient monitoring of HW resource usage, built with the Bokeh and Pandas libraries (monitoring was previously done manually)
  • Designing and implementing a work time reporting tool using the JIRA API and Python