pracaon.pl

Senior Data Engineer (PySpark)

Remote, Poland
External job listing
EPAM

Partner
39d
Salary to be negotiated
IT and Telecommunications
Full-time
Remote
Requirements
  • 5+ years of experience in Data Software Engineering or related engineering roles

  • Solid experience with PySpark and SparkSQL

  • Experience with Cosmos DB (NoSQL API)

  • Hands-on experience with OneLake or Delta Lake and OpenLake concepts

  • Knowledge of DF Gen2 and M-code

  • Experience with CI/CD pipelines using Azure DevOps or equivalent

  • Good understanding of Azure services

  • Experience integrating data solutions with Power BI

  • Experience with Microsoft Fabric would be an asset

  • Strong problem-solving and analytical skills

  • Ability to work independently on complex tasks

  • Experience working in Agile or Scrum environments

  • Upper-intermediate proficiency in English (B2+)

Responsibilities
  • Implement data processing and transformation with Python, PySpark and SparkSQL

  • Work with OneLake (Delta / OpenLake) for efficient data storage and analytics

  • Develop and support solutions using Cosmos DB (NoSQL API)

  • Contribute to Fabric workloads including Data Engineering, Data Factory Gen2 and Lakehouse

  • Design, develop and maintain scalable data pipelines using Microsoft Fabric

  • Implement and maintain CI/CD pipelines and follow DevOps best practices

  • Integrate data solutions with Power BI for reporting and analytics

  • Collaborate with AI, data science and product teams to support AI-driven use cases

  • Ensure data quality, performance, security and reliability

  • Participate in Agile ceremonies and contribute to sprint delivery

  • Support production issues and continuous improvements

Seniority
  • Senior

Description

We are looking for a Senior Data Engineer with expertise in AI-enabled data platforms and PySpark. This role involves designing, building and optimizing modern data pipelines and analytics solutions, collaborating with architects, lead engineers and business stakeholders to deliver robust, scalable and AI-integrated data solutions.

Keywords / Skills
Data Software Engineering
Microsoft Fabric
PySpark
AI
Python
This offer was imported from an external portal.