Data Lead - Client Technology
Wrocław, Dolnośląskie, Polska, 50-086
Key job characteristics
Hybrid format - partly remote
Data: SQL / BI / Python
AWS / Azure
Snowflake
Lead role
Description
Location: Wrocław / Katowice - 2 days in office / 3 days remote
Let us introduce you to the job offer by EY GDS Poland, a member of the global integrated service delivery center network by EY.
What we look for
Strong analytical skills and problem-solving ability
A self-starter, independent thinker, curious and creative person with ambition and passion
Good interpersonal, communication, teamwork, and presentation skills
Customer focused
Excellent time leadership skills
Positive and constructive minded
Takes ownership of continuous self-learning
Takes the lead and makes decisions in critical times and tough circumstances
Attention to detail
High levels of integrity and honesty
What we offer
EY Global Delivery Services (GDS) is a dynamic and truly global delivery network. We work across nine locations – Argentina, Hungary, India, the Philippines, Poland, Sri Lanka, Mexico, Spain and the United Kingdom – and with teams from all EY service lines, geographies and sectors, playing a vital role in the delivery of the EY growth strategy. From accountants to coders to advisory consultants, we offer a wide variety of fulfilling career opportunities that span all business disciplines. In GDS, you will collaborate with EY teams on exciting projects and work with well-known brands from across the globe. We’ll introduce you to an ever-expanding ecosystem of people, learning, skills and insights that will stay with you throughout your career.
Continuous learning: You’ll develop the mindset and skills to navigate whatever comes next.
Success as defined by you: We’ll provide the tools and flexibility, so you can make a meaningful impact, your way.
Transformative leadership: We’ll give you the insights, coaching and confidence to be the leader the world needs.
Diverse and inclusive culture: You’ll be embraced for who you are and empowered to use your voice to help others find theirs.
Ideally, you’ll also have
Business Information Glossaries – Azure Data Catalog, Microsoft Purview, Business Objects, Collibra
Cloud Computing Services – Microsoft Azure, Google Cloud Platform, Amazon Web Services, Azure OpenAI Service
Distributed Systems – Databricks, Spark, Hadoop, HDFS, Kafka, MapReduce/Hive, Storm, Zookeeper
ETL Tools – Azure Data Factory, BODS, IBM DataStage, Informatica, Power BI Dataflows, SAP Data Services
Graph Databases – Neo4j, Azure Cosmos, Stardog
NoSQL Document Stores – MongoDB, Elasticsearch, DocumentDB, Apache CouchDB
Programming Languages – Python, SQL, Scala, R
Relational MPP Databases – Snowflake, Azure Synapse Analytics, Amazon Redshift, GCP BigQuery
Relational SMP Databases – Azure SQL PaaS, MySQL, Oracle, PostgreSQL, SQL Server
Vector Databases & Semantic Search – Azure AI Search (Vector Search), ElasticSearch, Pinecone, Weaviate, FAISS, Milvus, semantic retrieval for RAG architectures
Your key responsibilities
You will own the design and implementation of processes to extract, transform and load data from disparate sources into a form that is consumable by analytics processes, AI and Agentic AI systems, on large or more sophisticated projects, using advanced technical capabilities.
You will take accountability for the production of a suite of high‑complexity data models, semantic layers and feature models, demonstrating strong understanding of data modelling standards to ensure high‑quality, AI‑ready and reusable data assets.
Working with departments across the business to help define and deliver business value, including AI‑driven use cases, while interfacing and presenting with program teams, management and partners to deliver large, sophisticated data, analytics and Agentic AI initiatives.
Leading the design, development and implementation of data pipelines supporting analytics, machine learning and autonomous agent workflows, reviewing work to ensure high quality, scalability, governance and secure AI data access.
Evaluating and resolving issues regarding data quality reviews, cleansing, data integration and migration, demonstrating sophisticated technical knowledge and showing technical leadership in aspects of data engineering while driving continuous improvement efforts.
Providing a leadership role for the work group, ensuring the appropriate expectations, principles, structures, tools and responsibilities are in place to deliver the project.
Analyzing the latest industry trends such as agentic AI, cloud computing and distributed processing, and inferring their potential impact on businesses (short and long term).
Providing sophisticated technical expertise to maximize efficiency, reliability and value from current data engineering processes. Researching and supervising existing client base and industry developments to identify potential new product opportunities from emerging technologies
Developing strong working relationships with peers across engineering, collaborating to develop leading data engineering solutions.
Driving consistency across the relevant data engineering and data modelling processes, procedures and standards, with potential input into the definition, maintenance and implementation of technology standards.
Skills and attributes for success
To qualify for the role you must have
Batch Processing - Capability to design an efficient way of processing high volumes of data where a group of transactions is collected over a period of time.
Data Integration (Sourcing, Storage and Migration) - Capability to design and implement models, capabilities and solutions to manage data within the enterprise (structured and unstructured, data archiving principles, data warehousing, data sourcing, etc.). This includes the data models, storage requirements and migration of data from one system to another.
Data Quality, Profiling and Cleansing - Capability to review (profile) a data set to establish its quality against a defined set of parameters and to highlight data where corrective action (cleansing) is required to remediate the data.
Stream Systems - Capability to discover, integrate, and ingest all available data from the machines that produce it, as fast as it’s produced, in any format, and at any quality.
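As an illustration of the Data Quality, Profiling and Cleansing capability listed above, a minimal Python sketch might look like the following. It profiles a small batch of records against a defined set of rules and flags the rows that would need cleansing. The rule set, field names and sample records are hypothetical and are not part of the role description.

```python
from typing import Callable

# A rule returns True when a field value is acceptable.
Rule = Callable[[object], bool]

# Hypothetical quality parameters, one check per field.
RULES: dict[str, Rule] = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str) and "@" in v,
    "country": lambda v: isinstance(v, str) and len(v) == 2,
}


def profile(records: list[dict], rules: dict[str, Rule]) -> dict:
    """Count rule failures per field and list the rows that need cleansing."""
    failures = {field: 0 for field in rules}
    flagged_rows: list[int] = []
    for idx, row in enumerate(records):
        # Missing fields are passed to the check as None and fail it.
        bad_fields = [f for f, check in rules.items() if not check(row.get(f))]
        for f in bad_fields:
            failures[f] += 1
        if bad_fields:
            flagged_rows.append(idx)
    return {"rows": len(records), "failures": failures, "flagged_rows": flagged_rows}


# Hypothetical sample batch collected over a period of time.
SAMPLE = [
    {"customer_id": 1, "email": "a@example.com", "country": "PL"},
    {"customer_id": -5, "email": "b@example.com", "country": "PL"},
    {"customer_id": 3, "email": "no-at-sign", "country": "Poland"},
]

report = profile(SAMPLE, RULES)
print(report)
```

In practice the flagged rows would be routed to a cleansing step or returned to the data owner; the sketch only shows the profiling pass.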