pracaon.pl

Solution Architect (Kernel Optimization & ML Performance) | Associate Management

Kraków, Lesser Poland Voivodeship, Poland
EPAM
Partner
1d
Salary to be agreed
Full-time • Hybrid • IT & Telecommunications

Key offer highlights

  • Hybrid model - partly remote

  • 12+ years of experience

  • Architect role

Description

Join us as a Solution Architect (Kernel Optimization & ML Performance) and play a key role in shaping impactful, scalable solutions that drive real business value. You will work at the intersection of advanced hardware acceleration and machine learning infrastructure, translating complex performance and optimization requirements into robust architectural designs. This role offers the opportunity to collaborate with global, cross-functional teams in a dynamic and fast-paced environment, working closely with ML researchers, compiler engineers, and systems architects. You’ll have a direct influence on the technical direction of high-performance ML solutions, solution quality, and overall efficiency across large-scale AI workloads. At EPAM, we value innovation, ownership, and a proactive mindset, giving you the space to make a tangible technological impact. If you are ready to take on a strategic architect-level role in AI infrastructure and grow your career in a global setting, we encourage you to apply. 

Requirements

  • Bachelor’s or Master’s degree in Computer Science or equivalent practical experience

  • 12+ years of software engineering experience, including 5+ years in architecture or technical leadership roles

  • In-depth knowledge of ML frameworks (JAX, PyTorch, TensorFlow) and core ML concepts

  • Hands-on expertise in performance optimization at the kernel level targeting TPUs/GPUs

  • Strong experience with C++/Python for high-performance computing

  • Solid grasp of compiler principles, graph optimizations, and toolchains such as MLIR or OpenXLA

  • Proven experience designing scalable, production-grade ML systems with emphasis on performance and efficiency

  • Excellent communication, stakeholder management, and solution delivery skills

Responsibilities

  • Define and own the architecture for performance-critical ML workloads, leveraging custom kernels on TPUs and GPUs

  • Create a strategic roadmap for kernel optimization, framework integration, and large-scale performance improvement

  • Collaborate with the client’s technical leadership, ML researchers, and engineers to capture requirements and design scalable solutions

  • Evaluate and select the right technologies, toolchains, and design patterns to optimize compute-intensive operations

  • Guide the development of benchmarking infrastructure, autotuning frameworks, performance profiling tools, and regression suites

  • Advocate ML performance best practices, influencing framework and compiler enhancements across teams

  • Provide technical leadership and mentorship to development teams implementing architectural solutions

  • Ensure adherence to security, scalability, and maintainability standards in solution design

Seniority

  • Associate Management

Nice to have

  • Familiarity with emerging hardware accelerators, heterogeneous compute, and scale-out performance patterns

  • Experience with autotuning systems, benchmarking methodologies, and performance profiling tools

  • Contributions to open-source projects in ML performance, kernels, or developer infrastructure

  • Demonstrated ability to present architectural solutions to technical and non-technical audiences

Keywords / Skills

Solution Architecture
GPGPU/GPU Programming
NVIDIA CUDA
PyTorch
Python
C++
This offer was imported from an external portal.