Senior ML/RL Training Infrastructure Engineer
Posted 4 days ago
As a senior member of the team, you will work closely with researchers and systems engineers to build robust training frameworks, accelerate experimentation, and push the boundaries of performance and efficiency. You will collaborate with teams across Apple's engineering hubs—including New York, Seattle, and Cupertino—to advance the tooling and systems that make large-scale model training possible. If you thrive at the intersection of distributed systems, ML frameworks, and high-performance computing, this is the role for you.
Description
As a core member of our ML infrastructure team, you will design, build, and scale the systems that enable large-scale reinforcement learning for Apple's foundation models. You will focus on TPU-based training with JAX, developing robust, high-performance RL pipelines that support distributed actor/learner architectures, efficient experience replay, and large-scale environment execution.
In this role, you will work across the full stack of RL training systems—from low-level performance tuning and compiler optimization to cluster-level orchestration and resource management. You will ensure that training pipelines are efficient, reliable, reproducible, and observable, enabling research teams to iterate quickly and explore more complex RL environments and models.
Your work will directly impact the scalability, throughput, and stability of RL experiments, helping to unlock new capabilities in agentic reasoning, decision-making, and policy learning for Apple's foundation models. This position is ideal for engineers who enjoy distributed systems, high-performance ML frameworks, and building the infrastructure that makes large-scale RL research possible.
Minimum Qualifications
PhD or MSc in Computer Science, Computer Engineering, or a closely related field.
Hands-on experience designing, building, or maintaining large-scale ML training infrastructure.
Strong proficiency with PyTorch or JAX and experience running training workloads on GPUs/TPUs.
Solid understanding of distributed systems concepts (parallelism strategies, fault tolerance, synchronization).
Preferred Qualifications
Practical experience developing or optimizing training loops, RL pipelines, or large-scale model-training frameworks.
Strong software engineering skills in Python, with emphasis on reliability, debuggability, and high-performance execution.
Deep experience with PyTorch/JAX internals and XLA, and with debugging and performance profiling on GPU/TPU architectures.
Expertise in distributed RL training patterns, including actor/learner architectures, experience replay, and parallel environment execution.
Experience building training services, orchestration tools, or automated pipelines for large-scale experiments.
Proven success diagnosing bottlenecks in large-scale ML jobs (I/O, input pipelines, kernel performance, memory, compilation).
Familiarity with RL-specific infrastructure requirements (e.g., actor/learner architectures, experience replay systems, large-scale environment execution).
Strong software engineering practices: code quality, design reviews, testing, observability, CI/CD.
Experience working with cloud-scale clusters or specialized accelerators (TPU v5/v6, GPU, custom hardware).
Contributions to ML frameworks, distributed training libraries, or high-performance computing systems.
Excellent communication and collaboration skills for working with research and engineering partners.