Level: Senior

Job Field: Data

Employment Type: Full Time

Contract Type: Permanent employment

Location: Heidelberg

Working Model: Hybrid, Onsite

Job Summary

In this role, you will develop innovative methods in reinforcement learning, conduct large-scale experiments, and optimize training infrastructure to enhance the performance of our models.

Job Technologies

Python

Your role in the team

Aleph Alpha is one of the few companies in Europe with end-to-end in-house model development including pre- and post-training. We're building models that have general-purpose capabilities, but also specifically excel at addressing the needs of our customers.
We're growing our post-training team in Heidelberg (or hybrid in Germany) and are looking for an AI Researcher who combines a deep theoretical understanding of reinforcement learning methods with a desire to advance the state of the art and improve model capabilities in large-scale training.
As a (senior) AI Researcher for reinforcement learning, you will shape and improve the underlying RL methodology, maintain a high-quality training codebase, and conduct large-scale experiments to hill-climb our performance benchmarks.
Day to day, you will conduct large-scale reinforcement learning experiments, derive hypotheses from the results, and iterate on both the implementation and the methodology based on your observations.
Together with a collaborative team, you will have a direct impact on the models that we ship to our customers.
Hill-climb in large-scale training: Conduct large-scale LLM training runs, analyze evaluation scores in depth, propose hypotheses for improvement and directly implement them in order to maximize performance on our benchmarks.
Theoretical innovation: Stay at the bleeding edge of RL research. You will identify, implement, and iterate on novel approaches to multi-turn reinforcement learning.
Scale our training infrastructure: Identify bottlenecks in our training setup and optimize our RL training loops for large-scale training.
Cross-functional collaboration: Partner with our other post-training teams to turn raw feedback into actionable training signals, ensuring that our RL iterations lead to measurable improvements in downstream performance.

Our expectations of you

Qualifications

A deep understanding of Reinforcement Learning theory and how it relates to modern RL methods.
Familiarity with statistical methods for evaluation and experimental design.
Ability to reason about what an evaluation or environment measures and whether it matters: not just running benchmarks, but understanding them.
Strong Python skills and familiarity with ML tools (especially torch distributed).
Willingness to relocate to Heidelberg or travel regularly (potentially weekly).
A history of contributions to top-tier venues (NeurIPS, ICML, ICLR, etc.), specifically on RL.

Experience

Experience with multi-node LLM training (ideally using RL). You understand how to scale multi-node RL training and can reason about and implement distributed algorithms.
PhD in reinforcement learning or equivalent research experience.
Experience evaluating LLMs and crafting environments for training.

Benefits

Health, Fitness & Fun

🚲Jobbike

Job Locations

Location Heidelberg
Baden-Württemberg
Germany

This is your employer

Aleph Alpha

As a German AI startup based in Heidelberg, Aleph Alpha focuses on the development of large language models and generative AI. It offers solutions for companies looking to build their own AI capabilities and protect their data.

Company Type: Startup

Working Model: Hybrid, Onsite

Industry: Internet, IT, Telecommunication

Senior AI Researcher

Aleph Alpha

Location: Heidelberg
Working Model: Hybrid, Onsite
Diversity: Open for all genders
English Only: English only required
