Senior AI Researcher

Job

  • Level
    Senior
  • Job Field
    Data
  • Employment Type
    Full Time
  • Contract Type
    Permanent employment
  • Location
    Heidelberg
  • Working Model
    Hybrid, Onsite
  • Job Summary

    In this role, you will develop innovative methods in reinforcement learning, conduct large-scale experiments, and optimize training infrastructures to enhance the performance of our models.

    Your role in the team

    • Aleph Alpha is one of the few companies in Europe with end-to-end in-house model development including pre- and post-training. We're building models that have general-purpose capabilities, but also specifically excel at addressing the needs of our customers.
    • We're growing our post-training team in Heidelberg (or hybrid within Germany) and are looking for an AI Researcher who combines a deep theoretical understanding of reinforcement learning methods with a drive to advance the state of the art and improve model capabilities in large-scale training.
    • As a (senior) AI Researcher for reinforcement learning, you will shape and improve the underlying RL methodology, maintain a high-quality training codebase, and conduct large-scale experiments to hill-climb our performance benchmarks.
    • Day to day, you will conduct large-scale reinforcement learning experiments, derive hypotheses from the results, and iterate on both implementation and methodology based on what you observe.
    • Together with a collaborative team, you will have a direct impact on the models that we ship to our customers.
    • Hill-climb in large-scale training: Conduct large-scale LLM training runs, analyze evaluation scores in depth, propose hypotheses for improvement, and implement them directly to maximize performance on our benchmarks.
    • Theoretical innovation: Stay at the bleeding edge of RL research. You will identify, implement, and iterate on novel approaches to multi-turn reinforcement learning.
    • Scale our training infrastructure: Identify bottlenecks in our training setup and optimize our RL training loops for large-scale training.
    • Cross-functional collaboration: Partner with our other post-training teams to turn raw feedback into actionable training signals, ensuring that our RL iterations lead to measurable improvements in downstream performance.

    Our expectations of you

    Qualifications

    • A deep understanding of Reinforcement Learning theory and how it relates to modern RL methods.
    • Familiarity with statistical methods for evaluation and experimental design.
    • Ability to reason about what an evaluation/environment measures and whether it matters - not just run benchmarks, but understand them.
    • Strong Python skills and familiarity with ML tooling (in particular torch distributed).
    • Willingness to relocate to Heidelberg or travel regularly (potentially weekly).
    • A history of contributions to top-tier venues (NeurIPS, ICML, ICLR, etc.) specifically regarding RL.

    Experience

    • Experience with multi-node LLM training (ideally using RL). You understand how to scale multi-node RL training runs and can reason about and implement distributed algorithms.
    • PhD in reinforcement learning or equivalent research experience.
    • Experience evaluating LLMs and crafting environments for training.

    Benefits

    Health, Fitness & Fun

    Job Locations

    • Location Heidelberg

      Baden-Württemberg

      Germany

    This is your employer

    Aleph Alpha

    As a German AI startup based in Heidelberg, Aleph Alpha focuses on the development of large language models and generative AI. It offers solutions for companies looking to build their own AI capabilities and protect their data.

    Description

  • Company Type
    Startup
  • Working Model
    Hybrid, Onsite
  • Industry
    Internet, IT, Telecommunication
  • Diversity
    Open for all genders
  • Language
    English only required
