Job
- Level: Experienced
- Job Field: IT, System
- Employment Type: Full Time
- Contract Type: Permanent employment
- Location: Munich
- Working Model: Onsite
Job Summary
In this role, you will manage the daily administration of the HPC infrastructure, optimize resources, and automate workflows to ensure high availability and performance for simulation projects.
Your role in the team
- Helsing operates on-premises high-performance computing (HPC) infrastructure that supports electromagnetics, computational fluid dynamics, and multi-physics simulation.
- As an HPC Systems Administrator based in Munich, you will take ownership of this critical environment, ensuring that our team of simulation engineers remains unblocked, productive, and equipped to solve complex problems.
- You will play a vital role in maintaining rigorous technical standards, optimising compute resources, and scaling our infrastructure to support continuous, large-scale modelling.
- The role is based on-site in Munich with regular travel to our Tussenhausen site.
- Own the day-to-day administration of compute nodes, workload schedulers, parallel storage, high-speed interconnects, and license servers.
- Ensure the environment remains highly available and consistently performant through proactive monitoring, patching, firmware updates, and incident response.
- Administer the workload scheduler (Slurm, PBS Pro, or similar), managing queues, fair-share policies, accounting, and quotas to optimise resource utilisation.
- Manage the simulation software stack and user environments using tools such as Lmod, Spack, or EasyBuild.
- Collaborate with hardware and software vendors to resolve support cases, process RMAs, and ensure upgrade quality.
- Automate operational workflows using Bash, Python, and Ansible to improve system efficiency and reduce manual intervention.
- Maintain the strict security posture required for cleared work and support ongoing compliance reviews.
- Onboard users and maintain comprehensive documentation to empower engineers to self-serve.
Our expectations of you
Qualifications
- You have administered Linux systems within a production HPC or large shared compute environment.
- You can script and automate complex workflows with Bash, Python, and Ansible.
- You can effectively manage commercial simulation software (CAE, CFD, or EM) and license servers, including FlexLM.
- A background working within classified or strictly regulated environments.
- Expertise in GPU computing, including CUDA and NVIDIA toolchains, alongside MPI stack management.
- Familiarity with HPC containers using Apptainer, Enroot, or Pyxis.
- Competence in infrastructure-as-code practices using tools such as Terraform.
Experience
- Hands-on experience managing workload schedulers such as Slurm, PBS Pro, LSF, or similar.
- Production experience with parallel filesystems (Lustre, BeeGFS, or GPFS) and high-speed interconnects (InfiniBand or RoCE).
- Experience administering Altair or Siemens simulation suites.
- Experience with identity management systems such as Keycloak, FreeIPA, Active Directory, or Kerberos.
What we offer
- Helsing's work is important. You'll be directly contributing to the protection of democratic countries while balancing both ethical and geopolitical concerns.
- The work is unique. We operate in a domain that has highly unusual technical requirements and constraints, and where robustness, safety, and ethical considerations are vital.
- You will face unique engineering and AI challenges that make a meaningful impact in the world.
- Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure.
- The defence industry is entering the most exciting phase of the technological development curve.
- Advances in our field are not incremental: Helsing is part of, and often leading, historic leaps forward.
- In our domain, success is a matter of order-of-magnitude improvements and novel capabilities.
- This means we take bets, aim high, and focus on big opportunities.
- Despite being a relatively young company, Helsing has already been selected for multiple significant government contracts.
- We actively encourage healthy, proactive, and diverse debate internally about what we do and how we choose to do it.
- Teams and individual engineers are trusted (and encouraged) to practise responsible autonomy and critical thinking, and to focus on outcomes, not conformity.
- At Helsing you will have a say in how we (and you!) work, the opportunity to engage on what does and doesn't work, and to take ownership of aspects of our culture that you care deeply about.
This is your employer
Helsing GmbH
Helsing GmbH is a Munich-based company specializing in defense technology. It develops AI solutions for military equipment and collaborates closely with governments and industry partners to integrate existing hardware into AI-supported networks and promote the technological strength of democratic societies. Founded in 2021, Helsing has quickly established itself as an innovative player in defense technology.
Description
- Company Type: Startup
- Working Model: Full Remote, Hybrid, Onsite
- Industry: Other Sectors