Job
- Level: Experienced
- Job Field: IT, DevOps
- Employment Type: Full Time
- Contract Type: Permanent employment
- Location: Bonn
- Working Model: Hybrid, Onsite
Job Summary
You will develop and operate a scalable, secure platform on Kubernetes, with a focus on CI/CD, observability, and multi-cloud management, and provide innovative solutions for serving AI models.
Your role in the team
- As a DevOps/Platform Engineer (m/f/d), you will provide a secure, scalable, observable foundation for our AI platform Alan and establish the principle "You build it, you run it" within the team.
- You support the product teams with "paved paths" (self-service, guardrails) and ensure predictable performance and costs.
- You take ownership of core platform/serving components.
- You operate Kubernetes clusters, networking (Ingress), and storage (databases, snapshots), handle OS/kernel patching, and keep all of it running securely and stably.
- You model multi-cloud resources (especially Open Telekom Cloud) via console and IaC (Terraform).
- You build CI/CD pipelines and develop release, versioning, and rollback strategies.
- In the field of Observability & Site Reliability Engineering, you implement OpenTelemetry-based tracing, metrics, and logs, and define SLIs/SLOs, alerting, and error budgets.
- Together with our AI Engineers, you will set up the platform for model serving: GPU scheduling, autoscaling, inference gateways, and observability (latency, QPS, token costs).
Our expectations of you
Education
- You hold a master's degree or doctorate in a STEM subject, or in a humanities subject with a STEM specialization.
Qualifications
- You have expertise in security, including network security, secrets management, hardening (CIS), software supply chain, and access principles (Least Privilege).
- You bring curiosity, a strong desire to learn, and strong problem-solving and communication skills.
- You communicate convincingly and efficiently in German and English.
Experience
- You have at least 2 years of relevant professional experience in DevOps, Site Reliability Engineering, or Platform Engineering, with proven responsibility for Kubernetes, IaC, CI/CD, observability, and production operations, ideally within a SaaS environment.
- You have hands-on know-how in Git-based deployments, modular IaC, and secret/config management, as well as experience handling incidents.
- Ideally, you have some practical experience in operating inference workloads (vLLM or similar), GPU capacity management, autoscaling, and observability.
What we offer
- You work on a cutting-edge, scalable AI platform with a lot of creative freedom and take early responsibility for key infrastructure and architecture decisions.
- You will exchange technical insights with your future colleagues on an equal footing and receive budget and time for your own innovation projects.
- You grow professionally and personally with us through specially tailored training, certifications, and career development programs.
- You can set and expand your own focus within your areas of expertise.
- In addition to an attractive fixed salary plus revenue and profit sharing, you can offset overtime and record travel time as working hours.
- With a free choice of workplace and flexible working hours, you can shape your workday to suit your lifestyle.
Benefits
Work-Life-Integration
Health, Fitness & Fun
This is your employer
Comma Soft AG
Comma Soft AG, headquartered in Bonn, helps DAX-listed corporations, mid-sized companies, and public authorities make use of the countless possibilities of digitalization. As a holistic, digital, and neutral solutions partner, it works with its clients to uncover previously untapped revenue potential, creating lasting digital success through new business models, processes, and strategies.
Description
- Company Size: 50-249 Employees
- Founding year: 1989
- Company Type: Digital Agency
- Working Model: Hybrid, Onsite
- Industry: Internet, IT, Telecommunication