Job
- Level: Senior
- Job Field: IT, DevOps, Security
- Employment Type: Full Time
- Contract Type: Permanent employment
- Location: Munich
- Working Model: Hybrid, Onsite
Job Summary
In this role, you will manage our Kubernetes platforms, enhance observability and security, execute zero-downtime upgrades, and optimize the GitOps workflow using automation tools and monitoring technologies.
Your role in the team
- You're a seasoned Site Reliability Engineer with years spent running production Kubernetes at scale, and you're the kind of engineer who takes the initiative when something can be better - observability, resilience, a tricky upgrade, or the way the team thinks about security.
- In this role you'll join our operations team for our MeetingSuite product in Munich - a flat and diverse SRE team of four engineers.
- Your day-to-day is keeping our Kubernetes platforms observable, resilient and boring-to-upgrade: GitOps with Flux, multi-AZ design, zero-downtime releases, and a centralised observability story every service owner can use without calling SRE.
- Alongside that, you'll partner closely with our Application Security Engineer on Kubernetes and container security - with room to grow into our security champion over time - to keep the bar high for the DAX 30 and other DACH customers we serve.
- Operate and continuously improve our Kubernetes production platforms, contributing to zero-downtime upgrades and multi-AZ resilience as team-wide goals.
- Become the team's expert for our ELK-based log platform - centralised cross-cluster monitoring and anomaly detection - so that every service owner can see their workloads, set alerts and debug without support from SRE.
- Maintain and evolve our Prometheus alerting rules and Grafana dashboards alongside the team.
- Partner with our Application Security Engineer on Kubernetes and container security - admission control, workload identity, secrets management, network segmentation and runtime threat detection - with an interest in growing into our security champion over time.
- Love automation. Chip away at operational toil - deployments, monitoring setup, internal reporting - building on the baseline the team already has, and ship reliably through our GitOps workflow (Flux, GitLab CI).
- Participate in our Standby and Daily Business rotation, lead incident response, run blameless post-mortems, and drive the resulting action items to completion.
Our expectations of you
Qualifications
- Solid grasp of Kubernetes and container security - secrets management, network segmentation and runtime protection - and an interest in growing into our security champion alongside our Application Security Engineer.
- Proven depth in the ELK stack (or a very similar log platform) - pipelines, indexing, dashboards, alerting - with an interest in growing into the team's observability expert.
- Working knowledge of Prometheus and Grafana.
- Solid coding in Go, Python or Bash, with a love for automating away repetitive work.
- Comfortable being on-call and leading incidents calmly under pressure.
- Professional fluency in German and excellent English; at home working in a diverse team.
Experience
- Several years of hands-on experience in SRE, DevOps, or Platform Engineering, including significant time managing production Kubernetes at scale.
- Strong Kubernetes expertise with deep hands-on experience in at least one area - cluster lifecycle and upgrades, workload identity and RBAC, admission control, network policies, or custom resources and operators - and working familiarity with the rest.
- Comfortable with GitOps and CI/CD as a daily way of working (we run Flux and GitLab CI; equivalents like Argo CD, GitHub Actions or Jenkins are fine), and hands-on experience with Helm and Kustomize for managing manifests.
What we offer
- To foster strong collaboration and connection, this role will follow a hybrid work model.
- If you are within commuting distance of one of our Diligent office locations, you will be expected to work onsite at least 50% of the time.
This is your employer
Diligent Corporation
Diligent Corporation, a renowned company based in New York, offers a comprehensive platform for governance, risk, and compliance processes. It assists executives and boards in managing information centrally and controlling risks more effectively. Founded in 1994, the company has locations in Austria and Germany.
Description
- Company Type: Established Company
- Working Model: Hybrid, Onsite
- Industry: Internet, IT, Telecommunication
