Job
- Level: Experienced
- Job Field: IT, System, DevOps
- Employment Type: Full Time
- Contract Type: Permanent employment
- Location: Karlsruhe
- Working Model: Hybrid, Onsite
Job Summary
In this role, you will provide technical Level-2 support, optimize monitoring and logging solutions, and ensure service availability within a Kubernetes environment. You will troubleshoot issues and automate workflows using Ansible and Terraform.
Your role in the team
- You provide technical Level-2 support with direct customer contact.
- You maintain monitoring, logging, and alerting solutions (e.g., Prometheus, Grafana, Loki) for proactive problem detection in shift operations and contribute to resolving complex issues in distributed systems.
- You troubleshoot networks (LAN/WAN/VPN, DNS, DHCP) and storage systems (File/Object/Block) and deploy highly available services on Linux and Kubernetes (Helm charts).
- You build Infrastructure-as-Code and maintain automation and playbooks with Ansible, Terraform, GitLab CI/CD, Argo CD, as well as scripting languages like Bash, Python, and Go.
- You collaborate with development teams to improve processes and deployments and to seamlessly integrate new services and applications into our cloud and Kubernetes environment.
- You ensure stable and secure platform operation, including end-to-end incident management from initial analysis through resolution to follow-up as part of problem management.
This text has been machine translated.
Our expectations of you
Qualifications
- You are willing to work in a 24×7 shift model (night, weekend, and holiday shifts) and bring a strong problem-solving and troubleshooting mindset.
- You have solid knowledge of automation tools (e.g., Ansible, SaltStack), monitoring and observability tools (Prometheus, Grafana, Loki), as well as logging and alerting solutions (ELK Stack).
- You have very good knowledge of at least one programming or scripting language (Go, Python, Bash) for automation and monitoring tasks.
Experience
- You have several years of experience as a Site Reliability Engineer or in a related role (Linux System Administrator, Platform Engineer, DevOps/Infrastructure Engineer, Full-Stack Developer).
- You have experience with virtualized environments (QEMU/KVM, OpenStack, Proxmox), cloud storage technologies (File, Object, Block), and are proficient in Docker & Kubernetes.
- Experience with code management (merge conflicts, feature branches, merge requests, CI/CD) is an advantage.
What we offer
- You benefit from a hybrid work model and flexible shift hours.
- At some locations, you can enjoy a subsidized canteen and various free beverages, as well as modern office spaces with excellent transportation links.
- You will receive various employee discounts for activities and products.
- You can look forward to employee events such as summer and winter parties, as well as workshops.
- Numerous training and development opportunities are available to you.
- Various health offerings, such as sports and health courses, support your well-being.
This is your employer
IONOS
The 1&1 IONOS product portfolio offers everything businesses need to be successful in the cloud: from domains, classic websites, and do-it-yourself solutions to online marketing tools, fully fledged servers, and an IaaS solution.
Description
- Founding year: 1988
- Company Type: Established Company
- Working Model: Full Remote, Hybrid, Onsite
- Industry: Internet, IT, Telecommunication