Doctolib

Senior Site Reliability Engineer - Observability

Job

  • Level
    Senior
  • Job Field
    IT, DevOps, Back End
  • Employment Type
    Full Time
  • Contract Type
    Permanent employment
  • Location
    Berlin
  • Working Model
    Hybrid, Onsite
  • Job Summary

    In this role, you will develop the observability strategy, optimize logging, metrics, and tracing, drive large-scale reliability initiatives, and enhance incident management processes on our platform.

    Your role in the team

    • We are looking for a Senior Site Reliability Engineer to join the Core Reliability & Observability team in Platform Engineering.
    • Your mission will be to shape Doctolib's observability strategy and ensure our platform remains reliable, debuggable, and scalable at European scale.
    • You will work in a feature team developing logging, metrics, tracing, and alerting capabilities, contributing directly to supporting 400,000 health professionals and 80 million patients in their daily healthcare journey.
    • Working in the tech team at Doctolib means building innovative products and features to improve the daily lives of care teams and patients.
    • Your responsibilities include but are not limited to:
    • Lead the observability strategy across the platform, with an emphasis on building scalable, developer-friendly logging and tracing capabilities.
    • Identify and lead large-scale cross-cutting reliability initiatives, including improvements to our incident detection, response, and postmortem analysis capabilities.
    • Take part in the on-call rotation, and actively contribute to improving our on-call experience by refining alerting, reducing noise, and ensuring actionable telemetry.

    Our expectations of you

    Qualifications

    • You'll be a great fit if you:
    • Have a solid understanding of containerization and orchestration technologies (Docker and Kubernetes).
    • Have a strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows.
    • Have deep expertise in observability tooling and architecture, such as:
      • Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector
      • Tracing: OpenTelemetry or proprietary APMs
      • Metrics: Prometheus, Thanos, Datadog, or equivalent
    • Have proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles.
    • Like troubleshooting performance issues in complex environments.
    • Are fluent in English.
    • It would be fantastic if you:
    • Have worked in a high-growth tech environment.

    Experience

    • Have solid hands-on experience (3+ years) on a large-scale production platform.
    • Have proven experience with cloud platforms such as AWS, Azure or Google Cloud.
    • Have experience with monitoring and observability tools.
    • Have experience contributing to open-source observability projects.
    • Are passionate about developer experience and platform engineering.

    What we offer

    • Free comprehensive health insurance (basic package) for you and your children.
    • 25 days of paid vacation per year, plus up to 14 RTT days (additional days off under French working-time reduction rules).
    • Free mental health and coaching services through our partner Moka.care.
    • Work from abroad for up to 10 days per year thanks to our flexibility days policy.
    • Lunch vouchers (Swile card) worth €8.50 per working day, with €4.50 covered by Doctolib.
    • A subsidy from the works council to reimburse part of the membership fee for a sports club or a creative class.
    • 50% reimbursement of your public transport subscription.
    • Parent Care Program: one additional month of leave on top of statutory parental leave.
    • Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth.
    • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support.
    • Relocation support in case of international mobility.
    • Access to the best AI tools for coding and development, plus dedicated training.

    Job Locations

    • Berlin, Germany

    This is your employer

    Doctolib

    Doctolib is an innovative company in the digital health sector that offers a platform for organizing medical appointments. It supports both doctors and patients in optimizing the appointment management process and improving communication.

    Description

  • Company Type
    Startup
  • Working Model
    Hybrid, Onsite
  • Industry
    Healthcare, Social Sector
  • Diversity
    Open for all genders
  • Language
    English only required
