
Posted 3 days ago
Site Reliability Engineer
Schwarz Digits
(Senior) Site Reliability Engineer - STACKIT Control Plane
Requirements
3+ years SRE or DevOps experience, Expert Kubernetes Control Plane knowledge, Proficiency in Go, Infrastructure as Code experience, Linux system internals and networking, Experience with datastores and messaging systems
Skills
Kubernetes, Go, Linux, PostgreSQL, Kafka, CI/CD
About the role
Responsibilities
- Collaborate with development teams to enhance monitoring and alerting infrastructure to shorten time-to-detect intervals.
- Reduce time-to-mitigation by writing clear playbooks, designing dashboards, and ensuring comprehensive telemetry data.
- Act as a reliability consultant to foster a shared responsibility model and educate teams on reliability patterns.
- Design and refine development practices and CI/CD pipelines to support progressive delivery strategies such as canary and blue/green deployments.
- Proactively analyze and optimize Control Plane scalability, addressing bottlenecks in distributed consensus, database throughput, and networking.
- Participate in a compensated on-call rotation, leading incident response and facilitating blameless post-mortems.
Requirements
- 3+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering, focusing on large-scale distributed systems.
- Expert-level knowledge of Kubernetes Control Plane internals (API Server, Controller Manager, Scheduler, and etcd).
- Proficiency in Go for building automation tools, Kubernetes Operators, or integration code.
- Deep experience with Infrastructure as Code and container infrastructure.
- Strong understanding of Linux system internals (kernel tuning, memory management) and networking (TCP/IP, CNI, Load Balancers, eBPF).
- Experience operating datastores (e.g., PostgreSQL, Redis) and messaging systems (e.g., Kafka, NATS) in scalable environments.
About the Company
Schwarz Digits creates the technological foundation for digital sovereignty in Europe. As the IT and digital division of the Schwarz Group, we develop and manage the IT infrastructures for the retail divisions Lidl and Kaufland, as well as Schwarz Production and PreZero. We operate as an independent provider in the external market, bundling core services in Cloud, Cyber Security, Data & AI, Communication, and Workspace.
Schwarz Digits · Heilbronn
