
Posted 4 days ago
Site Reliability Engineer
PhonePe
Perks & benefits
Accommodation, Medical Insurance, Mobile Allowance, Paid Leave, Relocation Allowance
Requirements
4-8 years of experience, Microsoft Azure expertise, Linux/Ubuntu administration, Terraform and SaltStack, Python, Go, or Java, Prometheus and Grafana, MySQL and Aerospike, Networking (BGP, IPsec, ExpressRoute)
Skills
Azure, Terraform, Python, Linux, Kubernetes, Docker, Prometheus, Ansible
About the role
Responsibilities
- Manage, scale, and ensure high availability of core infrastructure within a high-volume Azure environment.
- Configure and maintain Ubuntu virtual machines, Azure Storage, and networking components including Azure Firewall, route tables, and ExpressRoute.
- Drive automation for all business-as-usual (BAU) tasks using Terraform, SaltStack, and Ansible.
- Set up and manage high-availability databases such as MySQL and Aerospike, including cross-region replication and migrations.
- Implement and manage monitoring and observability solutions using Prometheus, VictoriaMetrics, and Grafana, alongside centralized logging with Loki.
- Lead incident response, conduct Root Cause Analysis (RCA), and participate in an on-call rotation.
- Conduct proactive capacity planning and manage critical components like Nginx, HAProxy, Docker, and RabbitMQ.
Requirements
- 4-8 years of experience in Site Reliability Engineering or similar infrastructure roles.
- Deep expertise in Microsoft Azure services and complex Azure networking (BGP, IPsec, ExpressRoute).
- Expert proficiency in Linux administration, specifically in Ubuntu environments.
- Strong programming skills in at least one high-level language: Python, Go, or Java.
- Mastery of Shell scripting (Bash) for operational automation.
- Hands-on experience with Infrastructure as Code (Terraform) and configuration management (SaltStack or Ansible).
- Proven experience managing high-availability databases (MySQL, Aerospike) and monitoring stacks (Prometheus, Grafana).
Benefits
- Comprehensive insurance coverage including Medical, Critical Illness, Accidental, and Life Insurance.
- Wellness programs including an Employee Assistance Program and onsite medical center.
- Parental support including maternity, paternity, and adoption assistance.
- Retirement benefits including PF contributions and NPS.
- Additional perks such as higher education assistance and car lease options.
About the Company
PhonePe is a leading digital payments platform in India, serving over 600 million registered users and 40 million merchants. The company processes over 330 million transactions daily and is expanding its portfolio into financial products like insurance, lending, and wealth management, as well as new consumer tech businesses.
PhonePe · Bangalore
