Site Reliability Engineer 2 - BigData at PhonePe - ScoutJobs - The AI-curated global job board
PhonePe
Posted 10 hours ago

Site Reliability Engineer 2 - BigData

Perks & benefits

Medical Insurance, Accommodation, Mobile Allowance, Relocation Allowance

Requirements

4+ years of experience in distributed big data ecosystems; expertise in Linux, IP, iptables, and IPsec; proficiency in Perl, Golang, or Python; hands-on experience with the Hadoop stack (HDFS, HBase, Airflow, YARN, Ranger, Kafka, Pinot); experience with configuration tools such as Puppet, Salt, Chef, or Ansible; knowledge of DevOps tools (SaltStack, Ansible, Docker, Git); experience with monitoring tools (ELK, Grafana, Prometheus, OpenTelemetry)

Skills

Linux, Python, Hadoop, Kafka, Docker, Ansible, Prometheus

About the role

Responsibilities

  • Manage, maintain, and support complex distributed big data ecosystems and Linux/Unix environments.
  • Design and implement automation systems for provisioning, scaling, upgrading, and patching clusters.
  • Lead on-call rotations and incident response, conducting root cause analysis and driving postmortem processes.
  • Troubleshoot and resolve complex production issues while identifying mitigation strategies.
  • Design and review scalable, reliable system architectures to ensure high availability and performance.
  • Develop tools and scripts to automate operational processes and increase system resilience.
  • Collaborate with development teams to integrate SRE best practices into the software development lifecycle.
  • Monitor system performance and resource usage to identify bottlenecks and implement performance tuning.

Requirements

  • 4+ years of experience managing and maintaining distributed big data ecosystems.
  • Strong expertise in Linux, including IP, iptables, and IPsec.
  • Proficiency in scripting or programming with Perl, Golang, or Python.
  • Hands-on experience with the Hadoop stack (HDFS, HBase, Airflow, YARN, Ranger, Kafka, Pinot).
  • Experience with configuration management tools such as Puppet, Salt, Chef, or Ansible.
  • Proficiency with DevOps tools including SaltStack, Ansible, Docker, and Git.
  • Experience with monitoring and logging tools such as ELK, Grafana, Prometheus, and OpenTelemetry.

Preferred Qualifications

  • Experience managing infrastructure on public cloud platforms (AWS, Azure, or GCP).
  • Experience with petabyte-scale data migrations and large-scale upgrades.
  • Proven experience in designing and reviewing system architectures for extreme scalability.

Benefits

  • Comprehensive insurance including Medical, Critical Illness, Accidental, and Life Insurance.
  • Wellness programs and onsite medical support.
  • Parental support including maternity, paternity, and adoption assistance.
  • Retirement benefits including PF contributions and NPS.
  • Additional perks such as higher education assistance and car lease programs.

About the Company

PhonePe is a leading digital payments platform in India, serving over 600 million registered users and 40 million merchants. We process over 330 million transactions daily and are committed to unlocking the flow of money and access to services for every Indian through a diverse portfolio of financial and consumer tech products.


PhonePe · Bangalore
