About the role

Responsibilities

Design, deploy, and operate reliable and scalable systems across cloud and Kubernetes environments
Automate infrastructure provisioning, deployments, and operational workflows
Build and maintain tools for deployment, monitoring, and system operations
Monitor system health and performance to proactively identify areas for improvement
Troubleshoot and resolve issues across development, test, and production environments
Participate in incident response, root cause analysis, and long-term reliability improvements
Collaborate with engineering teams to improve system operability and deployment safety
Support and operate large-scale systems, including data-intensive or AI-driven workloads

Requirements

3–5+ years of experience managing production infrastructure in cloud environments (AWS, Azure, or GCP)
Strong hands-on experience with Linux systems in production environments
Experience with containerized workloads and Kubernetes
Knowledge of Infrastructure as Code (Terraform, Terragrunt, or Crossplane)
Experience designing and maintaining CI/CD pipelines
Familiarity with GitOps principles and tools such as Argo CD or Flux
Understanding of cloud networking, load balancing, and service connectivity
Experience with monitoring and logging tools (Prometheus, Grafana, ELK/EFK)
Proficiency in Bash or Python
Experience with relational databases
Understanding of SLIs, SLOs, and error budgets
Ability to participate in on-call rotations and perform root cause analysis

Preferred Qualifications

Experience operating systems at scale or in high-availability environments
Exposure to on-prem or hybrid infrastructure
Experience supporting data platforms, analytics, or AI/ML workloads

About the Company

Signzy is an AI-powered RPA platform for financial services. We turn complex back-office decision-making processes into real-time APIs using our no-code AI model builder and a marketplace of more than 200 APIs. We work with over 90 financial institutions globally, including major banks in India and the US, and have offices in New York, Dubai, and Bangalore.

Site Reliability Engineer
