Principal Site Reliability Engineer at AIFT - ScoutJobs - The AI-curated global job board
AIFT
Posted 3 days ago

Principal Site Reliability Engineer

Requirements

8+ years software/network/systems engineering, 6+ years large scale cloud services, 2+ years SRE leadership, Business level English fluency, Infrastructure planning and optimization, Budget and OKR planning, Monitoring solutions (Prometheus, Grafana, ELK), SDLC experience, Network security knowledge, GitLab CI/CD implementation, Ansible and Terraform proficiency, Kubernetes and Docker knowledge

Skills

Kubernetes, Terraform, Python, Go, Ansible, Prometheus, Docker, CI/CD

About the role

Responsibilities

  • Lead the development, construction, and management of reliable, distributed systems and large-scale cloud services.
  • Plan infrastructure upgrades and optimizations to support business operations.
  • Manage cloud budgets and ensure expenses remain within allocated limits.
  • Drive OKR planning to ensure technical key results align with company objectives.
  • Implement and maintain robust monitoring solutions and CI/CD processes.
  • Oversee the complete software development life cycle (SDLC) from a reliability perspective.

Requirements

  • 8+ years of technical experience in software engineering, network engineering, or systems administration.
  • 6+ years of experience operating large-scale cloud services.
  • 2+ years of experience in an SRE team leadership role.
  • Proficiency in programming languages such as Bash, Python, or Go.
  • Hands-on experience with Ansible, Terraform, Kubernetes, and Docker.
  • Advanced knowledge of monitoring tools, including Prometheus, Grafana, and the ELK stack.
  • Experience implementing GitLab CI/CD and using Git version control.
  • Strong understanding of network security and infrastructure automation.
  • Business-level fluency in English.

Preferred Qualifications

  • Experience with AI pair programming tools, such as those from OpenAI.
  • Proven ability to work effectively in a team-oriented environment with strong interpersonal skills.

About the Company

AIFT is dedicated to building cutting-edge technology solutions, focusing on high-scale, reliable, and distributed systems to drive business innovation.

AIFT · Taipei