Site Reliability Engineer at ID.me
ID.me
Posted 4 days ago

Site Reliability Engineer

Requirements

Bachelor's degree in Computer Science or related field, 3-5 years SRE or DevOps experience, 2+ years cloud management (AWS, GCP, or Azure), 1+ years proficiency in Java, Go, Python, Ruby, or JavaScript

Skills

AWS, Kubernetes, Terraform, Python, Go, Prometheus, Grafana

About the role

Responsibilities

  • Build and maintain automated reliability tooling, infrastructure as code, and observability systems to enhance uptime and performance.
  • Develop monitoring, logging, and alerting frameworks using tools like Prometheus, Grafana, and OpenTelemetry.
  • Implement automated architectural reviews and reliability guardrails for agent-developed applications.
  • Partner with engineering teams to design scalable, fault-tolerant systems that meet defined SLIs and SLOs.
  • Automate repetitive operational tasks and develop self-healing/auto-remediation mechanisms.
  • Participate in on-call rotations, lead incident response efforts, and conduct post-incident reviews.
  • Improve deployment and release processes using CI/CD pipelines and progressive delivery techniques.
  • Collaborate with Security and Compliance teams to ensure systems meet FedRAMP, NIST, and internal policy requirements.

Requirements

  • Bachelor’s degree in Computer Science, Software Engineering, or a related technical field.
  • 3-5 years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering.
  • 2+ years of hands-on experience managing and scaling services in cloud environments (AWS, GCP, or Azure).
  • 1+ years of experience working proficiently in at least one modern programming language (Java, Go, Python, Ruby, or JavaScript).

Preferred Qualifications

  • Strong understanding of containerization and orchestration technologies such as Docker and Kubernetes.
  • Experience implementing and maintaining CI/CD pipelines and automation frameworks.
  • Working knowledge of observability systems including metrics, tracing, logging, and alerting.
  • Experience building automated recovery, failover, or chaos-engineering systems.
  • Exposure to infrastructure-as-code tools like Terraform, Pulumi, or Ansible and GitOps practices.
  • Understanding of security and compliance frameworks such as FedRAMP, SOC2, or NIST 800-53.
  • Experience using AI agentic coding assistants or deploying custom AI agents into production.

About the Company

ID.me is the next-generation digital identity wallet that simplifies how individuals securely prove their identity online. With over 152 million users, ID.me provides streamlined identity verification for 20 federal agencies, 45 state government agencies, and over 70 healthcare organizations. We are committed to the mission of "No Identity Left Behind," ensuring everyone has access to a secure digital identity.

ID.me · Mountain View
