Senior Site Reliability Engineer at Onebrief - ScoutJobs - The AI-curated global job board
Posted 4 days ago

Senior Site Reliability Engineer

Onebrief

Requirements

Active Top Secret clearance; 5+ years in Platform, DevOps, or SRE; Terraform or CloudFormation; Ansible; Kubernetes; GitLab CI/CD, Jenkins, or GitHub Actions; Python, Go, or Bash; AWS or AWS GovCloud; Grafana, ELK, or Datadog

Skills

Kubernetes, Terraform, AWS, Python, Ansible, Go

About the role

Responsibilities

  • Own the reliability, scalability, and security of the production application and platform across AWS and on-premise DoD environments.
  • Design, implement, and manage a world-class observability platform using tools like Prometheus, Loki, and Grafana.
  • Define and measure Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to increase system trust.
  • Lead incident response and act as incident commander during critical events, conducting blameless post-mortems and After Action Reviews (AARs).
  • Automate infrastructure using Terraform and Ansible, embedding security and compliance controls (RMF, STIGs) directly into the automation.
  • Proactively identify and eliminate operational toil through advanced automation and improved runbooks.

Requirements

  • Active Top Secret clearance (with the ability to obtain SCI eligibility).
  • 5+ years of experience in Platform, DevOps, or Site Reliability Engineering.
  • Proficiency with Infrastructure as Code tools such as Terraform or CloudFormation, as well as Ansible.
  • Hands-on experience with Kubernetes design, deployment, and operations.
  • Experience building and maintaining CI/CD pipelines (GitLab CI/CD, Jenkins, or GitHub Actions).
  • Proficiency in at least one programming or scripting language: Python, Go, or Bash.
  • Familiarity with AWS or AWS GovCloud.
  • Experience with observability stacks such as Grafana, ELK, or Datadog.
  • Strong understanding of networking fundamentals and secure configurations.

Preferred Qualifications

  • Experience working in DoD environments and familiarity with compliance frameworks (RMF, STIGs, ICD 503).
  • Experience with GitOps practices and service mesh technologies like Istio or Linkerd.
  • Familiarity with on-prem virtualization (VMware, Proxmox, Nutanix, or Hyper-V).
  • Relevant certifications such as AWS DevOps Engineer, CKA/CKAD, or Security+.

Benefits

  • Compensation: $180K – $220K plus equity.
  • Comprehensive health, dental, vision, and life insurance.
  • 401(k) plan with company match.
  • Unlimited PTO and 8 weeks of fully paid parental leave.
  • Annual company summit retreats and a $1,000 annual home office budget.

About the Company

Onebrief builds collaboration and AI-powered workflow software designed specifically for military staffs. We transform military planning to make staffs faster, smarter, and more efficient. Founded in 2019, Onebrief is a high-growth organization valued at $2.15B, backed by top-tier investors including Battery Ventures and General Catalyst.

Onebrief · Arlington
