Senior Lead Site Reliability Engineer at JPMorgan Chase - ScoutJobs - The AI-curated global job board
JPMorgan Chase
Posted a day ago

Senior Lead Site Reliability Engineer

JPMorgan Chase: Senior Lead Site Reliability Engineer - AI/ML and Data Platforms

Requirements

5+ years of applied SRE experience; advanced understanding of SLI/SLO/SLA; experience with observability tools (Grafana, Prometheus, Splunk); knowledge of distributed systems and system design; experience with AI capabilities in reliability workflows

Skills

Python, AWS, Kubernetes, Terraform, Databricks, Spark

About the role

Responsibilities

  • Define non-functional requirements (NFRs) and availability targets for large-scale data platforms and AI/ML workloads.
  • Create and deliver high-quality designs, roadmaps, and program charters for distributed systems initiatives.
  • Implement observability and reliability designs to ensure robust, stable, and scalable analytics environments.
  • Lead the adoption of AI-assisted reliability workflows across the SDLC, including testing, validation, and production readiness.
  • Use enterprise-authorized AI capabilities to accelerate incident analysis and operational decision-making.
  • Mentor technologists and serve as a site reliability adoption champion within the engineering community.

Requirements

  • 5+ years of applied Site Reliability Engineering (SRE) experience.
  • Advanced understanding of SRE principles, including SLI, SLO, SLA, and error budgets.
  • Extensive experience with observability tools such as Grafana, Prometheus, Splunk, Dynatrace, or Datadog.
  • Demonstrated experience using AI capabilities to improve reliability engineering workflows.
  • Strong knowledge of distributed systems, system design, resiliency, and disaster recovery.
  • Ability to communicate complex data-based solutions and collaborate effectively across cross-functional teams.

Preferred Qualifications

  • Experience with AWS platforms and managed data platforms like Databricks.
  • Experience building and managing data pipelines using Spark or similar distributed compute frameworks.
  • Knowledge of containerization and orchestration tools such as Docker and Kubernetes.
  • Proficiency in Python or other programming languages for automation and platform development.
  • Experience with CI/CD pipelines, automation frameworks, and Infrastructure as Code (e.g., Terraform).

About the Company

JPMorgan Chase is a leading global financial institution providing innovative solutions to millions of consumers, small businesses, and prominent corporate and government clients. We leverage cutting-edge technology to drive excellence in investment banking, asset management, and consumer banking.


JPMorgan Chase · Jersey City
