
Posted 4 hours ago
Senior AI Site Reliability Engineer
Oracle
Requirements
- 7+ years of software engineering or SRE experience.
- Experience with high-availability distributed systems.
- Proficiency in Python, Java, or Go.
- Experience with CI/CD, Terraform, and Kubernetes.
- Knowledge of observability tools like Prometheus and Grafana.
- Experience with data warehousing platforms.
Skills
Kubernetes, Terraform, Python, Docker, Prometheus, Grafana, AWS, OCI
About the role
Responsibilities
- Design, build, and operate highly reliable, scalable, and secure infrastructure for the Oracle Health Patient Portal.
- Advance AI-assisted reliability practices, including enhancing observability, alerting, and automated incident detection.
- Improve system reliability through automation, monitoring, and performance optimization.
- Partner with development teams to enhance service architecture, scalability, and operability.
- Participate in on-call rotations and perform root cause analysis to implement long-term fixes for complex production issues.
- Drive continuous improvement in DevOps/SRE practices, including CI/CD, Infrastructure as Code, and automation at scale.
Requirements
- 7+ years of software engineering, cloud infrastructure, SRE, or DevOps experience.
- Experience building and operating high-availability, fault-tolerant distributed systems.
- Proficiency in Python, Java, or Go.
- Hands-on experience with Terraform, Docker, and Kubernetes.
- Experience with CI/CD pipelines and observability tools such as Prometheus and Grafana.
- Proficiency with data warehousing platforms (e.g., Vertica, Snowflake) and ETL frameworks.
- Strong troubleshooting skills and experience resolving complex production issues in distributed environments.
Preferred Qualifications
- Experience in healthcare or regulated environments (HIPAA, compliance frameworks).
- Experience building self-healing or autonomous infrastructure systems.
- Experience working in environments requiring security clearance.
Benefits
- Medical, dental, and vision insurance.
- 401(k) Savings and Investment Plan with company match.
- Flexible Vacation and paid time off.
- Paid parental leave and adoption assistance.
- Employee Stock Purchase Plan.
About the Company
Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. With AI embedded across our products and services, we help customers turn that promise into a better future for all.
Oracle · United States
