
Posted a day ago
Software Engineer, Fleet Hardware Health
OpenAI
Skills
Python, Go, Linux, SQL, Prometheus, Grafana
About the role
Responsibilities
- Build and maintain automation systems for provisioning and managing server fleets
- Develop tools to monitor server health, performance, and lifecycle events
- Collaborate with clusters, networking, and infrastructure teams to ensure high availability
- Partner with external operators to maintain high quality standards
- Identify and resolve performance bottlenecks and inefficiencies
- Continuously improve automation to reduce manual operational work
Requirements
- Experience managing large-scale server environments
- Proficiency in Python, Go, or similar programming languages
- Strong knowledge of Linux, networking, and server hardware
- Ability to perform data analysis using SQL, PromQL, and Pandas
Preferred Qualifications
- Experience with low-level hardware details (PCIe, InfiniBand, power management, kernel performance tuning)
- Knowledge of hardware management protocols such as IPMI or Redfish
- Experience with High-Performance Computing (HPC) or distributed systems
- Familiarity with monitoring tools like Prometheus and Grafana
Benefits
- Competitive salary range of $230K – $490K plus equity
- Comprehensive medical, dental, and vision insurance
- 401(k) retirement plan with employer match
- Flexible PTO and paid parental leave
- Daily meals in the office and mental health support
- Annual learning and development stipend
About the Company
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of AI capabilities and seek to safely deploy them to the world through products like ChatGPT.
OpenAI · San Francisco
