
Posted 4 days ago
HPC Data Center Operational Lead
Jump Trading Group
Perks & benefits
Health Insurance, Medical Insurance, Paid Leave
Requirements
- 7+ years data center operations experience
- 3+ years team leadership in 24/7 environments
- Knowledge of power distribution and cooling technologies
- Proficiency in Linux systems
- Networking knowledge (L2/L3, BGP, OSPF)
- Hardware break-fix expertise
- Ability to travel regularly to data center sites
Skills
HPC, Linux, Python, Arista, Cisco
About the role
Responsibilities
- Lead and manage data center site teams across multiple HPC facilities, including recruiting, mentoring, and conducting performance reviews.
- Develop and enforce operational standards for power, cooling, cabling, and hardware lifecycles.
- Design and own preventative maintenance programs to minimize unplanned downtime.
- Serve as a subject matter authority on critical facility systems, including air and liquid cooling architectures.
- Own the end-to-end monitoring strategy and lead critical incident response and root cause analysis.
- Manage hardware break-fix functions for servers, GPUs, network equipment, and storage.
- Oversee inventory, spares management, capacity planning, and vendor relationships.
- Integrate AI tools into daily workflows to automate processes, analyze telemetry, and accelerate decision-making.
Requirements
- 7+ years of data center operations experience.
- 3+ years of team leadership experience in 24/7 critical infrastructure environments.
- In-depth knowledge of power distribution, redundancy architectures, and cooling technologies (air and liquid).
- Deep technical expertise in server hardware (GPUs, multi-socket platforms) and network switch hardware (Arista, Cisco).
- Strong proficiency in Linux systems, including OS-level troubleshooting and diagnostics.
- Solid understanding of networking protocols (L2/L3, BGP, OSPF, LACP, ECMP).
- Ability to travel regularly to various HPC data center sites.
- Demonstrated experience using AI tools in a professional setting.
Preferred Qualifications
- Experience in High-Performance Computing (HPC) environments.
- Programming or scripting experience, specifically with Python.
- Knowledge of industry standards such as ASHRAE and TIA-942.
- Bachelor's degree in a relevant technical field.
Benefits
- Discretionary bonus eligibility
- Medical, dental, and vision insurance
- HSA, FSA, and Dependent Care options
- Retirement plan with employer match
- Paid vacation and paid holidays
- Paid parental leave
- Wellness programs
About the Company
Jump Trading Group is a global technology and trading firm that empowers exceptional talent in Mathematics, Physics, and Computer Science. We design, develop, and deploy cutting-edge technologies that power some of the most demanding computational workloads in the industry.
Jump Trading Group · Chicago
