Posted 10 hours ago
HPC Engineer
MBZUAI
Requirements
Bachelor's degree in CS or related field, Linux administration, Python, Bash, Go, or C/C++, Networking fundamentals, Cloud platforms (Azure, AWS, GCP), Containers (Docker, Apptainer, Enroot), Git
Skills
Linux, Python, Docker, Bash, HPC
About the role
Responsibilities
- Support the operation and maintenance of large-scale GPU computing clusters
- Assist researchers with job submission, troubleshooting, and resource utilization
- Monitor cluster health, performance, and availability
- Troubleshoot Linux, hardware, storage, networking, and software issues
- Support Slurm administration and user management
- Assist with cluster deployment, upgrades, and validation
- Develop scripts and automation tools to improve efficiency
- Maintain technical documentation and operational procedures
- Participate in incident response and operational support
- Collaborate with researchers, vendors, and internal teams
Requirements
- Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, Software Engineering, IT, Mathematics, Physics, or a related discipline
- Experience with Linux administration
- Proficiency in Python, Bash, Go, or C/C++
- Understanding of networking fundamentals
- Experience with Git and software development workflows
Preferred Qualifications
- Experience with cloud platforms (Azure, AWS, GCP)
- Proficiency with containers (Docker, Apptainer, Enroot)
- Exposure to AI/ML infrastructure
- Experience in HPC, distributed systems, or research computing environments
About the Company
The Institute for Foundation Models (IFM) at MBZUAI operates some of the world’s largest AI supercomputing environments, supporting frontier AI research and foundation model development across thousands of GPUs.
MBZUAI · United Arab Emirates
