
Posted 20 hours ago
Inference Optimization Intern – Performance Modeling
Institute of Foundation Models
Requirements
Pursuing a degree in CS, CE, EE, AI, or HPC; CUDA programming; knowledge of NVIDIA GPU architecture; Nsight Systems or Nsight Compute; PTX or SASS; proficiency in C++, CUDA, and Python
Skills
CUDA, Python, C++, PyTorch
About the role
Responsibilities
- Develop analytical performance models for GPU kernels and inference workloads
- Build and validate a simulator to estimate theoretical hardware performance limits
- Identify performance bottlenecks in compute, memory, communication, and scheduling
- Analyze GPU execution using NVIDIA Nsight Systems and Nsight Compute
- Investigate PTX and SASS code generation to understand low-level execution behavior
- Collaborate with researchers to optimize inference kernels for transformer-based models
- Design profiling methodologies for Hopper and Blackwell architectures
Requirements
- Currently pursuing a degree in Computer Science, Computer Engineering, Electrical Engineering, AI, HPC, or a related discipline
- Proficiency in C++, CUDA, and Python
- Knowledge of NVIDIA GPU architecture and memory hierarchy
- Experience with CUDA programming and GPU kernel development
- Familiarity with performance profiling tools like Nsight Systems or Nsight Compute
- Understanding of PTX or SASS
Preferred Qualifications
- Experience optimizing CUDA kernels for throughput and latency
- Understanding of roofline analysis and hardware utilization metrics
- Experience with deep learning frameworks such as PyTorch or TensorFlow
- Strong analytical and debugging abilities
About the Company
The Institute of Foundation Models is dedicated to advancing the science and engineering of large-scale AI systems. Our researchers and engineers develop cutting-edge foundation models while pushing the limits of high-performance computing and efficient AI inference.
Institute of Foundation Models · Sunnyvale
