Inference Optimization Intern – Performance Modeling at Institute of Foundation Models
Institute of Foundation Models
Posted 20 hours ago

Inference Optimization Intern – Performance Modeling

Requirements

Pursuing a degree in CS, CE, EE, AI, or HPC; CUDA programming experience; knowledge of NVIDIA GPU architecture; familiarity with Nsight Systems or Nsight Compute; understanding of PTX or SASS; proficiency in C++, CUDA, and Python

Skills

CUDA, Python, C++, PyTorch

About the role

Responsibilities

  • Develop analytical performance models for GPU kernels and inference workloads
  • Build and validate a simulator to estimate theoretical hardware performance limits
  • Identify performance bottlenecks in compute, memory, communication, and scheduling
  • Analyze GPU execution using NVIDIA Nsight Systems and Nsight Compute
  • Investigate PTX and SASS code generation to understand low-level execution behavior
  • Collaborate with researchers to optimize inference kernels for transformer-based models
  • Design profiling methodologies for Hopper and Blackwell architectures

Requirements

  • Currently pursuing a degree in Computer Science, Computer Engineering, Electrical Engineering, AI, HPC, or a related discipline
  • Proficiency in C++, CUDA, and Python
  • Knowledge of NVIDIA GPU architecture and memory hierarchy
  • Experience with CUDA programming and GPU kernel development
  • Familiarity with performance profiling tools such as Nsight Systems or Nsight Compute
  • Understanding of PTX or SASS

Preferred Qualifications

  • Experience optimizing CUDA kernels for throughput and latency
  • Understanding of roofline analysis and hardware utilization metrics
  • Experience with deep learning frameworks such as PyTorch or TensorFlow
  • Strong analytical and debugging abilities

About the Company

The Institute of Foundation Models is dedicated to advancing the science and engineering of large-scale AI systems. Our researchers and engineers develop cutting-edge foundation models while pushing the limits of high-performance computing and efficient AI inference.

Institute of Foundation Models · Sunnyvale