Senior Software Development Engineer – LLM Inference Framework at AMD
AMD
Posted 3 days ago

Skills

Python, C/C++, PyTorch, LLM, GPU

About the role

Responsibilities

  • Architect and optimize distributed LLM inference runtimes based on in-house engines or open-source stacks like vLLM and SGLang
  • Design and improve hybrid execution strategies, including tensor parallelism (TP), pipeline parallelism (PP), and expert parallelism (EP) for MoE models, as well as KV-cache management and token scheduling
  • Implement multi-node inference pipelines using RCCL, RDMA, and collective-based execution
  • Improve throughput, latency, and memory efficiency across single-GPU systems and multi-GPU clusters
  • Optimize continuous batching, speculative decoding, and KV-cache paging
  • Collaborate with AMD GPU library and compiler teams to ensure efficient use of FP8/FP4 GEMM and FlashAttention
  • Upstream features and performance fixes into major open-source inference frameworks

Requirements

  • Master's or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field
  • Proficiency in Python and C/C++
  • Hands-on experience with vLLM, SGLang, or similar inference stacks
  • Deep understanding of GPU architectures and distributed inference scaling
  • Experience with high-performance computing and large-scale workloads on heterogeneous GPU clusters

Preferred Qualifications

  • Experience contributing to upstream open-source projects
  • Strong background in kernel development and GPU runtime integration
  • Expertise in integrating GPU performance optimizations into PyTorch or TensorFlow
  • Understanding of compiler systems including LLVM, Triton, and ROCm

About the Company

AMD is a global leader in high-performance computing, graphics, and visual computing technologies. Our mission is to build products that accelerate next-generation computing experiences—from AI and data centers to PCs and gaming systems. We strive to push the limits of innovation to solve the world's most important challenges.

AMD · Santa Clara
