
Posted 3 days ago
Senior Software Development Engineer – LLM Inference Framework
AMD
Requirements
Master's or PhD in Computer Science or related field, Experience with vLLM or SGLang, Proficiency in Python and C/C++, Knowledge of distributed inference scaling, Experience with GPU architectures
Skills
Python, C/C++, PyTorch, LLM, GPU
About the role
Responsibilities
- Architect and optimize distributed LLM inference runtimes based on in-house engines or open-source stacks like vLLM and SGLang
- Design and improve hybrid execution strategies, including tensor, pipeline, and expert parallelism (TP, PP, and EP for MoE models), as well as KV-cache management and token scheduling
- Implement multi-node inference pipelines using RCCL, RDMA, and collective-based execution
- Drive throughput, latency, and memory efficiency across single-GPU and multi-GPU clusters
- Optimize continuous batching, speculative decoding, and KV-cache paging
- Collaborate with AMD GPU library and compiler teams to ensure efficient use of FP8/FP4 GEMM and FlashAttention
- Upstream features and performance fixes into major open-source inference frameworks
Requirements
- Master's or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field
- Proficiency in Python and C/C++
- Hands-on experience with vLLM, SGLang, or similar inference stacks
- Deep understanding of GPU architectures and distributed inference scaling
- Experience with high-performance computing and large-scale workloads on heterogeneous GPU clusters
Preferred Qualifications
- Experience contributing to upstream open-source projects
- Strong background in kernel development and GPU runtime integration
- Expertise in integrating optimized GPU performance into PyTorch or TensorFlow
- Understanding of compiler and runtime stacks including LLVM, Triton, and ROCm
About the Company
AMD is a global leader in high-performance computing, graphics, and visual computing technologies. Our mission is to build products that accelerate next-generation computing experiences—from AI and data centers to PCs and gaming systems. We strive to push the limits of innovation to solve the world's most important challenges.
AMD · Santa Clara
