
Posted a day ago
Software Engineer, Inference - Multi Modal
OpenAI
Requirements
- Experience building and scaling inference systems for LLMs.
- Experience with GPU-based ML workloads.
- Familiarity with vLLM or TensorRT-LLM.
- Knowledge of distributed compute and networking.
Skills
Python, GPU, LLM, vLLM
About the role
Responsibilities
- Design and implement inference infrastructure for large-scale multimodal models.
- Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs.
- Enable experimental research workflows to transition into reliable production services.
- Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities.
- Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers.
Requirements
- Experience building and scaling inference systems for LLMs or multimodal models.
- Experience with GPU-based ML workloads and an understanding of the performance dynamics of large models.
- Familiarity with inference tooling such as vLLM, TensorRT-LLM, or custom model parallel systems.
- Knowledge of distributed compute, networking, and high-throughput data handling.
- Ability to own problems end-to-end in ambiguous, fast-moving environments.
Preferred Qualifications
- Experience working with image generation or audio synthesis models in production.
- Exposure to distributed ML training or system-efficient model design.
Benefits
- Competitive salary ($295K – $555K) and generous equity.
- Comprehensive medical, dental, and vision insurance.
- 401(k) retirement plan with employer match.
- Paid parental leave and flexible PTO.
- Daily meals in the office and mental health support.
- Annual learning and development stipend.
About the Company
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of AI capabilities and seek to safely deploy them to the world through our products.
OpenAI · San Francisco
