
Posted 18 days ago
Principal Software Engineer
DataRobot
Perks & benefits
Family Medical Insurance, Health Insurance, Medical Insurance, Paid Leave
Requirements
10+ years engineering experience, 5+ years infrastructure or platform experience, Deep Kubernetes expertise, Proficiency in Python or Go, Experience with AWS, GCP, or Azure, Experience with Helm and CI/CD, Experience with IaC (Terraform, Pulumi)
Skills
Python, Go, Kubernetes, AWS, Terraform, LLM
About the role
Responsibilities
- Lead the technical vision and design of high-performance inference engines for agentic infrastructure and LLM serving systems
- Design, develop, and optimize the inference engine to ensure large language model (LLM) serving is fast, scalable, and efficient
- Collaborate with partners like NVIDIA to integrate new model architectures such as sparsity and mixture-of-experts
- Optimize for latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators
- Build and maintain instrumentation, profiling, and tracing tooling to guide optimizations
- Develop scalable routing, batching, scheduling, and memory management mechanisms for inference workloads
- Mentor engineers, shape architecture, and influence cross-team roadmaps
Requirements
- 10+ years of engineering experience, including at least 5 years in infrastructure, platform, or backend systems
- Deep expertise in Kubernetes internals, including networking, scheduling, scaling, and controller patterns
- Strong proficiency in Python or Go
- Experience operating across multiple cloud providers (AWS, GCP, or Azure)
- Experience with Helm, container orchestration patterns, and CI/CD automation
- Experience with Infrastructure as Code (Terraform, Pulumi) and GitOps workflows
- Proven ability to design and build complex systems from scratch
Preferred Qualifications
- Familiarity with Cilium, Kyverno, KEDA, Gateway API, or OPA
- Experience building and running multi-tenant SaaS platforms
- Experience with GPU infrastructure for training and inference
- Experience with performance tuning for large-scale data or compute workloads
Benefits
- Medical, Dental, and Vision Insurance
- Flexible Time Off Program
- Paid Holidays and Paid Parental Leave
- Global Employee Assistance Program (EAP)
About the Company
DataRobot delivers AI that maximizes impact and minimizes business risk. Our platform and applications integrate into core business processes so teams can develop, deliver, and govern AI at scale. Organizations worldwide rely on DataRobot for AI that makes sense for their business — today and in the future.
DataRobot · United States
