
Posted 14 hours ago
ML Infrastructure/MLOps Engineer
Aivar Innovations Private Limited
Perks & benefits
Education Allowance, Health Insurance, Relocation Allowance
Requirements
3-7 years of MLOps/ML platform experience; Kubernetes (EKS preferred); Ray/KubeRay, Kubeflow, or MLflow; distributed training (PyTorch DDP, Horovod, or DeepSpeed); model serving (KServe, Seldon, or FastAPI); strong Python engineering
Skills
Kubernetes, MLOps, Python, Ray, Kubeflow, MLflow, AWS, PyTorch
About the role
Responsibilities
- Deploy and optimize Ray + KubeRay for distributed data processing and model training across GPU clusters
- Build Kubeflow Pipelines for reproducible ML workflows, including data prep, training, evaluation, and deployment
- Configure MLflow for centralized experiment tracking and model registry
- Implement advanced job scheduling with tools such as Kueue or Volcano for queue management and job prioritization
- Build model CI/CD pipelines for automated training, validation, and canary/blue-green deployment
- Create self-service tooling for data scientists, including cluster provisioning and GPU allocation
- Monitor ML workload performance, focusing on GPU utilization and training throughput
Requirements
- 3–7 years of experience in ML infrastructure, MLOps, or ML platform engineering
- Strong expertise in Kubernetes (EKS preferred), including deployments, PVs, RBAC, and resource management
- Proficiency in at least two of the following: Ray/KubeRay, Kubeflow, MLflow, Airflow, or Argo Workflows
- Experience with distributed training frameworks such as PyTorch DDP, Horovod, DeepSpeed, or Ray Train
- Experience with model serving tools like KServe, Seldon, or custom FastAPI serving
- Strong Python engineering skills focused on automation and tooling rather than just notebooks
- Experience with GPU scheduling and resource management on Kubernetes
Benefits
- Learn from experts, including former AWS leaders and AI pioneers
- Direct ownership of high-impact "greenfield" projects from concept to global launch
- Work with modern technology, including the latest Generative AI frameworks and cloud-native architectures
- Build mission-critical systems used by major global enterprises
- Rapid career growth opportunities in a fast-paced environment
About the Company
Aivar Innovations is an AI-first technology partner where cutting-edge technology meets industry expertise. We provide AI-augmented teams that accelerate development, reduce time-to-market, and deliver exceptional code quality for global enterprises.
Aivar Innovations Private Limited · Bengaluru
