GPU/ML Systems Engineer at Aivar Innovations Private Limited - ScoutJobs - The AI-curated global job board
Aivar Innovations Private Limited
Posted 16 hours ago

GPU/ML Systems Engineer

Requirements

3–7 years of experience, hands-on GPU optimization, vLLM or Triton Inference Server, model quantization (INT8, FP16, GPTQ, AWQ), CUDA ecosystem, AWS GPU instances, performance profiling

Skills

GPU, Machine Learning, CUDA

About the role

Responsibilities

  • Deploy and tune vLLM with multi-GPU tensor parallelism, dynamic batching, and KV cache optimization for LLMs
  • Configure NVIDIA Triton for production multi-model serving with custom backends and model ensembles
  • Build TensorRT-LLM optimized model binaries for maximum throughput on L40S, A100, and H100 GPUs
  • Implement AWS Inferentia deployments using Neuron SDK, including model compilation and performance tuning
  • Execute model quantization (INT8, FP16, GPTQ, AWQ) with rigorous analysis of the quality and accuracy tradeoffs
  • Run comprehensive load testing using Locust to map performance cliffs and scaling thresholds
  • Produce detailed benchmark reports with instance selection and cost-per-token recommendations

Requirements

  • 3–7 years of experience with GPU-accelerated ML workloads in production
  • Hands-on experience with LLM serving frameworks such as vLLM, TensorRT-LLM, or Triton Inference Server
  • Deep understanding of GPU architecture, including memory hierarchy, tensor cores, NVLink, and NCCL
  • Proficiency in model quantization techniques (INT8, FP16, GPTQ, AWQ)
  • Strong knowledge of the CUDA ecosystem (drivers, cuDNN, NVIDIA container toolkit)
  • Experience with performance profiling tools like Nsight, nvidia-smi, or DCGM
  • Practical experience managing AWS GPU instances (G-series, P-series)

Preferred Qualifications

  • Experience optimizing models for custom accelerators like AWS Inferentia or Trainium
  • Familiarity with KServe and Prometheus + DCGM Exporter for monitoring

Benefits

  • Learn from experts, including former AWS leaders and AI pioneers
  • Direct ownership of high-impact "greenfield" projects from concept to launch
  • Access to modern tech stacks, including the latest Generative AI frameworks
  • Opportunity for rapid career growth in a fast-paced environment

About the Company

Aivar Innovations is an AI-first technology partner where cutting-edge technology meets industry expertise. We provide AI-augmented teams that accelerate development, reduce time-to-market, and deliver exceptional code quality for major global enterprises.

Aivar Innovations Private Limited · Bangalore
