
Posted 4 days ago
Senior Staff Software Engineer (Machine Learning Platform)
Tekion
Requirements
- 12–15+ years building large-scale data/ML or platform systems
- Proficiency in Python and at least one of Java, Scala, or Go
- Experience with MLOps pipelines (Airflow, Kubeflow, MLflow)
- Cloud expertise in AWS and container orchestration (Docker, Kubernetes)
- Knowledge of LLM gateways, agentic systems, and orchestration
- Experience with graph databases (Neo4j, Neptune) and vector search
Skills
Python, AWS, Kubernetes, MLOps, LLM, Docker, Kafka
About the role
Responsibilities
- Build and operate the LLM control plane and gateway, including smart routing, rate limiting, failover, and cost tracking.
- Develop unified APIs and SDKs (REST/gRPC) with normalized schemas, structured outputs, and full observability.
- Design and manage agent orchestration patterns, including tool registries, function calling, and long-running workflows.
- Implement safety and privacy guardrails such as content filtering, prompt validation, and PII redaction.
- Build and scale MLOps pipelines for both classical ML and deep learning models, including experiment tracking and deployment.
- Evolve the domain graph and retrieval systems using hybrid search (graph, vector, and keyword) to serve real-time context to agents.
- Define and maintain SLOs for latency, uptime, and cost, while enabling autoscaling and spend controls.
- Provide developer tools, templates, and documentation to enable product teams to ship AI features rapidly.
Requirements
- 12–15+ years of experience building large-scale data, machine learning, or platform systems.
- Proficiency in Python and at least one of Java, Scala, or Go.
- Extensive experience with MLOps pipelines (e.g., Airflow, Kubeflow, MLflow) and CI/CD for models.
- Strong expertise in cloud infrastructure (AWS preferred) and container orchestration (Docker, Kubernetes).
- Hands-on experience with LLM gateways, agentic orchestration, and safety guardrails.
- Practical knowledge of graph databases (e.g., Neo4j, Neptune) and vector search technologies.
- Deep understanding of distributed systems, microservices, and API design.
Preferred Qualifications
- Experience building or operating multi-tenant SaaS platforms with strict SLAs.
- Knowledge of online feature computation using Spark, Flink, or Kafka.
- Experience with hybrid retrieval patterns and large-scale entity resolution.
- A "platform-as-a-product" mindset with a focus on developer experience and observability.
About the Company
Tekion is disrupting the automotive industry with the first cloud-native automotive platform. By connecting the entire ecosystem—OEMs, retailers, and consumers—through a seamless, AI-driven platform, Tekion is enabling the best automotive retail experiences ever. We employ close to 3,000 people across North America, Asia, and Europe.
Tekion · Bengaluru
