
Posted a day ago
Lead/Staff Data Engineer - Data Platform
Apna
Requirements
- 5-7 years of data engineering experience
- Hands-on experience with Apache Airflow
- Strong knowledge of Presto or Trino
- Deep understanding of Apache Hudi concepts
- Strong SQL skills
- Proficiency in Python, Java, or Scala
- Experience with distributed data processing
Skills
Trino, Airflow, Python, SQL
About the role
Responsibilities
- Build and scale batch and near-real-time data pipelines for product, business, and ML use cases
- Design and improve lakehouse architecture using technologies like Apache Hudi
- Manage large-scale analytical workloads using query engines such as Presto or Trino
- Build and maintain orchestration workflows using Apache Airflow
- Create reusable data models, curated datasets, and reliable data marts
- Improve platform reliability, observability, SLA tracking, lineage, and data quality
- Optimize storage, compute, query performance, and pipeline costs
- Partner with product, analytics, and ML teams to convert data needs into scalable solutions
- Mentor data engineers and drive engineering standards for data modeling and schema evolution
Requirements
- 5-7 years of experience in data engineering, preferably at scale
- Hands-on experience with Apache Airflow or similar orchestration systems
- Strong knowledge of Presto, Trino, or other distributed query engines
- Deep understanding of Apache Hudi concepts (upserts, compaction, schema evolution, etc.)
- Strong SQL skills and ability to debug complex data issues
- Proficiency in Python, Java, or Scala
- Strong knowledge of distributed data processing and storage systems
- Experience designing and building reliable ETL/ELT pipelines and data models
Preferred Qualifications
- Experience with Kafka, Spark, Flink, Hive, Iceberg, or Delta Lake
- Experience building internal data platforms or self-serve data infrastructure
- Experience with data quality frameworks like Great Expectations, Deequ, or Soda
- Exposure to ML feature pipelines or feature stores
- Experience with cloud infrastructure such as AWS, GCP, or Azure
- Understanding of data governance, metadata management, and PII handling
About the Company
Apna is building a professional network that helps users find jobs, employers find talent, and communities grow. Data is central to how we build products, understand users, and power AI-driven systems.
Apna · Bengaluru
