
Data Engineer
Institute of Foundation Models
Employment Type: Full Time
Location: Abu Dhabi
Experience: Mid Level, Senior
Job Description
Responsibilities
- Rapidly collect, curate, and preprocess datasets based on detailed specifications provided by NLP researchers, delivering data within tight timelines (typically within 1-2 days).
- Develop and maintain efficient web crawling solutions, APIs, and automated workflows to continuously improve data collection processes.
- Refine and evaluate outputs from Large Language Models (LLMs) to generate structured datasets suitable for model training and benchmarking.
- Implement scalable data pipelines, ensuring efficient data processing, storage, retrieval, and distribution to research teams.
- Collaborate closely with researchers and engineers to ensure collected data meets specified quality and relevance criteria.
- Document data collection methodologies, dataset characteristics, and pipeline architecture clearly and effectively.
- Engage with peer teams and participate in technical reviews to uphold best practices and data quality standards.
- Represent MBZUAI at industry and research forums, showcasing technical capabilities in large-scale data processing and AI data infrastructure.
- Perform all other duties as reasonably directed by the line manager commensurate with these functional objectives.
 
Requirements
- Bachelor’s degree in Computer Science, Data Science, Engineering, or a related technical field
- Extensive experience in data engineering, data processing, and automation using Python
- Proficiency in designing and deploying web crawling solutions, automated data extraction, and processing pipelines
- Strong understanding of data structures, algorithms, databases, SQL, and performance optimization
- Experience working with cloud infrastructure and distributed data processing frameworks (e.g., AWS, Spark, Kafka, Kubernetes)
- Excellent problem-solving abilities and attention to detail
- Strong communication and collaboration skills
 
Preferred Qualifications
- Master’s degree or equivalent experience in Computer Science, Data Engineering, or related technical fields
- Proven track record supporting NLP or AI research teams with rapid data delivery
- Experience refining outputs from large-scale AI models, such as LLM-generated data
- Contributions to open-source projects or visible activity in coding communities (e.g., GitHub, Stack Overflow)
- Familiarity with advancements in NLP data processing and large language model technologies
 
Benefits
- Health Insurance
- Annual Leave
- Visa
- Relocation Allowance
 
About the Company
The Institute of Foundation Models is a dedicated research lab committed to building, understanding, using, and risk-managing foundation models. Our mission is to advance research, nurture future AI builders, and contribute transformative innovations for a knowledge-driven economy. As part of our Abu Dhabi-based team, you will collaborate with world-class researchers, data scientists, and engineers, developing AI solutions with the power to shape whole industries. The institute strives to inspire the next generation of AI pioneers and establish itself as a global leader in high-performance deep learning research.