Responsibilities

Develop a rigorous understanding of what makes an agent a great collaborator across professional, creative, and technical domains.
Translate qualitative judgments about model behavior into concrete hypotheses, evaluations, graders, and training interventions.
Improve reward models and RL objectives to shape model behaviors and personality.
Work with human experts to produce high-quality preference data and tasteful rollouts that capture excellent collaborative behavior.
Partner with pretraining and product teams to integrate personality improvements into the full training stack and real-world workflows.
Build sustainable pipelines for updating training data as understanding of model behavior evolves.

Requirements

Strong technical foundations in machine learning, software engineering, or statistics.
Proven experience with Large Language Models (LLMs) and post-training methodologies.
Hands-on experience with RL/RLHF, reward modeling, and creating robust evaluations.
Experience working with synthetic data and production ML systems.
Ability to translate subjective product questions into falsifiable hypotheses and rigorous technical experiments.
Strong communication skills to collaborate effectively with researchers, engineers, and designers.

Preferred Qualifications

Background in behavioral science or Human-Computer Interaction (HCI).
Deep "taste" for model behavior and the ability to articulate why specific model responses feel natural or useful.
Experience managing end-to-end projects from behavioral observation through to training and launch.

Benefits

Competitive base salary and generous equity packages.
Comprehensive medical, dental, and vision insurance.
401(k) retirement plan with employer match.
Flexible paid time off and paid parental leave.
Daily meals in the office and mental health support.
Annual learning and development stipend.

About the Company

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of AI capabilities and seek to safely deploy them to the world through products like ChatGPT and the API.

Researcher, Agent Post-Training, Personality
