Principal AI/ML Researcher / Engineer in Reasoning, Planning, and Decision-Making Systems

Airbnb · United States · $296k - $370k

Full-time · Principal · Posted 1 month ago

pytorch agents robotics rag fine-tuning llm reinforcement-learning generative-ai

About this role

Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way.

About the Role

We are seeking a Principal / Distinguished AI/ML Researcher and/or Engineer with deep experience in reasoning, planning, and decision-making systems. This role is ideal for individuals who have architected post-training intelligence frameworks, integrated Large Reasoning Models (LRMs) with Knowledge Graphs, and applied Reinforcement Learning (RL) as a first-class component of adaptive planning and control.

You will be responsible for inventing, scaling, and operationalizing intelligent decisioning substrates that blend symbolic and sub-symbolic methods, enabling next-generation AI systems that go beyond pattern recognition into the realm of deliberation, foresight, and agency. Our mission is to build cognitive AI systems that combine post-trained foundational models, explicit memory and knowledge, and recursive planning strategies to power sophisticated real-world decisioning in personalized environments. You will collaborate across disciplines and influence company-wide AI architecture.

A core dimension of this role is the design and deployment of multi-agent systems, where reasoning, planning, and decisioning are distributed across networks of intelligent agents. You will formulate coherent, synergistic strategies that enable agents to cooperate, negotiate, and align objectives, ensuring that distributed intelligence converges to purposeful, high-quality outcomes across contexts.

Relevance and Impact of This Role

This role advances Airbnb's AI capabilities toward reasoning, planning, and adaptive decision-making across complex real-world environments.
The near-term impact spans improved decision quality, contextual intelligence, adaptive personalization, and operational coordination across guest and host workflows, introducing goal-directed reasoning systems capable of handling ambiguity, constraints, trade-offs, and multi-step planning. Guests benefit from more intelligent planning and assistance, while hosts and internal teams gain systems capable of adaptive optimization, dynamic recommendations, policy-aware decisioning, and intelligent workflow orchestration.

Longer term, this role helps establish Airbnb's leadership in cognitive AI systems and distributed intelligence architectures. The technologies developed here become the decisioning and reasoning substrate underlying the broader ecosystem, enabling AI systems that can deliberate, coordinate, adapt, and make coherent long-horizon decisions across multiple agents and environments. Over time, this positions Airbnb as an intelligent coordination and planning platform where AI systems actively reason, plan, and coordinate actions across the marketplace in ways that continuously improve user outcomes, ecosystem health, and strategic adaptability.

What You Will Do

Research & Innovation

Drive foundational and applied research in reasoning engines, planning architectures, and decision-making frameworks at scale. The aim is to incorporate generative AI into the ranking, recommendation, and personalization stack, from single-model to multi-agent (system-level) intelligence, with the objective of growing the business (new-user growth, abandoned users, long-tail users) in existing and new business areas while supporting Multi-Modal NL → Conversational Interfaces. Advance techniques in LLM/LRM post-training, reinforcement learning–based decisioning, and knowledge-integrated agents. Design methods for plan induction, value estimation, and contingency modeling within intelligent agents.
Explore and validate protocols for distributed reasoning and joint planning among cooperative agents in multi-agent systems.

System Design & Architecture

Architect reasoning, planning, and decision-making (RPD) systems that integrate post-trained LLMs/LRMs, graph-structured memory (e.g., KGs), and RL-driven controllers. Design recursive task planners, search-based or policy-based reasoners, and belief-state trackers that can interoperate with large model substrates. Ensure modularity and extensibility through multi-agent frameworks, agentic substrates, and declarative planning pipelines. Define communication protocols, coordination strategies, and cross-agent knowledge alignment mechanisms to foster emergent cooperative intelligence.

Model Development

Build and evolve stateful, dynamic models that combine supervised learning with online/offline reinforcement, simulation-based rollouts, and symbol grounding. Implement hybrid pipelines that couple learned embeddings, prompted generative models, and graph-theoretic inference. Optimize systems for adaptive exploration, planning horizon control, and policy robustness. Develop