Tech Lead Manager - Multi Modal Foundation Models (Language)

Wayve · Sunnyvale, CA

Full-time · Lead · Posted 9 months ago

generative-ai deep-learning pre-training llm nlp reinforcement-learning autonomous-vehicles agents

About this role

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

About us

Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems.

Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving.

In our fast-paced environment, big problems ignite us: we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future.

At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact. Make Wayve the experience that defines your career!

The role

Wayve is building embodied foundation models for autonomous driving, models that learn from large-scale experience to perceive, understand, and navigate complex environments without relying on brittle rules or maps. This role is an opportunity to lead foundational work at the intersection of large-scale multimodal pretraining, language grounding, and post-training/alignment for embodied agents.
You’ll partner with world-class research and engineering teams to define what it means for models to understand language in context, grounded in real scenes, time, and action, and to build the training and evaluation strategy to scale it.

You will lead a team responsible for advancing language grounding and reasoning capabilities in Wayve’s multimodal foundation models, spanning pretraining and post-training, representation learning, and rigorous evaluation. While high-quality data and scalable pipelines are essential, the focus is on model capability: grounded understanding, transferable representations, and reliable reasoning behaviors in real-world autonomy settings.

What you’ll do

Build strong multimodal embeddings / representations that transfer well across downstream tasks (retrieval, grounding, temporal understanding, instruction following, scenario description, action-relevant understanding).

Lead language grounding & reasoning research/engineering efforts that connect language to perception, state, and action in embodied autonomy.

Drive training strategies across pretraining and post-training, including instruction tuning and preference-based optimization to shape model behavior (e.g., RLHF, DPO, or related methods).

Explore a broad set of tasks and architectures, including “System 1 / System 2”-style approaches (fast reactive competence paired with deliberative reasoning or planning-like behaviors).

Define and operationalize reasoning-focused evaluations for grounded language understanding (beyond surface-level captioning), including robustness and generalization tests that reflect real-world driving.

Maintain a continued emphasis on high-quality datasets for training and evaluation: curation, enrichment, filtering, and integrating third-party datasets where valuable.

Partner closely with research and engineering teams to run capability-driven experiments (ablations, scaling studies, post-training iterations) and translate findings into training strategy.

Manage and mentor a team of engineers and data scientists; set technical direction, establish strong experimental standards, and deliver measurable capability gains.

What you’ll bring

Essential

Proven technical leadership in foundation-model work (LLMs, VLMs/MLLMs, VLAs, or closely related multimodal systems), with end-to-end ownership of meaningful components (training, evaluation, tooling, or strategy).

Strong hands-on deep learning expertise (especially PyTorch) and the ability to connect training dynamics to model behavior and capability outcomes.

Experience designing and interpreting rigorous experiments that inform large-scale training decisions.

Strong collaboration skills with ML researchers, including shaping experimental design, training strategy, and evaluation methodology.

Ability to operate at scale: working knowledge of distributed processing frameworks like Ray, Spark (or equivalent) and building scalable, fault-tolerant pipelines that support fast iteration is desired.

Industry experience in data- and model-intensiv