Senior Research Engineer, Olmo + Molmo

Allen Institute for AI · Seattle, WA · $146k - $220k

Full-time · Senior · Posted 3 weeks ago

fine-tuning pytorch deep-learning healthcare llm tensorflow search agents

About this role

Persons in these roles are expected to work from our offices in Seattle. On-site requirements vary based on position and team. If you have questions about on-site work arrangements for this role, please ask your recruiter.

Our base salary range is $146,880 - $220,320, and in addition we have generous bonus plans to provide a competitive compensation package.

Who You Are:

To thrive as a Research Engineer at Ai2, you'll bring a blend of deep technical expertise and a collaborative, self-directed mindset. You have extensive experience with deep learning and/or foundation models, whether through a PhD in ML or equivalent hands-on industry work. You're a curious, agile engineer who can generate ideas, design experiments, and implement them in Python against real AI systems. You communicate research insights clearly to technical stakeholders, and you're energized by working with strong contributors toward shared, ambitious goals.

As a Research Engineer on the team, you'll be a core member responsible for training Ai2's flagship open models (e.g. Olmo, Molmo, and beyond). From system design to experiment release, you'll own end-to-end delivery while collaborating closely with research and engineering colleagues to push the boundaries of open model research.

Who We Are:

We are a non-profit AI institute focused on developing foundational AI research and innovation to deliver real-world positive impact through large-scale open models, data, and artifacts (e.g., Olmo, Tulu, Molmo, FlexOlmo). Balancing academic freedom with corporate-level scale (read about our new compute cluster here), Ai2 is uniquely resourced and positioned to deliver high-impact, truly open research. Our team unites the best and brightest scientific and engineering minds to explore the potential of truly open AI.
Through our efforts, including the pioneering Olmo and Molmo releases, we endeavor to empower academics, researchers, and AI developers more broadly to advance the science of language models, multimodal models, and generative AI. If you are passionate about advancing the science of AI through open, rigorous research and believe in accessible AI for the common good, we want to hear from you!

Your Next Challenge:

Key responsibilities:

- Building and optimizing infrastructure for LLM, multimodal, and agentic research, including training/inference pipelines, dataset curation, and large-scale preprocessing
- Designing, training, and evaluating multimodal models (vision + language) and agentic workflows, including tool use, planning, and long-horizon tasks
- Scoping and leading research projects, prioritizing experiments for highest impact
- Bringing strong software engineering practices to a research environment and bridging cutting-edge work to production-quality products
- Contributing to and supporting the open-source community through model releases, datasets, public APIs, and technical reports

What You’ll Need:

- 4+ years of ML infrastructure experience: data preprocessing, model training, evaluation, inference, and deployment
- Experience with end-to-end model development: dataset construction, training, fine-tuning, evaluation, profiling, and monitoring
- Familiarity with modern model architectures, including LLMs (MoEs, long-context models) and vision-language models (e.g., Molmo, LLaVA), and experience training and evaluating both
- Agentic systems knowledge: tools, memory, and long-running workflows
- Strong software engineering fundamentals: performant, scalable systems and confident debugging
- Proficiency in Python and a major ML framework (PyTorch, JAX, or TensorFlow), with the flexibility to pick up new tools as needed
- Familiarity with cloud and containerization (e.g., GCP, AWS, Docker)
- Strong communication and collaboration skills: we're a small, close-knit team and work best when everyone's pulling in the same direction

Education/Experience:

- BS or MSc in Computer Science, Statistics, Engineering, Applied Mathematics, or a related quantitative field (or equivalent experience)
- A minimum of 2 years of software development experience (or equivalent experience)

Physical Demands and Work Environment:

The physical demands described here are representative of those that must be met by a team member to successfully perform the essential functions of this position. Reasonable accommodations may be made to enable individuals with disabilities to perform the functions.

- Must be able to remain in a stationary position for long periods of time.
- The ability to communicate information and ideas so others will understand. Must be able to exchange accurate information in these situations.
- The ability to observe details at close range.
- Can work under deadlines.

A Little More About Ai2:

Ai2 is a Seattle-based non-profit AI research institute founded in 2014 by the late Paul Allen. Our mission is building breakthrough AI to solve the world’s biggest problems. We develop