Staff Applied AI Researcher - Agentic Reasoning Systems (Dublin, CA)

Articul8 · Dublin, Ireland

Full-time · Lead · Posted 1 month ago

generative-ai robotics reinforcement-learning agents llm research

About this role

About us:

Articul8 was born from a simple belief: GenAI should work for the enterprise, not the other way around. Our platform combines domain-specific models, autonomous agentic reasoning through ModelMesh™, reliable model evaluation through LLM-IQ™, and multimodal understanding to serve regulated industries including energy, semiconductor, finance, aerospace, and supply chain. Trusted by Fortune 500 enterprises, we bring together research, engineering, product, and domain expertise to deliver AI that meets the accuracy, explainability, and auditability standards that high-stakes environments demand.

Job Description:

Articul8 AI is seeking a Staff Applied AI Researcher to define how our platform reasons at runtime and how autonomous systems make trustworthy decisions in production. You will lead research across the core runtime intelligence capabilities behind ModelMesh™: task decomposition, agent coordination, model and tool routing, probabilistic decisioning, verification, observability-aware execution, and the evaluation methods that determine whether autonomous behavior is reliable enough for enterprise use.

Responsibilities:

- Set technical direction for agentic reasoning systems and runtime intelligence across ModelMesh™: define the orchestration strategies, decision policies, verification approaches, and runtime quality standards that determine how massively parallel agent systems reason, coordinate, and self-correct in production.
- Architect the infrastructure for researcher augmentation at scale: design the agentic platforms and orchestration primitives that enable every researcher and engineer at Articul8 to deploy fleets of AI agents for experimentation, evaluation, and production integration, multiplying the depth, breadth, and velocity of the entire organization.
- Go deep: advance the science of autonomous reasoning. Design, train, and refine the learned components behind runtime decisioning (routing models, verification models, confidence estimators, reward models, policy selectors), using massively parallel agent-driven experiment pipelines to explore architectural and algorithmic frontiers exhaustively.
- Go broad: unify perception, retrieval, reasoning, and action. Build repeatable methodology for composing domain-specific models, data perception systems, knowledge graphs, retrieval layers, and external tools into coherent agentic workflows, delegating integration testing and cross-modal benchmarking to parallel agent systems so you can reason across the full stack simultaneously.
- Drive research on agent reliability for regulated environments: lead failure detection, self-checking, verification workflows, compounding error analysis, and auditable autonomous behavior research, using agent-orchestrated stress testing and red-teaming at scales that manual evaluation cannot reach.
- Define evaluation methodology for runtime intelligence: establish how task success, decision quality, robustness, traceability, and failure recovery are measured under realistic enterprise conditions, building agentic evaluation harnesses that run continuously and surface regressions before they reach customers.
- Influence platform-level architecture: shape decisions on model routing, tool use, observability, governance, access control, and interoperability with external agent ecosystems, ensuring the platform is designed for humans and agents to amplify each other.
- Mentor researchers across levels in the agentic paradigm: raise the bar on technical judgment, experimental rigor, and agent-augmented research practice; contribute to hiring researchers who are driven to maximize their human potential.
- Maintain hands-on research impact: sustain a meaningful personal research contribution through technical work, publications, patents, and externally visible output, modeling what it looks like to be a deeply technical leader who uses agentic systems to go deeper and faster than ever before.

Required Qualifications:

- Education: PhD or MSc in Computer Science, Machine Learning, AI, Robotics, or a related field.
- Experience: 8+ years in AI/ML research with demonstrated impact on production systems, including 3+ years building LLM-based or autonomous AI systems.
- Reasoning and orchestration: Deep hands-on experience in at least two of: multi-agent coordination, planning under uncertainty, sequential decision-making, probabilistic inference, model routing, or tool-using agent systems. You've built systems where multiple models must collaborate to produce a reliable outcome.
- Evaluation of autonomous systems: You have designed evaluation frameworks for systems where correctness is not binary, measuring decision quality, reliability under distribution shift, compounding error rates, and failure recovery in production-like conditions.
- Systems at scale: You have designed and operated research systems that integrate multiple models, data sources, an