Applied Research - RL & Agents

Prime Intellect · San Francisco, CA · $150k - $300k

Full-time · Senior · Posted 5 days ago

distributed-systems llm agents reinforcement-learning research

About this role

OWN YOUR INTELLIGENCE

Prime Intellect is building the open superintelligence stack: the infrastructure frontier AI labs build internally, made available to every ambitious AI team. Our platform, Lab, unifies compute, environments, evaluations, secure sandboxes, high-performance training, and deployment into one full-stack system for post-training at frontier scale - from SFT and RL to tool use, agent workflows, and continuously improving production models.

We are building open frontier AI: open-source models trained end to end for long-horizon tasks like autonomous research, and the full-stack platform our own research team uses to build them.

The next generation of AI companies, enterprises, and research teams do not just need more GPUs. They need the ability to turn their own workflows, tools, data, and feedback loops into superintelligence they own.

Prime Intellect has raised $150M in total funding from Founders Fund, Radical Ventures, NVIDIA, and exceptional AI, infrastructure, and enterprise operators, including Andrej Karpathy, Dwarkesh Patel, and leaders and founders from Ramp, Perplexity, Harvey, Mercor, Zapier, Datadog, Cognition, OpenAI, Thinking Machines, Together AI, SemiAnalysis, LangChain, Browserbase, Cloudflare, Sierra, Databricks, Airbnb, OpenRouter, Standard Intelligence, Fleet, Core Auto, and more.

We are looking for people who want to build at the intersection of frontier research, real infrastructure, and go-to-market for a category that does not fully exist yet.

ROLE IMPACT

This is a role at the intersection of cutting-edge RL/post-training methods and applied agent systems. You’ll have a direct impact on shaping how advanced models are aligned, deployed, and used in the real world by:

- Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads: workflow automation, reasoning-intensive tasks, and decision-making at scale.
- Building Robust Infrastructure: Developing the systems and frameworks that enable these agents to operate reliably, efficiently, and at massive scale.
- Bridging Applications & Research: Translating ambiguous objectives into clear technical requirements that guide product and research priorities.
- Prototyping in the Field: Rapidly designing and deploying agents, evals, and harnesses for real-world tasks to validate solutions.

Application-Driven Research & Infrastructure

- Shape the direction and feature set for verifiers, the Environments Hub, training services, and other research platform offerings.
- Build high-quality examples, reference implementations, and “recipes” that make it easy for others to extend the stack.
- Prototype agents and eval harnesses tailored to real-world use cases and external systems.
- Pair with technical end-users (research teams, infra-heavy customers, open-source contributors) to design environments, evals, and verifiers that reflect real workloads.

Post-training & Reinforcement Learning

- Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks.
- Build evaluations and harnesses to measure reasoning, robustness, and agentic behavior in real-world workflows.
- Prototype multi-agent and memory-augmented systems to expand capabilities for downstream applications.
- Experiment with post-training recipes to optimize downstream performance.

Agent Development & Infrastructure

- Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making.
- Extend and integrate with agent frameworks to support evolving feature requests and performance requirements.
- Architect and maintain distributed training/inference pipelines, ensuring scalability and cost efficiency.
- Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments.

REQUIREMENTS

- Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment.
- Experience with agent frameworks and tooling (e.g., DSPy, LangGraph, MCP, Stagehand).
- Familiarity with distributed training/inference frameworks (e.g., vLLM, sglang, Accelerate, Ray, Torch).
- Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL.
- Passion for advancing the state-of-the-art in reasoning and building practical, agentic AI systems.
- Strong technical writing abilities (documentation, blogs, papers) and research taste.
- Eagerness to drive collaborations with external partners and engage with the broader open-source community.

NICE-TO-HAVES

- Experience with web programming (React, TypeScript, Next.js).
- Experience running LLM evaluations and/or synthetic data generation.
- Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform).

WHAT WE OFFER

- Cash Compensat