Evaluation Jobs

92 jobs from companies building with AI · Avg salary $239k (44 with data)

AI evaluation engineering roles focused on benchmarking, testing, and measuring model performance. Evaluation engineers build the frameworks that determine whether AI systems are improving.

Senior Airworthiness Engineer, Air Dominance & Strike

Anduril · Costa Mesa, CA · $146k - $194k
payments cloud computer-vision evaluation
Remote full-time senior 5 days ago

Principal Software Engineer, AI Observability & Evals Platform

LangChain · Boston, MA · $230k - $270k
llm agents platform evaluation
Hybrid full-time principal 1 week ago

Research Engineer – Evals

Firecrawl · San Francisco, CA · $160k - $240k
reinforcement-learning fine-tuning computer-graphics search llm research evaluation
Remote full-time senior 1 week ago

Machine Learning Engineer, LLM Evals & Observability

Glean · Mountain View, CA · $200k - $300k
nlp agents data-pipeline cloud llm reinforcement-learning machine-learning evaluation
On-site full-time junior 1 week ago

Research Engineer, Model Evaluations

Anthropic · San Francisco, CA · $320k - $485k
alignment distributed-systems data-pipeline agents search llm evaluation research
Hybrid full-time principal 3 weeks ago

Product Manager, Public Sector GenAI Test & Evaluation (T&E)

Scale AI · San Francisco, CA · $154k - $193k
agents fine-tuning generative-ai llm evaluation
On-site full-time mid 3 weeks ago

Senior Staff Software Engineer, Indexing & Retrieval Platform

Reddit · Remote (US) · $279k - $390k
search generative-ai cloud healthcare distributed-systems machine-learning evaluation
Remote full-time lead 4 weeks ago

Sr. Software Engineer, Computer Vision

SpaceX · Hawthorne, CA · $160k - $225k
llm agents data-pipeline pytorch computer-vision fine-tuning deep-learning evaluation
On-site full-time senior 4 weeks ago

Evaluation Lead

Waabi · Toronto, Canada · $159k - $260k
robotics generative-ai autonomous-vehicles evaluation
On-site full-time lead 1 month ago

Software Engineer, Evaluation Infrastructure

Waabi · Toronto, Canada · $127k - $223k
autonomous-vehicles data-pipeline distributed-systems generative-ai infrastructure evaluation
On-site full-time junior 1 month ago

Hiring AI developers? Start with a job post. Claiming the profile comes after. Post a job →

Engineering Manager, Agent Prompts & Evals

Anthropic · San Francisco, CA · $320k - $405k
alignment agents llm evaluation
Hybrid full-time lead 1 month ago

Research Scientist, Frontier Risk Evaluations

Scale AI · San Francisco, CA · $216k - $270k
llm alignment generative-ai research evaluation
On-site full-time senior 1 month ago

Research Engineer, Evaluations

AssemblyAI · New York, NY · $210k - $260k
llm data-pipeline speech search cloud research evaluation
Remote full-time senior 1 month ago

Cloud Evals Infrastructure Engineer

METR · Berkeley · $257k - $340k
llm infrastructure research evaluation
On-site full-time senior 1 month ago

Machine Learning Engineer, LLM Evals & Observability

Glean · San Francisco, CA · $200k - $300k
reinforcement-learning data-pipeline nlp agents cloud llm evaluation machine-learning
On-site full-time junior 2 months ago

Prompt Engineer, Agent Prompts & Evals

Anthropic · San Francisco, CA · $320k - $405k
alignment nlp llm evaluation prompt-engineering
Hybrid full-time senior 3 months ago

Tech Lead, Performance Evaluation

May Mobility · Remote (US) · $200k - $295k
robotics deep-learning autonomous-vehicles healthcare evaluation
Remote full-time lead 3 months ago

Director of AI Foundations, Foundation Model Evaluation & Data

Waymo · Mountain View, CA · $332k - $421k
reinforcement-learning generative-ai llm autonomous-vehicles nlp evaluation
Hybrid full-time lead 4 months ago

Senior Backend Software Engineer, AI Observability & Evals Platform (LangSmith)

LangChain · San Francisco, CA · $175k - $225k
api-design agents backend evaluation
Hybrid full-time senior 4 months ago

Staff Machine Learning Engineer (Infra), Driver Understanding and Evaluation

Waymo · Mountain View, CA · $251k - $310k
fine-tuning robotics autonomous-vehicles pytorch tensorflow distributed-systems evaluation infrastructure
On-site full-time lead 4 months ago

Hiring AI engineers?

Post the role first. Your company profile and analytics connect from the employer flow.

Agentic API & MCP Server

Wire AI Dev Jobs into your agent at build time: the MCP server is live, the REST API is public for discovery, and free API keys are available for recurring search.

# Add as MCP server
claude mcp add --transport http aidevjobs https://aidevboard.com/mcp

# Or hit the REST API (quote the URL so the shell does not treat "?" as a glob)
curl "https://aidevboard.com/api/v1/jobs?tags=llm,pytorch"
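
For recurring searches from a script, the same request can be sketched in Python. Only the endpoint and the comma-separated `tags` parameter come from the curl example above; the helper names `build_jobs_url` and `fetch_jobs` are hypothetical, and the assumption that the response body is JSON is unverified.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example; nothing else about the API is assumed here.
BASE_URL = "https://aidevboard.com/api/v1/jobs"


def build_jobs_url(tags):
    """Return the jobs URL filtered by tags, comma-joined as in the curl example."""
    # safe="," keeps the commas literal instead of percent-encoding them.
    query = urlencode({"tags": ",".join(tags)}, safe=",")
    return f"{BASE_URL}?{query}"


def fetch_jobs(tags, timeout=10):
    """Fetch the listing. Assumes (unverified) that the response body is JSON."""
    with urlopen(build_jobs_url(tags), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `build_jobs_url(["llm", "pytorch"])` reproduces the URL used in the curl command. `fetch_jobs` performs the network request and returns whatever structure the API sends back, so inspect the result before relying on specific fields.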

13 tools via com.aidevboard/jobs · Free keyed access · Pro $49/mo →