Evaluation Jobs

84 jobs from companies building with AI · Avg salary $242k (based on the 56 listings with salary data)

AI evaluation engineering roles focused on benchmarking, testing, and measuring model performance. Evaluation engineers build the frameworks that determine whether AI systems are improving.

Engineering Manager, System Evaluation

Wing · Palo Alto, CA · $155k - $165k

data-pipeline cloud robotics evaluation

On-site full-time lead 3 days ago

Evaluations - Member of Technical Staff

Simile · San Francisco, CA · $200k - $400k

search agents generative-ai llm research evaluation

On-site full-time lead 4 days ago

User Researcher, AI Evaluations

Notion · San Francisco, CA · $196k - $230k

agents llm evaluation research

Remote full-time senior 2 weeks ago

Senior Machine Learning Engineer, Simulation Evaluation

Waymo · Mountain View, CA · $213k - $263k

autonomous-vehicles generative-ai pytorch diffusion-models robotics deep-learning machine-learning evaluation

On-site full-time senior 3 weeks ago

Director, Research - Evaluation & Training

Snorkel AI · San Francisco, CA · $275k - $425k

generative-ai llm research evaluation

On-site full-time lead 3 weeks ago

Director of Platform Management for Simulation, Evaluation & Validation

Wayve · London, UK · $332k - $415k

data-pipeline autonomous-vehicles robotics generative-ai evaluation

Hybrid full-time lead 3 weeks ago

Senior Software Engineer, ML/Eval Data Platforms & Infrastructure

Waymo · Mountain View, CA · $213k - $263k

distributed-systems data-pipeline fine-tuning autonomous-vehicles infrastructure evaluation

Hybrid full-time senior 3 weeks ago

Staff Software Engineer, Safeguards Evals

Anthropic · San Francisco, CA · $320k - $485k

data-pipeline llm alignment agents distributed-systems rust evaluation

Hybrid full-time lead 3 weeks ago

Machine Learning Engineer II - Autonomous Driving Performance Evaluation

May Mobility · Anywhere, USA · $172k - $210k

autonomous-vehicles healthcare robotics machine-learning evaluation

Remote full-time junior 3 weeks ago

Systems Engineer, AI Validation

Wayve · Sunnyvale, CA · $209k - $266k

generative-ai autonomous-vehicles robotics agents evaluation

Hybrid full-time senior 1 month ago

Hiring AI developers? Start with a job post. Claiming the profile comes after. Post a job →

Member of Technical Staff, Evals

Magic · San Francisco, CA · $200k - $550k

pre-training code-generation reinforcement-learning evaluation

On-site full-time lead 1 month ago

Technical Lead Manager, Prediction, ML Evaluation

Waymo · Mountain View, CA · $251k - $310k

pytorch deep-learning autonomous-vehicles robotics tensorflow nlp computer-vision evaluation

On-site full-time lead 1 month ago

AI Engineer, Evaluation

Mixpanel · San Francisco, CA · $226k - $306k

mlops llm agents cloud microservices search evaluation

On-site full-time senior 1 month ago

Principal Software Engineer, AI Observability & Evals Platform

LangChain · Boston, MA · $230k - $270k

llm agents platform evaluation

Hybrid full-time principal 1 month ago

Research Engineer – Evals

Firecrawl · San Francisco, CA · $160k - $240k

llm search fine-tuning computer-graphics reinforcement-learning research evaluation

Remote full-time senior 1 month ago

Machine Learning Engineer, LLM Evals & Observability

Glean · Mountain View, CA · $200k - $300k

llm nlp reinforcement-learning cloud agents data-pipeline machine-learning evaluation

On-site full-time junior 1 month ago

Triage Specialist

Wayve · Sunnyvale, CA · $115k - $137k

generative-ai autonomous-vehicles robotics evaluation

On-site full-time mid 1 month ago

Member of Technical Staff, Evaluation Execution

METR · Berkeley · $285k - $503k

fine-tuning agents research evaluation

On-site full-time lead 2 months ago

Senior Staff Machine Learning Systems Engineer, Indexing & Retrieval Search

Reddit · Remote (US) · $279k - $390k

distributed-systems healthcare search cloud generative-ai machine-learning evaluation

Remote full-time lead 2 months ago

Sr. Software Engineer, Computer Vision

SpaceX · Hawthorne, CA · $160k - $225k

agents data-pipeline computer-vision llm pytorch fine-tuning deep-learning evaluation

On-site full-time senior 2 months ago





Agentic API & MCP Server

Wire AI Dev Jobs into your agent at build time. The MCP server is live, the REST API is public for discovery, and free API keys are available for recurring search.

# Add as MCP server
claude mcp add --transport http aidevjobs https://aidevboard.com/mcp

# Or hit the REST API
curl "https://aidevboard.com/api/v1/jobs?tags=llm,pytorch"

13 tools via com.aidevboard/jobs · Open read access · optional free keys for stable agent identity
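
For callers that prefer a script to curl, the same REST request can be sketched in Python. This is a minimal sketch, not an official client: the endpoint and the comma-separated `tags` parameter come from the curl example above, while the helper names `build_jobs_url` and `fetch_jobs` are hypothetical, and the response is assumed to be JSON whose schema is not described here.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example above.
JOBS_ENDPOINT = "https://aidevboard.com/api/v1/jobs"


def build_jobs_url(tags):
    """Return the jobs URL with a comma-separated `tags` query parameter."""
    cleaned = [t.strip() for t in tags if t.strip()]
    if not cleaned:
        return JOBS_ENDPOINT
    # safe="," keeps the commas literal, matching the curl example.
    query = urlencode({"tags": ",".join(cleaned)}, safe=",")
    return JOBS_ENDPOINT + "?" + query


def fetch_jobs(tags, timeout=10.0):
    """Fetch and decode the response body.

    Assumption: the API returns JSON. The decoded object is returned as-is
    because the response schema is not documented on this page.
    """
    request = urllib.request.Request(
        build_jobs_url(tags),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds the URL; call fetch_jobs(...) to hit the network.
    print(build_jobs_url(["llm", "pytorch"]))
```

Building the URL separately from the network call keeps the query logic checkable offline; `fetch_jobs(["llm", "evaluation"])` would then perform the actual request.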