Evaluation Jobs

84 jobs from companies building with AI · Avg salary $242k (based on the 56 listings with salary data)

These AI evaluation engineering roles focus on benchmarking, testing, and measuring model performance. Evaluation engineers build the frameworks that determine whether AI systems are improving.

Machine Learning Engineer (Infra), Driver Understanding and Evaluation

Waymo · Mountain View, CA · $170k - $216k

pytorch tensorflow distributed-systems robotics fine-tuning autonomous-vehicles evaluation machine-learning

On-site full-time senior 2 months ago

Staff Data Scientist

Wayve · London, UK · $276k - $311k

generative-ai distributed-systems autonomous-vehicles pytorch data-science evaluation

Hybrid full-time lead 3 months ago

Senior Data Scientist

Wayve · London, UK · $209k - $266k

pytorch distributed-systems generative-ai autonomous-vehicles evaluation data-science

Hybrid full-time senior 3 months ago

Evaluation Lead

Waabi · Toronto, Canada · $159k - $260k

robotics autonomous-vehicles generative-ai evaluation

On-site full-time lead 3 months ago

Software Engineer, Evaluation Infrastructure

Waabi · Toronto, Canada · $127k - $223k

generative-ai distributed-systems data-pipeline autonomous-vehicles evaluation infrastructure

On-site full-time junior 3 months ago

Engineering Manager, Agent Prompts & Evals

Anthropic · San Francisco, CA · $320k - $405k

llm alignment agents evaluation

Hybrid full-time lead 3 months ago

Research Scientist, Frontier Risk Evaluations

Scale AI · San Francisco, CA · $216k - $270k

llm alignment generative-ai evaluation research

On-site full-time senior 3 months ago

Machine Learning Engineer, LLM Evals & Observability

Glean · San Francisco, CA · $200k - $300k

reinforcement-learning cloud data-pipeline llm agents nlp evaluation machine-learning

On-site full-time junior 4 months ago

Tech Lead, Performance Evaluation

May Mobility · Remote (US) · $200k - $295k

healthcare robotics autonomous-vehicles deep-learning evaluation

Remote full-time lead 5 months ago

Senior Backend Software Engineer, AI Observability & Evals Platform (LangSmith)

LangChain · San Francisco, CA · $175k - $240k

agents api-design backend evaluation

Hybrid full-time senior 5 months ago

Hiring AI developers? Start with a job post. Claiming the profile comes after. Post a job →

Staff Machine Learning Engineer (Infra), Driver Understanding and Evaluation

Waymo · Mountain View, CA · $251k - $310k

robotics fine-tuning distributed-systems tensorflow autonomous-vehicles pytorch evaluation infrastructure

On-site full-time lead 6 months ago

Machine Learning Engineer, Driver Understanding and Evaluation

Waymo · Mountain View, CA · $170k - $216k

tensorflow robotics pytorch autonomous-vehicles generative-ai machine-learning evaluation

On-site full-time mid 6 months ago

Member of Technical Staff, Evals & Post-Training Product

Fireworks AI · San Mateo, CA · $175k - $220k

pytorch agents fine-tuning generative-ai llm mlops evaluation

On-site full-time lead 8 months ago

Applied Research - Evals & Data

Prime Intellect · San Francisco, CA · $150k - $300k

distributed-systems reinforcement-learning llm agents data-pipeline evaluation research

Remote full-time senior 8 months ago

Senior Software Engineer, ML Evaluation Infra and Efficiency

Waymo · Mountain View, CA · $238k - $302k

tensorflow autonomous-vehicles distributed-systems llm evaluation infrastructure

On-site full-time senior 10 months ago

FullStack Engineer, AI Observability & Evals Platform (LangSmith)

LangChain · San Francisco, CA · $145k - $180k

llm agents fullstack evaluation

Hybrid full-time junior 10 months ago

Senior Staff Machine Learning Engineer, Data & Eval

Airbnb · United States · $244k - $305k

mlops generative-ai data-pipeline fine-tuning payments llm machine-learning evaluation

On-site full-time lead 1 year ago

Senior Software Engineer, Simulator Evaluation

Waymo · Mountain View, CA · $204k - $259k

llm generative-ai search autonomous-vehicles robotics evaluation

On-site full-time senior 1 year ago

Senior Fullstack Engineer, AI Observability & Evals Platform

LangChain · San Francisco, CA · $175k - $240k

agents llm evaluation fullstack

Hybrid full-time senior 1 year ago

Senior Robotics Software Engineer, Simulation & Evaluation

Field AI · Irvine, CA · $155k - $180k

robotics evaluation

On-site full-time senior 1 week ago





Agentic API & MCP Server

Wire AI Dev Jobs into your agent at build time: the MCP server is live, the REST API is public for discovery, and free API keys are available for recurring search.

# Add as MCP server
claude mcp add --transport http aidevjobs https://aidevboard.com/mcp

# Or hit the REST API (URL quoted so the shell does not treat "?" as a glob)
curl "https://aidevboard.com/api/v1/jobs?tags=llm,pytorch"

13 tools via com.aidevboard/jobs · Open read access · optional free keys for stable agent identity
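
For agents written in Python, the same tag search can be sketched as below. The endpoint and the comma-separated `tags` parameter come from the curl example above; the helper names `build_jobs_url` and `fetch_jobs` are hypothetical, and the assumption that the API returns JSON is not confirmed by this page, so treat this as a starting point rather than a reference client.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example on this page.
BASE_URL = "https://aidevboard.com/api/v1/jobs"


def build_jobs_url(tags):
    """Build a search URL for a list of tag strings (hypothetical helper)."""
    # safe="," keeps the comma-separated form used in the curl example.
    query = urlencode({"tags": ",".join(tags)}, safe=",")
    return f"{BASE_URL}?{query}"


def fetch_jobs(tags, timeout=10):
    """Fetch matching jobs (hypothetical helper).

    Assumption: the response body is JSON. The response schema is not
    documented here, so the parsed value is returned without interpretation.
    """
    with urllib.request.urlopen(build_jobs_url(tags), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the URL only; call fetch_jobs(...) to make a network request.
    print(build_jobs_url(["llm", "evaluation"]))
```

Building the URL in a separate function keeps the query logic testable without network access.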