Contract — AI Agent Evaluation

Aaru · Remote · $100k - $150k

Contract · Mid-level · Posted 3 months ago

evaluation · python · agents · statistics · nlp · benchmarking

About this role

Evaluate and benchmark Aaru's synthetic research agents. Design evaluation protocols, run A/B tests, and measure agent accuracy against human researchers. This is a 3-month contract with the possibility of extension. Remote-friendly.