Consumer Goods & Retail AI Rater & Evaluator

LILT · Remote
Contract · Mid-level · Posted 1 week ago

About this role

OVERVIEW

LILT is building a global network of domain experts to support high-quality AI evaluation across training, benchmarking, red-teaming, and ongoing model monitoring. We are seeking consumer goods and retail professionals to contribute expert judgment to human-in-the-loop AI evaluation workflows used by leading enterprises and hyperscalers.

This role is designed for professionals who understand how consumer products, retail operations, merchandising, and customer interactions function in real-world commercial environments, and who can apply that expertise to evaluate and improve multilingual AI systems. Your expertise will directly influence multilingual AI model quality, safety, and deployment readiness.

This role includes two distinct expert tracks, based on experience level and scope of responsibility.

TRACK A: CONSUMER GOODS & RETAIL AI RATER

Raters execute structured evaluation tasks using clearly defined rubrics and instructions.

Responsibilities
- Evaluate AI outputs related to consumer goods, retail operations, and commerce content
- Perform structured scoring, comparison, classification, and judgment tasks
- Assess factual accuracy, product relevance, clarity, and alignment with retail best practices
- Identify hallucinations, misleading product information, pricing errors, or policy violations
- Apply domain-specific retail and consumer goods guidelines consistently across tasks

Ideal Background
- Retail professionals, merchandising specialists, product managers, or consumer goods practitioners
- Experience with product catalogs, retail operations, merchandising, or customer-facing commerce workflows
- Strong attention to detail and comfort working with structured evaluation criteria

TRACK B: CONSUMER GOODS & RETAIL AI EVALUATOR (SENIOR TRACK)

Evaluators provide higher-level domain oversight and help shape how evaluation is performed.

Responsibilities
- Validate and refine evaluation rubrics and edge-case handling
- Perform adjudication where raters disagree
- Conduct error analysis and qualitative reviews of model behavior
- Partner with LILT research, product, and customer teams on evaluation design
- Support red-teaming, policy compliance, and model readiness assessments

Ideal Background
- Senior retail leaders, category managers, or consumer goods subject matter experts
- Experience defining standards, reviewing complex edge cases, or advising on product or commerce risk
- Ability to clearly explain nuanced retail and consumer decision-making tradeoffs

EVALUATION FOCUS & REQUIREMENTS

Types of AI Evaluation Work

Depending on project demands, work may include:
- Product and retail content evaluation
- Merchandising accuracy and product categorization assessment
- Commerce-related benchmarking and comparative model analysis
- Red-teaming for misleading product claims or unsafe recommendations
- Ongoing model monitoring and regression testing

What We Look For
- Deep domain expertise in consumer goods, retail, or commerce
- Strong judgment and ability to apply criteria consistently
- Comfort working with structured evaluation workflows
- Ability to explain reasoning clearly, especially in customer-facing or brand-sensitive scenarios
- Reliability, professionalism, and respect for quality standards

Engagement Model
- Contract-based, flexible participation
- Project-based work with clear expectations and timelines
- Opportunities for recurring work based on performance and demand
- Compensation communicated upfront per project or task type

Why This Work Matters

Your expertise helps ensure that AI systems:
- Provide accurate and trustworthy product and retail information
- Align with enterprise commerce standards and policies
- Are reliable and safe for consumers across languages

Language Requirements
- Native or professional fluency in one or more supported languages is required
- Supported languages span 30+ global languages
- Language-specific nuance is assessed through screening and task-based evaluation, not separate job descriptions
- English fluency is required for guidelines, feedback, and collaboration

AI is changing how the world communicates, and LILT is leading that transformation. LILT's mission is to make the world's information available to everyone, no matter the language they speak.

Join a global community that thrives on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to enterprises, governments, and AI developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community, all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application
