Senior Research Engineer

Mem0 · San Francisco, CA

Full-time · Senior · Posted 21 hours ago

fine-tuning data-pipeline rag llm pytorch search research

About this role

Role Summary:

Own the end-to-end lifecycle of memory features, from research to production. You’ll fine-tune models for extraction, updates, consolidation/forgetting, and conflict resolution; turn customer pain points into research hypotheses; implement and benchmark ideas from papers; and ship with Engineering to SOTA latency, reliability, and cost. You’ll also build evaluation at scale (offline metrics + online A/Bs) and close the loop with real-world feedback to continuously improve quality.

What You'll Do:

- Fine-tune and train models for memory extraction, updates, consolidation/forgetting, and conflict resolution; iterate based on data and outcomes.
- Read, reproduce, and implement research: quickly prototype paper ideas, benchmark against baselines, and productionize what wins.
- Build evaluation at scale: automated relevance/accuracy/consistency metrics, gold sets, online A/B & interleaving, and clear dashboards.
- Work closely with customers to uncover pain points, turn them into research hypotheses, and validate solutions through field trials.
- Partner with Engineering to ship: design APIs and data contracts, plan safe rollouts, and maintain SOTA latency, reliability, and cost at scale.

Minimum Qualifications:

- Experience in RAG or information retrieval (retrieval, ranking, query understanding) for real products.
- Model training/fine-tuning experience (LLMs/encoders) with a strong footing in experimental design and iteration.
- Strong Python; deep experience with PyTorch and familiarity with vLLM and modern serving frameworks.
- Built evaluation for complex vision-and-language tasks (gold sets, offline metrics, online tests).
- Able to orchestrate data pipelines to run these models in production with low-latency SLAs (batch + streaming).
- Clear, concise communication with stakeholders (engineering, product, GTM, and customers).

Nice to Have:

- Publications at venues like CVPR, NeurIPS, ICML, ACL, etc.
- Experience with privacy-preserving ML (redaction, differential privacy, data governance).
- Deep familiarity with memory/retrieval literature or prior work on memory systems.
- Expertise with embeddings, vector-DB internals, deduplication, and contradiction detection.