Research Engineers, Data

Distyl AI · San Francisco, CA · $150k - $250k

full-time · senior · Posted 5 days ago

fine-tuning · healthcare · search · data-pipeline · research

About this role

ABOUT DISTYL AI

Distyl is an applied AI technology company partnering with the world’s most ambitious institutions to rearchitect critical operations for the frontier of AI. Our customers include the largest companies in telecom, healthcare, insurance, manufacturing, and consumer goods, as well as global social organizations.

We research and deploy technologies that power AI-native operations, both for our partners and for Distyl itself. Our work spans research into self-constructing systems, the development of the most reliable execution of AI systems, and products that transform mission-critical workflows. As a result, Distyl's technologies affect some of the world's largest operations, from hundreds of millions of consumer interactions to tens of millions of supply chain transactions and millions of patient journeys.

Distyl is backed by leading investors including Lightspeed Venture Partners, Khosla Ventures, Coatue, DST Global, and the board members of 20+ F500s. The results reflect this approach: a 100% production deployment success rate for our customers, and a place among the few enterprise AI companies that run a profitable business.

WHAT WE ARE LOOKING FOR

At Distyl, Research Engineers build the bridge between frontier AI research and production systems that deliver real business value. This role is for engineers who are excited to investigate how AI systems should be designed, rapidly prototype new ideas, and turn promising concepts into reliable systems that work inside real customer environments.

Research Engineers operate at the intersection of applied research, systems engineering, and customer-facing deployment. They design and implement compound AI systems, run experiments to understand system behavior, build evaluation frameworks, and collaborate closely with AI Researchers, AI Engineers, and customer stakeholders. Their work is not limited to demos or isolated prototypes: they help turn new techniques into robust systems that can be measured, operated, and improved in production.

KEY RESPONSIBILITIES

- Design and build data systems that power reliable AI workflows across enterprise environments
- Develop pipelines for collecting, cleaning, transforming, labeling, and evaluating domain-specific data used by AI systems
- Create data quality frameworks that identify coverage gaps, ambiguity, drift, duplication, leakage, and other failure modes
- Build tools and workflows that help teams turn raw customer data into usable context for retrieval, evaluation, reasoning, and execution
- Partner with AI Researchers and AI Engineers to understand how data quality affects system behavior and production outcomes
- Develop synthetic data, annotation, and feedback-loop strategies to improve system performance in areas where real-world data is sparse or noisy
- Analyze customer workflows and datasets to determine what information AI systems need, where that information should come from, and how it should be represented
- Communicate clearly with internal teams and customer stakeholders about data assumptions, limitations, risks, and tradeoffs

WHO YOU ARE

- Experience Building Data Systems for AI: You have built data pipelines, evaluation datasets, labeling workflows, retrieval corpora, or similar systems that improve model or agent behavior
- Strong Data Engineering Fundamentals: You write clean Python and SQL, understand data modeling and pipeline reliability, and can build systems that are maintainable under production constraints
- Research-Oriented Builder: You are comfortable investigating how data quality, structure, and representation affect AI system performance
- AI-Native Working Style: You use AI tools daily to accelerate coding, analysis, debugging, exploration, and workflow automation
- Comfort with Ambiguous Data: You can reason through messy enterprise datasets, incomplete documentation, conflicting business definitions, and changing requirements
- Bias Towards Measurement: You prefer to make data quality and system behavior observable through concrete metrics, evaluations, and experiments
- Customer Environment Readiness: You can work directly with customer teams to understand their data, ask precise questions, and explain tradeoffs clearly
- Ownership Mentality: You take responsibility for whether the data layer enables the AI system to deliver reliable value in production

WHAT WE OFFER

- The base salary range for this role is $150K – $250K, depending on experience, location, and level. In addition to base compensation, this role is eligible for meaningful equity, along with a comprehensive benefits package
- 100% covered medical, dental, and vision for employees and dependents
- 401(k) with additional perks (e.g., commuter benefits, in‑office lunch)
- Access to state‑of‑the‑art models, generous usage of modern AI tools, and real‑world business problems
- Ownership of high‑impact projects across top enterprises
- A missi