Director, Data Platform Engineering

Lila Sciences · San Francisco, CA · $232k - $346k

Full-time · Lead · Posted 3 months ago

data-pipeline fine-tuning cloud llm agents platform

About this role

Your Impact at Lila

Lila is seeking a highly motivated and experienced engineering leader to lead the team responsible for Lila's product data platform. You will own the data platform and infrastructure end-to-end: architecture, delivery, reliability, and developer/data scientist experience. Our mission is to deliver Scientific Super Intelligence through a reliable, scalable, and self-service infrastructure for data ingestion, storage, processing, and interaction, enabling AI/ML, product teams, and scientists to build data-intensive applications with confidence and speed. Our platform supports analytical and machine learning workloads across Lila, serving autonomous DBTL cycles, instrument data pipelines, and AI inference workflows.

You will be responsible for building and leading a team of talented engineers, driving technical strategy, and ensuring the scalability and performance of the data platform's data management and data serving capabilities. You will work closely with data scientists, data engineers, lab scientists, and product teams to understand their needs and deliver innovative solutions that leverage the power of cutting-edge data processing technologies.

What You'll Be Building

Team Leadership:
Build, mentor, and manage a high-performing team of 8-12 data engineering experts.
Evaluate and adopt modern data infrastructure, including real-time streaming (Kafka, Flink), columnar engines (DuckDB, ClickHouse), lakehouse, and cloud-native object storage architectures.
Foster a culture of collaboration, innovation, and continuous improvement.
Provide technical guidance and mentorship to team members, promoting their professional growth.
Conduct performance reviews, provide feedback, and identify opportunities for training and development.
Manage team workload, prioritize projects, and ensure timely delivery of high-quality solutions.

Technical Strategy and Execution:
Define and execute the technical roadmap for our data platform, aligning with Lila's overall data strategy.
Drive innovation in the data lakehouse and data serving ecosystem, exploring new technologies and approaches to improve usability, performance, scalability, and efficiency.
Ensure the reliability, availability, and security of our data processing infrastructure.
Collaborate with other engineering teams to integrate our data processing technologies with other Lila systems and services.

Stakeholder Management:
Partner with data scientists, data engineers, lab scientists, product managers, and other stakeholders to understand their data processing needs and requirements.
Communicate technical concepts and solutions effectively to both technical and non-technical audiences.
Advocate for best practices in data processing and engineering.
Manage expectations and ensure alignment across different teams.

Engineering Thought Leadership:
Represent Lila's data platform work at external conferences.
Deliver presentations and write blog posts highlighting Lila's leadership in big data processing.

Scientist and Engineering Productivity:
Drive innovative, agentic, and low-code solutions to deliver data interfaces: exploration, query, analytics, and ML/inference solutions at scale.

What You'll Need to Succeed

12+ years of software development experience, with a focus on data processing at scale.
5+ years of experience leading senior engineers.
Experience building on AWS/GCP primitives such as S3 + Athena/BigQuery, and with query engines.
Experience operating data platforms at petabyte scale with sub-second query latency requirements.
Experience managing data infrastructure supporting 100+ concurrent ML training and inference workloads.
Familiarity with LLM/AI-native data patterns: vector stores, embedding pipelines, and pre/mid/post training.
Track record of building data platforms in high-growth or early-stage environments where speed-to-value mattered as much as long-term architecture.
Hands-on coding in Python and modern backend frameworks.
Experience with infrastructure-as-code and containerized deployments (Kubernetes).
BS, MS, or Ph.D. in Computer Science or a related field of study.

Bonus Points For

Thought leadership in the community via presentations at conferences or blog posts.
Experience building and growing teams focusing on open source technologies.
Scientific data management and quality experience.
Experience building self-service data products/platforms where developer experience was a first-class product concern.

Compensation

We offer competitive base compensation with bonus potential and generous early-stage equity. Your final offer will reflect your background, expertise, and expected impact.

U.S. Benefits

Full-time U.S. employees receive a comprehensive benefits program including medical, dental, and vision coverage; employer-paid life and disability insurance; flexible time off with generous company-wide holidays; paid parental leave; an educational assistance program; commuter benefits