Machine Learning Intern

Insitro · San Francisco, CA
Full-time · Junior · Posted 1 month ago

About this role

THE OPPORTUNITY

Global drug development productivity is declining, with an overall failure to develop effective treatments for many increasingly prevalent complex diseases affecting millions of patients per year. We seek to tackle this by combining innovative machine learning techniques with pioneering technologies that measure multiple cellular aspects, aiming to drastically improve and accelerate how drugs are discovered and developed.

We are looking for highly motivated interns to join our Compute team as machine learning scientists, working at the intersection of machine learning and life sciences, for our Summer 2026 cohort. You will partner directly with a team mentor in developing and/or applying ML methods to process and analyze large-scale datasets from multiple modalities over the course of the summer (11-12 weeks). These internships can be based at our South San Francisco headquarters with a hybrid work schedule, or can be remote, depending on the team mentor's location and business need.

Compute is a diverse team that works across the company, spanning imaging, omics, statistical genetics, pan-modality therapeutics discovery, clinical research, and research software engineering.
EXAMPLES OF AREAS & TOPICS YOU WILL BE WORKING ON:

- Computational Biology:
  - Leverage publicly available single-cell transcriptomics resources to extract insights about disease mechanisms relevant to the therapeutic areas
- Methods for Omics & Imaging data modalities:
  - Develop, productionize, and deploy cutting-edge ML approaches to integrate large-scale multi-modal phenotypic datasets
- Statistical and Translational Genetics:
  - Develop workflows to enable post-GWAS (Genome-Wide Association Scan) analysis of results, e.g., fine-mapping
  - Translational genetics deep dives: enabling higher-throughput annotation and exploration of candidate genes from our discovery efforts
  - Design of statistical methods to improve rare variant burden tests, and methods to improve power in longitudinal phenotypes
- Integrative Phenotyping:
  - Develop ML models for imputing disease-relevant phenotypes from high-content clinical imaging datasets, e.g., MRI, PET-CT
  - Develop ML methods for disentangling and genetically interpreting axes of variation in complex phenotypes
  - Use LLMs to extract disease-relevant information from medical records
- Molecular Machine Learning:
  - Explore generative models of small molecules, biologics, and/or oligonucleotide therapeutics in various data modalities, such as 2D and 3D representations, for hit-to-lead drug discovery efforts
  - Develop new geometric deep learning methods to better characterize nuanced molecular properties and relationships
- Computational Microscopy:
  - Identify and prototype novel microscopy-driven phenotyping workflows, including hardware acquisition, post-processing, and featurization
  - Develop robust software tooling to support the deployment of new and existing methods for general use by insitro scientists
  - Optimize existing microscopy acquisition methods in both hardware and software, using ML feature outputs to benchmark improvements

WHAT YOU WILL LEARN THROUGH THIS EXPERIENCE:

- In the course of the internship, you will learn diverse machine learning techniques, rigorously analyze complex datasets, and design metrics to ensure the robustness of our methods.
- You can expect to develop and prototype solutions to enable ML-based decisions in our workflows.
- You will work closely with machine learning engineers and scientists, biologists, chemists, microscopy experts, and automation engineers.
- You will be mentored by one of our senior researchers, who has significant experience in machine learning and/or data science.
- You will also attend our machine learning team meetings and be exposed to a diverse set of novel technologies and machine learning concepts that tackle various biological questions.

IN RETURN, WE WILL SUPPORT YOU BY:

- Placing a high degree of trust in your ideas and execution.
- Bringing you up to speed in the domain of ML-enabled drug discovery.
- Striving to provide a low-stress work environment.
- Making ourselves available for collaboration.
- Caring about you as a whole person, not a resource.
- Being a well-funded startup with a stable runway.

ABOUT YOU

- Working towards a BS, MS, or Ph.D. in engineering, computational biology, systems biology, computer science, mathematics, statistics, life science, chemistry, physics, or a related field.
- Proficiency in one or more general-purpose programming languages. We primarily use Python.
- Interest in using and developing brand new statistical and machine learning methods inspired by real problems.
- Curiosity about human physiology or disease biology.
- Committed to writing high-quality, well-commented code and documentation.
- Ability to communicate effectively and collabora
