Machine Learning Engineer — Distillation

Featherless AI · Remote

Full-time · Mid-level · Posted 5 months ago

pytorch generative-ai llm deep-learning machine-learning research

About this role

We’re looking for a Machine Learning Engineer focused on model distillation to help us build smaller, faster, and more efficient models without sacrificing quality. You’ll work at the intersection of research and production, taking cutting-edge techniques and turning them into systems that scale. This is a hands-on role with real ownership: you’ll design distillation pipelines, run large-scale experiments, and ship models used in production.

WHAT YOU’LL DO

- Design and implement knowledge distillation pipelines (teacher–student, self-distillation, multi-teacher, etc.)
- Distill large foundation models into smaller, faster, and cheaper models for inference
- Run and analyze large-scale training experiments to evaluate quality, latency, and cost tradeoffs
- Collaborate with research to translate new distillation ideas into production-ready code
- Optimize training and inference performance (memory, throughput, latency)
- Contribute to internal tooling, evaluation frameworks, and experiment tracking
- (Optional) Contribute back to open-source models, tooling, or research

WHAT WE’RE LOOKING FOR

- Strong background in machine learning or deep learning
- Hands-on experience with model distillation (LLMs or other neural networks)
- Solid understanding of training dynamics, loss functions, and optimization
- Experience with PyTorch (or JAX) and modern ML tooling
- Comfort running experiments on multi-GPU or distributed setups
- Ability to reason about model quality vs. performance tradeoffs
- Pragmatic mindset: you care about shipping, not just papers

NICE TO HAVE

- Experience distilling LLMs or large sequence models
- Experience with inference optimization (quantization, pruning, kernels, etc.)
- Familiarity with evaluation for language models
- Open-source contributions or research publications
- Experience in early-stage or fast-moving startups

WHY JOIN

- Work on core model quality and cost efficiency, not side projects
- High ownership and direct impact on product and roadmap
- Small, senior team with strong research + engineering culture
- Competitive compensation + meaningful equity
- Remote-friendly, async-first environment