Member of Technical Staff, Kernels

Magic · San Francisco, CA · $225k - $550k
Full-time · Lead · Posted 2 years ago

About this role

Magic’s mission is to build safe AGI that accelerates humanity’s progress on the world’s most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than humans can alone. Our approach combines frontier-scale pre-training, domain-specific RL, ultra-long context, and inference-time compute to achieve this goal.

ABOUT THE ROLE

As a Kernel Engineer, you will design, implement, and maintain high-performance kernels to optimize throughput and latency during training and inference. Magic's long-context windows create distinct kernel optimization challenges around memory utilization, data movement, and sustained throughput.

WHAT YOU'LL WORK ON

- Design and implement kernels that support high-performance long-context behavior
- Own kernel design, implementation, deployment, and production reliability
- Focus on robustness, extensive testing, and functional correctness while pushing on performance
- Evaluate porting Magic’s compute kernels to alternative hardware options
- Co-design kernels in close collaboration with the training, inference, and RL teams
- For a sample of our work, see Magic-Attention, presented at GTC 2026: https://www.nvidia.com/gtc/session-catalog/sessions/gtc26-s82294/

WHAT WE’RE LOOKING FOR

- Low-level programming experience targeting AI accelerators such as NVIDIA Blackwell or Google TPUs
- Experience developing and optimizing GPU kernels with frameworks such as NCCL (https://developer.nvidia.com/nccl), MSCCLPP (https://github.com/microsoft/mscclpp), CUTLASS (https://github.com/NVIDIA/cutlass), CuTeDSL (https://docs.nvidia.com/cutlass/latest/media/docs/pythonDSL/cute_dsl.html), Triton (https://github.com/triton-lang/triton), Quack (https://github.com/Dao-AILab/quack), or Flash-Attention (https://github.com/Dao-AILab/flash-attention)
- Experience with other kernel authoring frameworks, such as Pallas (https://docs.jax.dev/en/latest/pallas/index.html) / Mosaic (GPU: https://docs.jax.dev/en/latest/pallas/gpu/index.html, TPU: https://docs.jax.dev/en/latest/pallas/tpu/index.html) or Mojo (https://www.modular.com/blog/matrix-multiplication-on-nvidias-blackwell-part-1-introduction), also maps well to the work on Magic's kernel team
- Strong depth over shallow breadth: for kernel engineering, we prefer candidates with deep expertise in computer architecture, low-level machine optimization, and code generation, with breadth across ML
- Agility, an ownership mindset, and grit

COMPENSATION, BENEFITS AND PERKS (US)

- Annual salary ranges between $225K and $550K, based on experience
- Equity is a significant part of total compensation, in addition to salary
- 401(k) plan with 6% salary matching
- Generous health, dental, and vision insurance for you and your dependents
- Unlimited paid time off
- Visa sponsorship and a relocation stipend to bring you to SF, if possible
- A small, fast-paced, highly focused team

Magic strives to be the place where high-potential individuals can do their best work. We value quick learning and grit just as much as skill and experience.

OUR CULTURE

- Integrity. Words and actions should be aligned
- Hands-on. At Magic, everyone is building
- Teamwork. We move as one team, not N individuals
- Focus. Safely deploy AGI. Everything else is noise
- Quality. Magic should feel like magic
