ML Compiler Engineer

HuggingFace · Remote (Global) · $180k - $320k

Full-time · Senior

ml-compiler python c++ cuda quantization optimization inference

About this role

Optimize model inference for HuggingFace's inference API. Work on model compilation, quantization, and hardware-specific optimization. Make models run faster and cheaper for millions of users.

Requirements

Experience with ML compilers (TVM, XLA, TensorRT) or model optimization. Strong C++ and Python skills. Understanding of hardware architectures (GPU, TPU).
