ML Compiler Engineer
full-time
senior
About this role
Optimize model inference for HuggingFace's inference API. Work on model compilation, quantization, and hardware-specific optimization.
Make models run faster and cheaper for millions of users.
Requirements
Experience with ML compilers (TVM, XLA, TensorRT) or model optimization. Strong C++ and Python skills. Understanding of hardware architectures (GPU, TPU).