AI Infrastructure Engineer

Intercom · London, UK
Full-time · Senior · Posted 4 hours ago

About this role

Intercom is the AI Customer Service company on a mission to help businesses provide incredible customer experiences. Our AI agent Fin, the most advanced customer service AI agent on the market, lets businesses deliver always-on, impeccable customer service and ultimately transform their customer experiences for the better. Fin can also be combined with our Helpdesk to form a complete solution, the Intercom Customer Service Suite, which provides AI-enhanced support for the more complex or high-touch queries that require a human agent. Founded in 2011 and trusted by nearly 30,000 global businesses, Intercom is setting the new standard for customer service. Driven by our core values, we push boundaries, build with speed and intensity, and consistently deliver incredible value to our customers.

What's the opportunity?

We're looking for Senior+ AI Infrastructure Engineers to build the systems that train and serve Intercom's next generation of AI products. Intercom is an AI company that builds from the GPU all the way up to a user-facing agent that resolves millions of customer service queries a month.

You'll join a small, highly technical team working at the cutting edge of modern AI infrastructure. The AI Infra team built the training pipelines and runs the inference for custom models like Fin Apex, which outperforms frontier models on customer service tasks and is the foundation of the AI Group's full-stack approach to AI.

We're particularly interested in engineers who have a track record of working on model training or model inference at scale, or on low-level GPU coding (e.g. CUDA, Triton). Experience with one is great; experience with several is even better.

What will I be doing?

As a Senior AI Infrastructure Engineer focused on model training and inference, you will:

- Implement and scale training pipelines for large transformer and LLM models, from data ingestion and preprocessing through distributed training and evaluation.
- Build and optimize inference services that deliver low-latency, high-reliability experiences for our customers, including autoscaling, routing, and fallbacks.
- Work on GPU-level performance: tuning kernels, improving utilization, and identifying bottlenecks across our training and inference stack.
- Collaborate closely with ML scientists to implement cutting-edge training and inference methods and bring them to production.
- Play an active role in hiring, mentoring, and developing other engineers on the team.
- Raise the bar for technical standards, reliability, and operational excellence across Intercom's AI platform.

Profile we're looking for

These are indicative, not hard requirements. We're looking to hire Senior+ AI Infrastructure Engineers. You're likely a great fit if:

- You have 5+ years of experience in software engineering, with a strong track record of shipping high-quality products or platforms.
- You hold a degree in Computer Science, Computer Engineering, or a related field, or have equivalent experience with very strong fundamentals.
- You have hands-on experience with one or more of the following:
  - Model training (especially transformers and LLMs).
  - Model inference at scale (again, especially transformers and LLMs).
  - Low-level GPU work, such as writing CUDA or Triton kernels.
- You are comfortable working in production environments at meaningful scale (traffic, data, or organizational).
- You communicate clearly, can explain complex technical topics to different audiences, and enjoy close collaboration with both engineers and non-engineers.
- You take pride in strong technical fundamentals, love learning, and are willing to invest in your own development.
- You have deep knowledge of at least one programming language (for example Python, Ruby, Java, or Go). Specific language experience is less important than your ability to write clean, reliable code and learn new stacks quickly.
Bonus skills & attributes

None of these are required, but they're nice to have:

- Experience at AI-native companies that train and/or run inference for their own models (e.g. modern AI labs or AI-native product companies).
- Experience running training or inference workloads on Kubernetes.
- Experience with AWS or other major cloud providers.
- Production experience with Python in ML or infrastructure contexts.
- Demonstrated passion for technology through personal projects, open source, meetups, or publishing content about your work and learnings.

Benefits

We are a well-treated bunch, with awesome benefits! If there's something important to you that's not on this list, talk to us!

- Competitive salary and equity in a fast-growing start-up.
- We serve lunch every weekday, plus a variety of snack foods and a fully stocked kitchen.
- Regular compensation reviews - we reward great work!
- Unlimited access to Claude Code and best-in-class AI tools; experimentation and building are encouraged and celebrated.
- Pension scheme with match up to 4%.
