Engineering Manager, Platform Infrastructure
full-time
junior
Posted 6 days ago
About Decagon
Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experiences.
Our technology enables industry-defining enterprises like Avis Budget Group, Block’s Cash App and Square, Chime, Oura Health, and Hunter Douglas to deploy AI agents that power personalized, deeply satisfying interactions across voice, chat, email, SMS, and every other channel.
We’re building a future where customer experience shifts from support tickets and hold music to faster resolutions, richer conversations, and deeper relationships. We’re proud to be backed by world-class investors who share that vision, including a16z, Accel, Bain Capital Ventures, Coatue, and Index Ventures, along with many others.
We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values — Just Get It Done, Invent What Customers Want, Winner’s Mindset, and The Polymath Principle — shape how we work and grow as a team.
About the Team
The Infrastructure team builds and operates the foundations that power Decagon: platform, model inference, compute, data, and developer experience. We partner closely with product, research, and applied AI teams to deliver high-scale, low-latency systems with clear SLOs and great developer ergonomics.
We organize around a couple of focus areas:
- Platform: The foundational cloud stack — networking, compute, storage, security, and infrastructure-as-code — to ensure reliability, scale, and cost efficiency. CI/CD, paved paths, and core services that make shipping fast, safe, and consistent across teams.
- ML & Data: Streaming/batch data platforms powering analytics/BI and customer-facing telemetry, including for customer-managed and on-prem environments. Realtime databases that enable low-latency agents. GPU and model-serving platforms for LLM inference with multi-provider routing.
Our mission is to deliver magical support experiences — AI agents working alongside humans to resolve issues quickly and accurately.
About the Role
We're looking for a hands-on Engineering Manager to lead the Platform team. This is a deeply technical player/coach role that sits at the foundation of everything Decagon ships. You'll lead the team responsible for the compute, networking, CI/CD, and deployment systems that every other engineering team builds on — from our multi-cloud SaaS environments to the single-tenant VPC and on-prem deployments we operate for regulated enterprise customers like major financial institutions.
You'll stay close to the code and systems — reviewing designs, participating in incident response, and contributing directly when it helps the team move faster. You'll also lead by example on AI-assisted engineering, setting the standard for how the team uses AI coding tools to ship higher-quality work more quickly.
You'll hire and develop a high-performing team while partnering closely with Security, Product Engineering, AI & Data Infrastructure, and customer-facing teams to make shipping fast and safe across a wide range of environments — from our primary cloud to air-gapped customer deployments. Success requires strong people leadership, crisp execution across concurrent enterprise commitments, and the technical depth to make sound architectural calls under real constraints.
In this role, you will
- Build, lead, and develop a high-performing team of infrastructure engineers, including hiring, coaching, and performance management.
- Own the technical strategy and roadmap for Decagon's Platform — compute, networking, CI/CD, IaC, and the deployment systems that underpin both SaaS and enterprise environments.
- Stay hands-on: review designs and PRs with depth, lead architecture for hard problems, and contribute code directly when the team needs it — whether that's a critical migration, an on-call escalation, or an enterprise deployment under time pressure.
- Drive architecture for multi-cloud and on-prem/cloud-prem deployments, including single-tenant VPC topologies, private connectivity, and air-gapped environments for regulated customers.
- Set reliability, security, and cost standards across the platform, and build an operating cadence (on-call, incident review, capacity planning) that prevents repeated incidents and keeps the platform healthy as we scale.
- Invest in developer experience — paved paths, golden templates, and CI/CD systems that let product teams ship quickly without compromising safety or consistency.
- Raise the bar on AI-assisted engineering: define how your team uses AI coding tools, agents, and internal tooling to deliver faster with higher quality, and build the workflows, evals, and guardrails that make this durable.
- Partner with Security, Product Engineering, and customer-facing teams to deliver enterprise deployments on aggressive timelines, navigate compliance requirements, and translate customer constraints into durable platform capabilities.