AI Engineering Intern - Growth Team
Full-time
Junior
Posted 5 days ago
About this role
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.
Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.
About the Team
The Growth Team drives AI adoption across Cerebras. We are a multi-disciplinary team that owns product, engineering, and marketing responsibilities.
We build agentic workflows, internal knowledge systems, and developer infrastructure that help our engineering orgs – kernel, design verification, and cloud platform – ship faster. Our stack includes Claude Code, MCP (Model Context Protocol), RAG pipelines, and multi-agent architectures, and we work directly with the teams building Cerebras’s chips and inference platform.
This summer, you’ll embed with engineers across the company to build AI tooling that accelerates real hardware and software development workflows. Your work won’t sit on a shelf – you’ll ship internal tools that our engineers actually use.
About the Role
We’re looking for an AI Engineer Intern to join the Growth Team for Summer 2026. You’ll own end-to-end workstreams: scoping problems with engineering teams, building agentic systems, iterating based on user feedback, and shipping a working internal tool by the end of the internship.
This is a 12-week, paid, in-person internship based in our Sunnyvale, CA or Toronto, ON office, running June through August 2026.
What You’ll Work On
1. AI Agents for Design Verification & ASICs
Work directly with the design verification and ASICs teams to build AI agents that speed up chip development. This could include automated test generation, debug triage, or verification workflow acceleration. You’ll learn how silicon gets shipped and build tooling that compresses the iteration cycle.
2. AI Agents for Kernel & Model Bringup
Build agentic workflows that accelerate how we bring up new models on Cerebras hardware. This means working with the kernel team to identify bottlenecks in the bringup process and building AI-powered tools – using Claude Code, MCP integrations, and RAG systems – that help engineers move faster from first compile to production readiness.
3. AI Agents for Cloud Platform & SRE
Partner with the cloud platform team to build AI agents that reduce incident response time and speed up SRE workflows. Think automated log analysis, intelligent runbook execution, and agentic debug loops that help on-call engineers resolve issues faster.
Capstone: Ship an Internal Tool
By the end of the internship, you’ll have built and shipped at least one internal tool – whether that’s a RAG-powered knowledge base, an MCP integration, or a multi-agent system – that engineering teams at Cerebras are actively using. You’ll present your work and its impact to engineering leadership.
What We’re Looking For
Currently pursuing a Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field
Strong Python skills – you’ll be writing real infrastructure code, not just notebooks
Comfortable with ambiguity – you’d rather start building and iterating than wait for a perfect spec
Genuine interest in AI tooling and developer productivity – you care about how engineers work, not just what they build
Familiarity with LLM APIs (Chat Completions, tool calling, structured outputs)
Bonus:
Experience with RAG systems, vector databases, or embedding pipelines
Experience with Claude Code, Cursor, or similar AI coding tools in personal or academic projects
Familiarity with MCP (Model Context Protocol) or agentic framework patterns
Exposure to hardware development workflows (chip design, verification, ASIC toolchains) or SRE/infrastructure operations
Able to work in-person from our Sunnyvale, CA or Toronto, ON office for the duration of the internship
What You’ll Get
Direct mentorship from the Head of Growth, with regular exposure to engineering leads across kernel, DV, and cloud platform teams
Real impact – you’ll ship internal tools that engineers at Cerebras actually use