Machine Learning Engineer III, Core Agents

Box · Redwood City, CA · $175k - $219k
Full-time · Mid-level · Posted 3 weeks ago

About this role

WHAT IS BOX?

Box (NYSE:BOX) is the leader in Intelligent Content Management. Our platform enables organizations to fuel collaboration, manage the entire content lifecycle, secure critical content, and transform business workflows with enterprise AI. We help companies thrive in the new AI-first era of business. Founded in 2005, Box simplifies work for leading global organizations, including JLL, Morgan Stanley, and Nationwide. Box is headquartered in Redwood City, CA, with offices across the United States, Europe, and Asia.

By joining Box, you will have the unique opportunity to continue driving our platform forward. Content powers how we work. It’s the billions of files and information flowing across teams, departments, and key business processes every single day: contracts, invoices, employee records, financials, product specs, marketing assets, and more. Our mission is to bring intelligence to the world of content management and empower our customers to completely transform workflows across their organizations. With the combination of AI and enterprise content, the opportunity has never been greater to transform how the world works together, and at Box you will be on the front lines of this massive shift.

WHY BOX NEEDS YOU

AI is transforming how enterprises work, and Box is building an enterprise-grade Agents Platform at the core of the Box Content Cloud. Our platform, built on LangGraph, enables teams across Box and our customers to design, deploy, and operate AI agents that handle real-world enterprise workflows, from content understanding and generation to intelligent metadata, automation, and complex, multi-step orchestrations.

As a founding ML Engineer on the Core Agents team, you will build and evaluate the foundational agents that power the Box AI ecosystem, including DeepSearch, DeepResearch, Extract, and Compose. You’ll design techniques for intent detection, ranking, evaluation, retrieval-augmented generation (RAG), and multi-agent orchestration, while also establishing metrics and evaluation frameworks to measure agent quality. Your work will shape how agents retrieve, reason, and act on enterprise content with high accuracy and trustworthiness. You’ll collaborate closely with platform engineers to build the core components of the Agents Platform that enable these agents to run at scale, while also empowering other Box teams and customers to configure and customize agents for their workflows.

WHAT YOU'LL DO

- Build, evaluate, and evolve foundational agents such as DeepSearch, DeepResearch, Extract, and Compose.
- Develop techniques for intent detection, query understanding, ranking, and RAG to improve accuracy and relevance.
- Define metrics, evaluation pipelines, and benchmarks for agent quality, including precision/recall, factual grounding, and latency trade-offs.
- Research and implement best practices in retrieval, orchestration, and evaluation of multi-agent workflows.
- Collaborate with platform engineers to design core components that enable secure, reliable, and scalable deployment of agents.
- Partner with product teams to translate enterprise use cases into agentic solutions, ensuring measurable improvements in user experience.
- Contribute to technical discussions, share research insights, and help define the roadmap for Box’s agent ecosystem.

WHO YOU ARE

- You are passionate about building and evaluating AI agents that solve enterprise problems.
- You enjoy working at the intersection of machine learning and distributed systems, bridging research with production.
- You’ve designed or evaluated ML systems for search, ranking, RAG, or conversational AI.
- You like to be an owner and strive to do work you’re proud of, both technically and in your team interactions.
- You are collaborative, curious, and comfortable mentoring or learning from other engineers and ML practitioners.

Must Have Experience

- 3+ years of industry experience building or evaluating ML-powered systems.
- MS or PhD degree in Machine Learning, Computer Science, or a related field.
- Strong background in machine learning, information retrieval, or natural language processing.
- Proficiency with at least one programming language such as Python, Java, or Scala.
- Experience designing, training, and evaluating ML models in production.
- Familiarity with retrieval systems, ranking models, RAG pipelines, or intent classification.

Nice To Have Experience

- Advanced degree in computer science, machine learning, or related field.
- Hands-on experience with LangChain, LangGraph, or other agent frameworks.
- Familiarity with LLMs, embeddings, semantic search, indexing, and relevance optimization.
- Experience with cloud-based ML platforms such as Vertex AI, AWS Bedrock, or SageMaker.
- Experience with Kubernetes-based systems for deploying and scaling ML workloads.
- Research or applied experience in evaluation of generative AI systems (factuality, safety, grounding).

Box lives its values, with c
