Applied AI Engineer - Agent

General Intelligence · New York, NY
Full-time · Mid-level · Posted 6 months ago

About this role

We’re hiring an Applied AI Engineer to push the boundaries of our Cofounder agent. You’ll own core backend systems and applied LLM work: advancing agent reliability and autonomy, building evaluation pipelines, and shipping techniques that measurably improve agent performance. This is a hands-on role with high ownership across research-to-production: prototyping, instrumenting, evaluating, and deploying improvements that show up directly in user outcomes.

WHAT YOU’LL DO

- Design and implement agent improvements end-to-end: prompting strategies, tool selection, action planning, memory usage, safety/guardrails, and recovery paths
- Build robust evaluation pipelines for the agent: offline evals (golden tasks, regression suites, behavior tests), online metrics (latency, success rate, fallout modes, cost efficiency), and experimentation frameworks (A/B, canaries, guardrail thresholds)
- Productionize applied LLM techniques: function/tool-calling orchestration, self-reflection, retrieval/RAG, multi-agent handoffs, caching/embedding strategies, and hallucination reduction
- Improve core backend systems: reliable job orchestration, retries/backoff, idempotency, and auditability; scalable memory and context routing; data pipelines across Gmail, Slack, Notion, Linear, Google Workspace, etc.; observability and tracing for agent actions/outcomes
- Partner with product and infra to define success metrics and ship fast, safe iterations
- Write clean, well-tested code; document design decisions and runbooks

WHAT YOU’LL BRING

- 4+ years backend engineering experience, preferably Python (we care about impact over years)
- Hands-on LLM experience: prompt engineering, function-calling, retrieval, embeddings, evaluation design; you’ve shipped LLM features to production
- Track record building evaluation harnesses and using them to drive improvements (regression suites, task success metrics, cost/runtime tradeoffs)
- Solid distributed systems fundamentals: concurrency, reliability, performance, data modeling, lifecycle management
- Pragmatic experimentation: hypothesis → prototype → measured improvement → rollout
- Excellent debugging and instrumentation skills; you enjoy finding and fixing edge cases in the wild

NICE TO HAVE

- Experience with agent frameworks, tool orchestration, and memory architectures
- RAG systems in production (chunking, retrieval quality, freshness strategies)
- Redis, Postgres/Supabase, queues (e.g., Celery/Arq/SQS), and event-driven designs
- Observability stacks (Datadog, OpenTelemetry), and cost/latency optimization

WHY JOIN US

- Mission: build autonomous agents that run entire businesses
- Impact: ship core agent improvements that users feel immediately
- Velocity: small, senior team; fast decision cycles; high ownership
- Stack: modern tooling across AI orchestration, integrations, and memory systems

COMPENSATION

- Competitive salary and meaningful equity
- Comprehensive benefits and flexible work setup
