Data Engineering Team Lead (Agentic Search)

Nebius · Israel

Full-time · Lead · Posted 1 month ago

payments data-pipeline search agents cloud data-engineering

About this role

About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure. Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI. Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

The Product: In a rapidly evolving world, trust in AI depends on AI agents being grounded in fresh, verified real-world data. Search is the foundation that makes this possible. We are building an agent-native search platform designed specifically for AI systems rather than human users. Our product provides programmatic, low-latency, and observable search APIs that AI agents use to retrieve, filter, and reason over real-world information at scale. Behind every search request is a rich stream of signals - query patterns, retrieval decisions, crawling outcomes, ranking quality, usage and revenue events. Turning that stream into a trustworthy, queryable data platform is what makes the product improvable, the business measurable, and the models trainable.

The Role: We are looking for a Data Engineering Team Lead to lead our data platform - both the data behind our search quality and ML pipelines, and the analytics that drive product and business decisions. In this role, you will lead a team of data engineers and own the end-to-end data lifecycle: ingestion from production services, helping model and architect our data warehouse, and exposing clean, well-documented data to researchers, engineers, and analysts across the company.
The platform spans tens of terabytes and ingests from tens of proprietary and third-party sources - our own search engine and its components, CRM, billing, identity, and product analytics across multi-region production environments. Around 100 internal users rely on it daily. You will lead our data platform, hire and grow the team, and stay hands-on enough to design and review the systems your team ships.

In this position, your responsibility will be to:
- Lead and architect Tavily's data platform - from real-time ingestion through data warehouse medallion layers to consumer-facing datasets and dashboards
- Lead, hire, mentor, and grow a team of data engineers; set engineering standards for code quality, testing, documentation, and on-call
- Work closely with engineers across the company to make sure batch and streaming pipelines are built correctly
- Define and implement observability for the data platform: data quality checks, freshness monitors, lineage, schema evolution, and cost controls
- Partner with researchers, engineers, analysts, finance, and product managers to deliver trustworthy datasets for product and GTM analytics
- Define the objects, entities, and relationships that model Tavily's search domain - agent inputs, URLs, chunks, agent sessions, crawls, and the connections between them - and turn that mental model into a clean, queryable data model that the rest of the company can reason about
- Data governance: ensure the highest standards of data quality, integrity, and security across all environments

You may be a good fit if you:
- Have 5+ years of data engineering experience, with a focus on designing and implementing scalable, analytics-ready data models and cloud data warehouses (e.g., BigQuery, Snowflake)
- Have hands-on experience with Snowflake (or a comparable cloud data warehouse) and a strong grasp of data warehouse architecture, preferably the medallion architecture
- Have deep knowledge of databases (schema design, query optimization) and familiarity with NoSQL use cases
- Have expertise in modern data orchestration and transformation frameworks (e.g., Airflow, dbt)
- Have a solid understanding of cloud data services (e.g., AWS, GCP) and streaming platforms (e.g., Kafka, Pub/Sub)
- Have hands-on experience with the Spark / MapReduce paradigm and understand when distributed processing is the right tool
- Are fluent in Python and SQL for production data work
- Have operated data systems in production: debugged them under pressure, recovered from data incidents, and understand what it means to backfill a corrupted table

Benefits & Perks:
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and ownership
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams

What's it like to work at Nebius: Fast moving - Bold thinking - Constant growth - Meaningful impact - Trust and real ownership - Opportuni