Software Engineer, Web Crawling
full-time
mid
Posted 4 months ago
About this role
Exa is building a search engine from scratch to serve every AI application. We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to index it, and develop high-performance vector databases in Rust to search over it. We also own a $5M H200 GPU cluster that regularly lights up tens of thousands of machines.
As a Web Crawling engineer, you'd be responsible for crawling the entire web. Basically, you'd build Google-scale crawling!
DESIRED EXPERIENCE
- You have extensive experience building and scaling web crawlers, or would be excited to ramp up very quickly
- You have experience with a high-performance language (C++, Rust, etc.)
- You are familiar with TypeScript, Playwright, modern web design, and CDP (Chrome DevTools Protocol)
- You’re comfortable optimizing a system to an exceptional degree
- You care about the problem of finding high quality knowledge and recognize how important this is for the world
EXAMPLE PROJECTS
- Build a distributed crawler that can handle 100M+ pages per day
- Optimize crawl politeness and rate limiting across thousands of domains
- Design systems to detect and handle dynamic content, JavaScript rendering, and anti-bot measures
- Create intelligent crawl scheduling and prioritization algorithms for maximum coverage efficiency
This is an in-person opportunity in Singapore. We’re happy to sponsor international candidates.
In addition to premium healthcare benefits (medical, dental, vision), we also offer fertility benefits and a monthly wellness stipend to all of our employees.