Infrastructure Engineer
full-time
lead
Posted 1 week ago
About this role
WHO WE ARE
Our mission is to make the world programmable. Sight is one of the key ways we understand the world, and soon this will be true for the software we use, too.
We’re building the tools, community, and resources needed to make the world programmable with artificial intelligence. Roboflow simplifies building and using computer vision models. Today, over 1M developers, including those from half the Fortune 100, use Roboflow’s open source and hosted machine learning tools. Use cases include counting cells https://blog.roboflow.com/cancer-research-computer-vision/ to accelerate cancer research, improving construction site safety https://blog.roboflow.com/preventing-accidents-on-construction-sites-with-computer-vision/, digitizing floor plans https://blog.roboflow.com/floor-plan-analysis-computer-vision/, preserving coral reef populations https://blog.roboflow.com/reefos-supercharging-coral-reef-restoration-with-ai/, guiding drone flight https://blog.roboflow.com/georeferencing-drone-videos/, and much more https://roboflow.com/templates.
Roboflow is supported by great customers and investors, having raised over $63 million from Y Combinator, Google Ventures, Craft Ventures, Sam Altman, and Lachy Groom, among other leading software investors.
Roboflowers love building great things with passionate teammates. We value ownership, accountability, and a bias toward action—whether it's a big initiative or a small fix. You’re naturally curious, hands-on with new tech (maybe even played with ChatGPT or AI products early on), and prefer to show your work over talking about it. Many of us have founder mindsets and thrive in Roboflow’s high-autonomy environment—some even started as side hustlers in school.
WHAT WE'RE LOOKING FOR
Primarily, you like to make great things with passionate colleagues. You are someone who likes to own outcomes, not only inputs. You’re motivated by having responsibility and accountability. You’re eager to ‘do the work,’ big and small.
You’re curious and learning about new technologies, perhaps an early tinkerer with MLOps products. You show more than you tell.
You’re motivated by the question, “How can I improve this?” and have a track record of doing so, even in ways adjacent to your role. Much of our current team is made up of former founders who thrive in the level of autonomy at Roboflow. Maybe you had a side hustle in high school or college.
Many Roboflowers have used our tools before joining. One of the best ways to stand out among other applicants is to write about something you have built with Roboflow or contribute to one of our open source projects https://roboflow.com/open-source. Likewise, we highly value users with meaningful contributions to successful open source devtool and security projects.
WHAT YOU'LL DO
As a member of our infrastructure team, you'll be at the heart of a fast-paced startup environment. Your primary focus will be on striking the right balance between rapid delivery, high reliability, and robust security. This isn't a traditional, siloed role; you'll need to wear many hats, acting as an infrastructure engineer one moment and a developer or even a security analyst the next.
You will be securing, scaling, and maintaining the core infrastructure that powers our product. This includes our cloud architecture, databases, file storage, search clusters, microservices, and machine learning pipelines. You'll work closely with our product team and collaborate across the company on product, operations, and customer-facing projects, constantly context-switching to solve the next critical challenge.
SKILLSET
We're looking for a versatile engineer excited by high-impact challenges. At Roboflow, we are AI-native: we expect our team to use AI to accelerate everything from writing code and fixing bugs to analyzing security, cost, and performance. Experience in some or all of the following areas will be crucial:
- Production experience with Kubernetes: Building and managing containerized applications at scale.
- Infrastructure-as-Code (IaC): Using Terraform, Helm charts, Bash scripting, and Python to automate everything.
- Scale & Site Reliability: Operating, monitoring, and scaling large-scale applications (especially in ML/AI) in AWS and/or GCP.
- Development Skills: Proficiency in Node.js and Python, with the ability to collaborate with full-stack developers on designing and operating SaaS applications.
- ML/Big Data Ops: Hands-on experience with the infrastructure required for machine learning at scale (GPUs, Docker, Kubernetes) and familiarity with libraries like PyTorch or TensorFlow.
- CI/CD Automation: Experience with tools like GitHub Actions or Spacelift to build and deploy code efficiently.
- Pragmatic Security: Awareness of security best practices for cloud operations and how they can be applied to startup environments.
- AI-Native Engineering: Leveraging LLMs and AI tools to accelerate the development lifecycle.