Infrastructure Engineer

MatX · Mountain View, CA · $200k - $300k
Full-time · Principal · Posted 1 month ago

About this role

What MatX Is Building

MatX is designing a custom chip. Our engineering team works in Rust and SystemVerilog, builds with a hermetic build system, and runs compute-intensive verification workloads on a managed cluster. The infrastructure that supports this work (CI/CD, compute fleet, shared filesystems, developer environments) is what you'd own.

This is a small team. There's no ops org, no ticket queue, no on-call rotation. You'd work directly with engineers who'll tell you "X is broken" or "we need Y", and you'd figure out how to make it happen. You'd use the same tools they do: git, SSH, the same VMs. The current infrastructure was built by engineers who needed it and then moved on to other things; you'd inherit it, clean it up, improve it, and make it yours.

We use AI-assisted development tools extensively as a force multiplier. You should be comfortable with that or willing to learn.

What You'll Do Here

You'd be the person who makes sure 20+ engineers are set up to do their work without thinking about infrastructure.
Concretely:

Compute & Cluster, Storage Management
- Manage an HPC-style compute cluster on GCP: job scheduling, autoscaling, node provisioning
- Provision and maintain developer VMs with consistent tooling, shared storage mounts, and remote desktop access
- Manage shared network storage for home directories, CAD tools, and IP libraries

CI/CD
- Maintain the self-hosted CI runner fleet on GCP (registration, scaling, image management)
- Own the remote build and caching infrastructure
- Debug CI failures that turn out to be infrastructure, not code: runner registration races, mount timing, network conflicts

Developer Environment
- Keep the tool stack working: build system, EDA tools on shared storage, license servers, automounted shares
- Onboard new engineers onto the development environment
- Solve the kind of problems that start with "my build is slow" and end with tracing a metadata server timeout to a missing link-local route

Infrastructure as Code
- All infrastructure is codified and version-controlled. You'd maintain and extend modules for VMs, networking, fleet policies, IAM, DNS
- Execute migrations: subnet changes, fleet resizing, blue/green cutovers
- Review your own plans carefully: a bad apply can take down the shared filesystem

Who You Are
- Deep Linux systems knowledge: you can debug from userspace down to syscalls and routing tables
- Infrastructure-as-code experience on a major cloud provider (we use GCP, so GCP experience is preferred)
- Comfort with networking fundamentals: VPCs, subnets, DNS, firewalls, shared filesystems, SSH tunneling
- Experience managing HPC job schedulers or similar batch compute systems
- Git proficiency: you'll interact with the same repos and PR workflows as the engineering team
- Hands-on proficiency with core networking and security concepts that influence infrastructure integrity
- Willingness to read code you didn't write to understand what infrastructure it needs
- You don't need to be a software engineer, but you should be able to read a build rule, a Rust error message, or a CI workflow and figure out what went wrong
- This is a hybrid role that requires you to work from our Mountain View, CA office 3 days a week, Tuesday through Thursday

Bonus Points If You Have
- EDA/semiconductor toolchain familiarity (Synopsys, Cadence)
- Rust or Python scripting (for tooling, not product code)
- Experience with OS-level fleet management (policies, images, package distribution)
- You don't need to write RTL or understand hardware architecture, but this is a plus

Compensation

The US base salary for this full-time position is determined based on a variety of factors, including role, experience, location, job-related skills, and relevant education and training. Career length is only a guideline for compensation.

$200,000 - $300,000 + equity

What We Offer
- A Stake in Our Success: A cash/equity mix that fits your needs, and the option to do early exercise
- Health & Wellness: Company-subsidized Health, Dental, Vision, and Life insurance; pre-tax Health Savings Accounts with generous company contribution (even if you don't)
- Time To Recharge: 4 weeks paid time off (accrued), 12 company holidays, and 3 weeks remote/flexible work per year
- Support to Parents: Up to 12 weeks of paid parental leave, regardless of your path to parenthood
- Learning & Development: $1,500 yearly towards your professional development, e.g. conferences, courses, and other learning opportunities
- Team Connection: Team lunches, quarterly off-sites, and regular town halls
- Financial Wellbeing: 401K and/or Roth IRA, with 5% company contribution, even if you don't!
- Flexible Spending Accounts: Pre-tax spend accounts for medical, dental/vision, dependent care, parking, and transit expenses
- Commute On Us: For those commuting up to 1 hour, put your rideshare cost on our company card and reclaim the drive time to get work done!
