Senior Data Center Deployment Engineer

Nebius · Oklahoma, United States · $125k - $180k

Full-time · Senior · Posted 4 months ago

cloud infrastructure

About this role

About Nebius

Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure.

Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.

Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

The role

Nebius operates large-scale, GPU-dense AI infrastructure across mission-critical data center environments. As a Senior Delivery Deployment Engineer, you will own the end-to-end delivery, deployment, and production readiness of next-generation GPU platforms inside our data centers.

This role sits at the intersection of hardware, Linux systems, and operational execution. You will lead on-site rack bring-up, validate NVIDIA-based AI systems, coordinate repairs, and ensure GB-series infrastructure moves from installation to fully operational production environments with precision and reliability.

You will collaborate closely with hardware engineering, networking, and infrastructure teams to deploy and stabilize H200 and B200-based GPU systems at scale.

Your responsibilities will include:

- Lead end-to-end deployment of GB-series racks within data center environments
- Oversee installation, bring-up, validation, and production readiness of NVIDIA H200 and B200-based servers
- Troubleshoot complex hardware, firmware, Linux OS, and networking issues
- Execute structured testing and validation procedures during deployment
- Develop and maintain basic Linux-based hardware health-check and diagnostic scripts
- Coordinate on-site hardware repairs, part replacements, and vendor escalations
- Drive root cause analysis and ensure corrective actions are implemented
- Manage and prioritize deployment timelines across multiple concurrent rollouts
- Provide technical leadership and guidance to on-site engineers and technicians
- Partner with networking and infrastructure teams to ensure seamless integration
- Document deployment processes, validation standards, and operational runbooks

What we expect you to have:

- Strong hands-on experience deploying and operating data center infrastructure
- Deep familiarity with GPU-dense systems, ideally NVIDIA H-series platforms
- Experience working with high-density rack deployments (GB-series or similar)
- Solid Linux experience, including troubleshooting and scripting
- Ability to diagnose issues across hardware, OS, firmware, and network layers
- Experience coordinating field repairs and working directly with hardware vendors
- Proven experience leading technical teams or overseeing field operations
- High ownership mindset and ability to operate in production-critical environments
- Clear communication skills and ability to collaborate across distributed teams

It will be an added bonus if you have:

- Experience deploying AI or HPC clusters at scale
- Familiarity with automated provisioning or infrastructure lifecycle systems
- Background in hardware qualification, burn-in testing, or factory validation
- Experience supporting rapid infrastructure expansion
- Exposure to ARM-based or heterogeneous compute environments

Working conditions:

- Collaboration with globally distributed engineering and operations teams

Key employee benefits:

- Health insurance: 100% company-paid medical, dental, and vision coverage for employees and families
- 401(k) plan: up to 4% company match with immediate vesting
- Parental leave: 20 weeks paid for primary caregivers, 12 weeks for secondary caregivers
- Remote work reimbursement: up to $85/month for mobile and internet
- Disability & life insurance: company-paid short-term, long-term, and life insurance coverage

Compensation

We offer competitive salaries, ranging from $125k to $180k base, plus quarterly performance bonuses.

Join Nebius today and help build the software that powers the next generation of AI infrastructure.

Benefits & Perks:

- Competitive compensation
- Career growth and learning opportunities
- Flexibility and ownership
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams

What's it like to work at Nebius:

- Fast moving
- Bold thinking
- Constant growth
- Meaningful impact
- Trust and real ownership
- Opportunity to shape the future of AI

Equal Opportunity Statement

Nebius is an equal opportunity employer. We are committed to fostering an inclusive and diverse workplace and to providing equal employment opportunities in all aspects of employment. We do not discriminate on the basis of race, c