Kernel Driver Software Engineer

Etched · San Jose, CA

Full-time · Mid-level · Posted 6 months ago

Tags: gpu, pytorch, tensorflow

About this role

About Etched

Etched is building hardware for frontier intelligence. We co-design chips, racks, software, and manufacturing to deliver best-in-class throughput and latency across both prefill and decode workloads. Our first products are heavily focused on inference. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.

Key Responsibilities

- Design, develop, and maintain kernel-mode drivers, ensuring high reliability, informative debug output, and optimal performance.
- Analyze and optimize driver performance for demanding AI workloads, focusing on minimizing latency and maximizing throughput.
- Collaborate closely with hardware engineers throughout the ASIC design process.
- Implement driver support for device virtualization technologies, including SR-IOV, VFIO, and para-virtualization.
- Implement efficient memory management strategies, considering kernel memory mapping, page table configuration, NUMA awareness for device data caching, and IOMMU configuration.
- Build kernel drivers fundamentally designed to support and maintain security across host processes, physical memory spaces, and device attestation.
- Diagnose and resolve complex driver-related issues, using common kernel debugging tools and techniques (ftrace, dmesg, etc.) to identify and fix bugs.
- Design and implement synchronization mechanisms to handle concurrent access to multiple accelerators.
- Develop and execute comprehensive test plans to validate driver functionality, stability, and performance in manufacturing and in general production environments.
- Collaborate with software and hardware teams to diagnose and resolve complex system-level issues.

Representative Projects

- Develop and optimize kernel-mode drivers for new ML accelerators.
- Implement and optimize memory management, including kernel memory mapping and IOMMU configurations, for high-bandwidth data transfers.
- Debug and resolve complex driver-related issues impacting ML workload performance.
- Develop performance benchmarks and profiling tools to analyze driver performance.
- Integrate driver support for advanced features like hardware virtualization and security, including SR-IOV and VFIO.
- Optimize PCIe communication between the host and PCIe devices, using advanced equipment like PCIe analyzers.
- Implement and debug power management features for PCIe devices.
- Integrate ML accelerators into containerized and virtualized environments.
- Implement and optimize para-virtualization techniques for PCIe devices.
- Configure and optimize page tables for efficient memory access from the ML accelerator.
- Participate in hardware-software co-design reviews across teams to optimize performance and power efficiency.

You may be a good fit if you have

- Proficiency in C/C++.
- Strong understanding of kernel-mode driver development and debugging.
- Deep understanding of operating system internals (Linux preferred).
- Experience with hardware/software interfacing and device drivers.
- Experience with memory management and synchronization in kernel environments.
- Strong understanding of PCIe and other hardware interfaces.
- Experience with device virtualization technologies, including SR-IOV and VFIO.
- Strong understanding of kernel memory mapping, page table configuration, and IOMMU.
- Familiarity with hardware-software co-design principles.
- Proven ability to analyze complex technical problems and provide effective solutions.
- Excellent communication and collaboration skills.
- Experience with version control systems (e.g., Git).
- Experience with debugging tools (e.g., gdb, kgdb).

Strong candidates may also have (nice-to-have qualifications)

- Experience developing and debugging kernel-mode drivers for GPUs or other accelerator devices.
- A strong understanding of hardware/software interactions.
- Experience optimizing driver performance for demanding workloads.
- Experience with ML workloads.
- Experience debugging complex hardware and software interactions, especially in virtualized environments.
- Experience implementing and optimizing SR-IOV and VFIO.
- In-depth knowledge of kernel memory mapping, page tables, and IOMMU.
- Experience with hardware-software co-design projects.
- Experience with GPU driver development.
- Experience with CUDA, OpenCL, or other GPU programming models.
- Experience with performance profiling and benchmarking tools (perf, VTune).
- Knowledge of hardware virtualization techniques, including para-virtualization.
- Experience with CI/CD pipelines.
- Experience with Rust.
- Experience with ML frameworks like TensorFlow or PyTorch.
- Experience with data center o