Kernel Driver Software Engineer

Etched · San Jose, CA
Full-time · Mid-level · Posted 3 months ago

About this role

About Etched

Etched is building the world’s first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep & parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest growing industry in history.

Key Responsibilities

- Design, develop, and maintain kernel-mode drivers, ensuring high reliability, informative debug output, and optimal performance.
- Analyze and optimize driver performance for demanding AI workloads, focusing on minimizing latency and maximizing throughput.
- Collaborate closely with hardware engineers throughout the ASIC design process.
- Implement driver support for device virtualization technologies, including SR-IOV, VFIO, and para-virtualization.
- Implement efficient memory management strategies, considering kernel memory mapping, page table configuration, NUMA awareness for device data caching, and IOMMU configuration.
- Build kernel drivers fundamentally designed to support and maintain security across host processes, physical memory spaces, and device attestation.
- Diagnose and resolve complex driver-related issues, using common kernel debugging tools and techniques (ftrace, dmesg, etc.) to identify and fix bugs.
- Design and implement synchronization mechanisms to handle concurrent access to multiple accelerators.
- Develop and execute comprehensive test plans to validate driver functionality, stability, and performance in manufacturing and in general production environments.
- Collaborate with software and hardware teams to diagnose and resolve complex system-level issues.

Representative Projects

- Develop and optimize kernel-mode drivers for new ML accelerators.
- Implement and optimize memory management, including kernel memory mapping and IOMMU configurations, for high-bandwidth data transfers.
- Debug and resolve complex driver-related issues impacting ML workload performance.
- Develop performance benchmarks and profiling tools to analyze driver performance.
- Integrate driver support for advanced features like hardware virtualization and security, including SR-IOV and VFIO.
- Optimize PCIe communication between the host and PCIe devices, using advanced equipment like PCIe analyzers.
- Implement and debug power management features for PCIe devices.
- Integrate ML accelerators into containerized and virtualized environments.
- Implement and optimize para-virtualization techniques for PCIe devices.
- Configure and optimize page tables for efficient memory access from the ML accelerator.
- Participate in hardware-software co-design reviews across teams to optimize performance and power efficiency.

You may be a good fit if you have

- Proficiency in C/C++.
- Strong understanding of kernel-mode driver development and debugging.
- Deep understanding of operating system internals (Linux preferred).
- Experience with hardware/software interfacing and device drivers.
- Experience with memory management and synchronization in kernel environments.
- Strong understanding of PCIe and other hardware interfaces.
- Experience with device virtualization technologies, including SR-IOV and VFIO.
- Strong understanding of kernel memory mapping, page table configuration, and IOMMU.
- Familiarity with hardware-software co-design principles.
- Proven ability to analyze complex technical problems and provide effective solutions.
- Excellent communication and collaboration skills.
- Experience with version control systems (e.g., Git).
- Experience with debugging tools (e.g., gdb, kgdb).

Strong candidates may also have (nice-to-have qualifications)

- Experience developing and debugging kernel-mode drivers for GPUs or other accelerator devices.
- A strong understanding of hardware/software interactions.
- Experience optimizing driver performance for demanding workloads.
- Experience with ML workloads.
- Experience debugging complex hardware and software interactions, especially in virtualized environments.
- Experience implementing and optimizing SR-IOV and VFIO.
- In-depth knowledge of kernel memory mapping, page tables, and IOMMU.
- Experience with hardware-software co-design projects.
- Experience with GPU driver development.
- Experience with CUDA, OpenCL, or other GPU programming models.
- Experience with performance profiling and benchmarking tools (perf, VTune).
- Knowledge of hardware virtualization techniques, including para-virtualization.
- Experience with CI/CD pipelines.
- Experience w
