Kernel Driver Software Engineer
full-time
mid
Posted 3 months ago
About this role
About Etched
Etched is building the world’s first AI inference system purpose-built for transformers, delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep and parallel chain-of-thought reasoning agents. Backed by hundreds of millions from top-tier investors and staffed by leading engineers, Etched is redefining the infrastructure layer for the fastest-growing industry in history.
Key Responsibilities
- Design, develop, and maintain kernel-mode drivers that ensure high reliability, informative debug output, and optimal performance.
- Analyze and optimize driver performance for demanding AI workloads, focusing on minimizing latency and maximizing throughput.
- Collaborate closely with hardware engineers throughout the ASIC design process.
- Implement driver support for device virtualization technologies, including SR-IOV, VFIO, and para-virtualization.
- Implement efficient memory management strategies that account for kernel memory mapping, page table configuration, NUMA awareness for device data caching, and IOMMU configuration.
- Build kernel drivers fundamentally designed to support and maintain security across host processes, physical memory spaces, and device attestation.
- Diagnose and resolve complex driver-related issues, using common kernel debugging tools and techniques (ftrace, dmesg, etc.) to identify and fix bugs.
- Design and implement synchronization mechanisms to handle concurrent access to multiple accelerators.
- Develop and execute comprehensive test plans to validate driver functionality, stability, and performance in manufacturing and in general production environments.
- Collaborate with software and hardware teams to diagnose and resolve complex system-level issues.
Representative Projects
- Develop and optimize kernel-mode drivers for new ML accelerators.
- Implement and optimize memory management, including kernel memory mapping and IOMMU configurations, for high-bandwidth data transfers.
- Debug and resolve complex driver-related issues impacting ML workload performance.
- Develop performance benchmarks and profiling tools to analyze driver performance.
- Integrate driver support for advanced features like hardware virtualization and security, including SR-IOV and VFIO.
- Optimize PCIe communication between the host and PCIe devices, using advanced equipment like PCIe analyzers.
- Implement and debug power management features for PCIe devices.
- Integrate ML accelerators into containerized and virtualized environments.
- Implement and optimize para-virtualization techniques for PCIe devices.
- Configure and optimize page tables for efficient memory access from the ML accelerator.
- Participate in hardware-software co-design reviews across teams to optimize performance and power efficiency.
You may be a good fit if you have
- Proficiency in C/C++.
- Strong understanding of kernel-mode driver development and debugging.
- Deep understanding of operating system internals (Linux preferred).
- Experience with hardware/software interfacing and device drivers.
- Experience with memory management and synchronization in kernel environments.
- Strong understanding of PCIe and other hardware interfaces.
- Experience with device virtualization technologies, including SR-IOV and VFIO.
- Strong understanding of kernel memory mapping, page table configuration, and IOMMU.
- Familiarity with hardware-software co-design principles.
- Proven ability to analyze complex technical problems and provide effective solutions.
- Excellent communication and collaboration skills.
- Experience with version control systems (e.g., Git).
- Experience with debugging tools (e.g., gdb, kgdb).
Strong candidates may also have experience with (Nice-to-have qualifications)
- Experience developing and debugging kernel-mode drivers for GPU or other accelerator devices.
- Strong understanding of hardware/software interactions.
- Experience optimizing driver performance for demanding workloads.
- Experience with ML workloads.
- Experience debugging complex hardware and software interactions, especially in virtualized environments.
- Experience implementing and optimizing SR-IOV and VFIO.
- In-depth knowledge of kernel memory mapping, page tables, and IOMMU.
- Experience with hardware-software co-design projects.
- Experience with GPU driver development.
- Experience with CUDA, OpenCL, or other GPU programming models.
- Experience with performance profiling and benchmarking tools (perf, VTune).
- Knowledge of hardware virtualization techniques, including para-virtualization.
- Experience with CI/CD pipelines.
- Experience w