Linux PCIe Driver Developer (8-12 years)
Sandisk
About the role
We are seeking a talented and driven User-Mode Driver (UMD) / system software engineer to join our team and contribute to the host-side software stack for machine learning in the Next Gen Computational PCIe Flash Controller project. In this role, you will be responsible for building high-performance user-space driver frameworks and runtime interfaces that enable efficient communication and data flow between applications and our device via the kernel driver. You will work on key components including user-space APIs, command queues, memory orchestration, and multi-device management to enable scalable ML workloads.
User-Mode Driver Development: Architect and implement high-performance user-space driver libraries for Linux. This includes designing scalable abstractions for multi-device and multi-card systems, and enabling efficient interaction with PCIe devices through kernel interfaces.
Device Discovery & Topology Management: Design and implement mechanisms for PCIe device discovery, enumeration, and logical grouping across multiple endpoints and cards. Develop topology-aware abstractions for managing devices in complex multi-card and switched environments.
Custom Protocol Design: Design and implement a custom, NVMe-like command and control protocol in user space. You'll be responsible for the host-side orchestration, including:
Command Queues: Manage submission and completion queue abstractions and request tracking.
Command Orchestration: Implement tag-based request management, async/sync execution, and callback handling.
Event Handling: Design event-driven completion handling using poll/eventfd mechanisms.
User ↔ Kernel Interface Integration: Develop efficient interaction with the kernel driver using ioctl, mmap, and event mechanisms. Translate high-level runtime requests into kernel-mediated operations while maintaining performance and isolation.
Memory & Dataflow Management: Design user-space data movement strategies, including DMA buffer lifecycle management, memory pinning workflows, and zero-copy data paths for efficient host-device communication.
Multi-Application Support: Enable concurrent multi-process access to devices using per-file-descriptor resource management, ensuring isolation, scalability, and robustness across workloads.
Collaboration: Work closely with runtime, kernel driver, firmware, and hardware teams to define clean interfaces and deliver a cohesive, high-performance end-to-end solution.
Qualifications
Experience: 8+ years in system software or runtime development.
User-Space Systems Expertise: Strong experience building user-space drivers, runtime libraries, or high-performance system software interacting with kernel interfaces (ioctl, mmap, eventfd).
Protocol Knowledge: Deep understanding of PCIe and NVMe-like queue models, including submission/completion queues, descriptors, and asynchronous execution patterns.
Low-Level Proficiency: Mastery of C/C++ and strong understanding of memory management, concurrency, and virtual m