Principal Engineer, Systems Design Engineering
Sandisk
About the role
Role Summary
Own the end-to-end PCIe system design for an NVMe SSD product line across client laptops and enterprise servers, from PHY/MAC review through ASIC/SoC integration, PCIe SFR/register analysis, and firmware design guidelines for robust link training, link transitions, and low-power behavior. This role sits at the intersection of PCIe spec compliance, NVMe behavior, FW architecture, platform interoperability, and power/performance tuning.
Key Responsibilities
Own system‑level PCIe Gen5/Gen6 architecture from an NVMe SSD endpoint perspective
Define and review PCIe + NVMe integration across SSD products
PHY + MAC IP review, integration requirements and constraints
SoC/ASIC integration: clocks, resets, power domains, straps, lane mapping, sidebands
PCIe SFR + FW guidelines: flow control, LTSSM observability, power states, error handling
Link & low-power transitions: DLRM, L1, L1SS, L0p, ASPM, clock-down, APST coordination
Bring-up + debug: enumeration, speed negotiation, width detection, stability, AER/error recovery
Customer requirement tuning: latency/power, performance, reliability and consistency
Provide deep expertise in PCIe configuration and extended capability registers, including: link, power management, MSI/MSI‑X, AER, BARs, L1SS
Lead platform bring‑up and debug: enumeration, link training, speed negotiation, power states, error handling
Act as the technical authority for cross‑team and customer escalations
Detailed Responsibilities (End‑to‑End PCIe for NVMe SSD)
PHY/MAC IP Review (System Design Perspective)
Understand criteria for PHY/MAC/controller IP:
Gen5/Gen6 readiness, equalization capability, margining hooks, lane mapping flexibility
SRNS/SRIS tolerance, clocking modes, power management support
Observability: LTSSM state visibility, error counters, replay/NAK stats, equalization telemetry
Review IP documents:
Reset sequences, compliance features, link speed change support
L1SS behavior, CLKREQ#/REFCLK control expectations
AER robustness, surprise down handling, hot/warm reset behavior
Specify platform-facing requirements: retimer/redriver compatibility assumptions (backplane/adaptor/cables)
ASIC/SoC Integration Ownership
Integrate PCIe subsystem with:
Clocking: REFCLK handling, clock request gating, clock-down sequences
Resets: PERST# behavior, internal resets, warm/hot resets, FLR support as applicable
Power domains: retention strategies, wake sources, D-state coordination
Sidebands: WAKE#, CLKREQ#, presence detect patterns (platform dependent)
Define lane policy: x4 typical for NVMe, lane reversal/polarity, width detection and recovery from degraded width
PCIe SFR / Register + FW Design Guidelines
Define a clean SFR map that FW uses for:
LTSSM control/observability (state, substate, timers, retries)
Link speed/width control and status (negotiated vs target)
Low-power triggers: ASPM enable/disable, L1SS policy, L0p policy (if implemented)
Clock request & clock gating behavior (safe entry/exit rules)
Error logging counters (replay