Principal Systems Engineer

🇺🇸 Austin, Texas
Posted 5 days ago
Expires August 3, 2026
Full Time, On-site, Engineering, Operations

Graphcore is seeking a Principal Systems Engineer to provide advanced operational, diagnostic, and engineering support for its Arm-based hardware platforms across lab and data center environments. This role focuses on supporting hardware bring-up, validation, and troubleshooting of complex AI compute platforms, including server blades, racks, and rack-scale infrastructure. The successful candidate will collaborate closely with engineering, platform, and data center teams to ensure the reliability and performance of next-generation AI systems.

Key responsibilities include leading advanced troubleshooting for server blades, motherboards, power systems, and rack-scale infrastructure; supporting engineering bring-up activities, including component validation and firmware interaction testing; diagnosing system-level failures involving thermal behavior, power anomalies, network configuration, and BIOS/BMC issues; collaborating with server engineering teams to perform root cause analysis and propose corrective actions or design improvements; supporting deployment and rollout of next-generation hardware platforms through structured validation and qualification cycles; interfacing with facilities and infrastructure teams to understand environmental factors impacting system reliability; developing and maintaining standard operating procedures (SOPs), troubleshooting guides, and validation documentation; providing guidance and mentorship to junior technicians and engineers on troubleshooting methodologies and hardware diagnostics; and participating in on-call rotations or off-hours support during critical engineering milestones or hardware bring-up phases.

The ideal candidate will have a Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, or a related discipline; strong experience with server hardware architectures and board-level debugging; experience analyzing system logs, hardware telemetry, and power/thermal metrics to isolate hardware failures; hands-on experience with HPC systems, AI compute platforms, or rack-scale infrastructure; strong collaboration skills and the ability to work effectively in fast-paced engineering environments; and excellent written and verbal communication skills.

Desirable qualifications include experience supporting prototype or pre-production hardware bring-up; familiarity with data center facilities, including liquid cooling and power distribution systems; experience using Python, Bash, or automation tools for hardware validation or troubleshooting; and exposure to structured failure analysis and reliability engineering methodologies.

In addition to a competitive salary, Graphcore offers a comprehensive benefits package. The company is committed to building an inclusive work environment that makes Graphcore a great home for everyone. It offers an equal opportunity process and understands that there are visible and invisible differences among individuals. Graphcore provides a flexible approach to interviews and encourages candidates to discuss any reasonable adjustments they may require.

Graphcore fosters a culture of continuous learning and constant innovation. The company is opening a new AI Engineering Campus in Austin, which will play a central role in building the future of AI computing. Joining Graphcore offers the opportunity to work with a diverse team of AI research specialists, silicon designers, software engineers, and systems architects, all dedicated to developing cutting-edge AI compute systems.
