Reliability Engineer, Supercomputing

🇺🇸 San Francisco, CA
$4K - $5K Annual
Posted 3 days ago
Expires August 24, 2026
Full Time | On-site | Engineering

Thinking Machines Lab is seeking a Reliability Engineer to keep its GPU supercomputing infrastructure dependable. The role works at the interface between hardware, firmware, and operating systems to sustain the performance that large-scale AI research requires. The engineer will diagnose and resolve hardware-related issues, collaborate with vendors, and implement solutions that support the lab's AI experiments.
