ML Runtime Optimization Engineer

🇺🇸 Sunnyvale, California
$159,053 - $199,295 Annual
Posted 1 month ago
Expires June 9, 2026
Full Time, On-site, Engineering, Product

Applied Intuition is seeking a software engineer with expertise in optimizing machine learning models for deployment on production-grade embedded runtime environments. The role involves working across the entire machine learning framework stack, including technologies such as PyTorch, JAX, ONNX, TensorRT, CUDA, XLA, and Triton. This position is based in Sunnyvale, California, and requires in-office work five days a week, with some flexibility for occasional remote work.

The primary responsibilities include driving machine learning performance optimization for both on-road and off-road Advanced Driver-Assistance Systems (ADAS) and autonomous driving stacks, targeting deployment on various embedded compute platforms. The engineer will develop strategies to improve the efficiency and reduce the latency of model inference on customer-selected compute boards. The role also involves pruning and quantizing models for deployment on memory-constrained platforms, working closely with machine learning engineers and software developers to identify and implement efficient model architectures, and establishing methodologies to profile model performance and locate bottlenecks during stack integration.

Candidates should possess a Bachelor's degree in Electrical Engineering, Computer Science, or a related field, along with at least three years of experience with machine learning accelerators and with GPU, CPU, and System on Chip (SoC) architecture and micro-architecture. Strong software development skills with a focus on embedded programming are essential, as is experience in profiling and optimizing model performance on embedded compute platforms. Proficiency with deep learning frameworks and tooling such as PyTorch, JAX, and ONNX is also required.

Preferred qualifications include a Master's or PhD in a machine learning-related area, experience in building machine learning optimization frameworks from scratch, and a background in deploying machine learning solutions to embedded chips for real-time robotics applications.

The compensation package for this full-time position includes a base salary ranging from $159,053 to $199,295 annually, along with equity options and comprehensive benefits. Benefits encompass health, dental, vision, life, and disability insurance coverage, a 401(k) retirement plan with employer match, learning and wellness stipends, and paid time off. Please note that benefits are subject to change and may vary based on the jurisdiction of employment.
