Machine Learning Eval Engineer

🇺🇸 San Francisco, California
$2K - $3K Annual
Posted 1 month ago
Expires June 9, 2026
Full Time, On-site, Engineering, Data Science

About Reducto

Reducto helps AI teams ingest real-world enterprise data with state-of-the-art accuracy.

Most enterprise data, from financial statements to health records, is locked in unstructured file formats like PDFs and spreadsheets. We train vision models to read those documents the way a human would, enabling teams to build products, train models, and automate processes at scale.

We’ve grown rapidly, increasing revenue 8x year over year and partnering with hundreds of companies, from leading AI teams like Harvey, Vanta, and Scale to enterprise customers across FAANG and top trading firms.

Reducto has raised over $100M from world-class investors including a16z, Benchmark, and First Round Capital.

THE OPPORTUNITY

As an ML Eval Engineer, you’ll play a key role in building the evaluation systems and benchmarks that make Reducto’s models better over time. You’ll collaborate closely with our ML, platform, and GTM teams to identify model weaknesses, design strong benchmarks, and create metrics and tooling that surface new failure modes as we scale. This is a high-impact role where you’ll help define how model quality is measured at Reducto and shape the systems we use to improve it.

WHAT YOU’LL DO

- Design, build, and maintain evaluation benchmarks that reveal where our models perform well and where they fail.

- Develop metrics, heuristics, and workflows to automatically identify new failure modes across large and messy real-world datasets.

- Partner closely with other ML engineers to turn evaluation insights into model improvements and better training priorities.

- Work hands-on with unstructured enterprise data, including PDFs, spreadsheets, and other difficult document formats, to uncover edge cases and hard examples.

- Build lightweight internal and user-facing tools, including simple interfaces in Python frameworks like Flask, to help teams inspect results, analyze model behavior, and communicate evaluation outcomes.

- Collaborate with customers and...
