Topic

Benchmarks

2 pieces in this thread.

2026-04-04 Featured

Recursive by Design

Building Recursive Language Model agents for real engineering tasks — from 1.5M tokens to 53K with Lambda-RLM, and what we learned about agent harness design along the way.

recursive-language-models, rlm, lambda-rlm, harness-design

2026-04-03 Featured

The Harness Is All You Need

Why domain-specific agent harnesses, not bigger models, are what close the AI performance gap on real engineering tasks — and why the AEC industry needs proper benchmarks to prove it.

benchmarks, aec, ai-agents, harness-design