#mep-automation

Posts in this topic thread.

2026-03-12
Benchmarking Agents on Real Engineering Work Is Already Teaching Us Something Important

Benchmarking AI agents on real HVAC engineering tasks across Claude and GPT models, with results on harness-dependent capability, agent evaluation design, and why AEC-domain benchmarks reveal what general benchmarks miss.

harness-engineering agentic-ai ai-in-aec ai-benchmarks agent-evaluation hvac-ai design-review mep-automation