SWE-WebDev Bench

A 68-metric framework for evaluating AI coding platforms as virtual software agencies.
