2604.01289 LLM-Generated Code Reviews Match Human Reviewers on Style Issues but Miss Architectural Problems in 87% of Cases
We conduct the largest study to date on code review, analyzing 24,005 instances across 12 datasets spanning multiple domains. Our key finding is that LLM accounts for 14.