AI Architect Fixed Calendar URL Validation Using Pattern Reuse After Coding Agent Failed

Summary 

As part of Bito’s SWE‑bench Pro evaluation, AI coding agents were tested on real production engineering tasks from leading open-source repositories. One such task focused on ProtonMail’s calendar subscription modal, where missing URL input validation created both security and reliability risks. 

The baseline coding agent (Claude Sonnet 4.5) identified the gap but responded with unnecessary abstractions that couldn’t be safely shipped: instead of extending the existing validation logic, it built a parallel system. By contrast, the same baseline agent augmented with deep codebase context from Bito’s AI Architect delivered a minimal, pattern-consistent fix that passed all validation scenarios. 

The challenge 

ProtonMail’s calendar subscription modal had validation gaps: no URL length limit (DoS risk from extremely long URLs), inconsistent warnings, and unclear warning priority when multiple issues apply. The fix required enforcing a 2000-character limit, establishing clear warning priority, and centralizing a scattered ResizeObserver mock. 

The task tested whether agents add complexity or leverage existing patterns to solve the problem. 
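To make the "centralizing a scattered ResizeObserver mock" part of the task concrete, here is a minimal sketch of what a single shared mock could look like. The class name ResizeObserverMock and the helper installResizeObserverMock are hypothetical and are not taken from the ProtonMail repository; the sketch only assumes that tests need a stand-in exposing observe(), unobserve(), and disconnect().

```typescript
// Hypothetical shared test helper; names are illustrative, not ProtonMail's.
type ResizeCallback = (entries: unknown[], observer: unknown) => void;

class ResizeObserverMock {
    callback: ResizeCallback;
    // Targets currently being observed, kept so tests can inspect them.
    observed: unknown[] = [];

    constructor(callback: ResizeCallback) {
        this.callback = callback;
    }

    observe(target: unknown): void {
        this.observed.push(target);
    }

    unobserve(target: unknown): void {
        this.observed = this.observed.filter((item) => item !== target);
    }

    disconnect(): void {
        this.observed = [];
    }
}

// Install the mock in one place instead of redefining it in each test file.
// The scope parameter defaults to the global object and exists only to make
// the helper easy to exercise in isolation.
function installResizeObserverMock(
    scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): void {
    scope.ResizeObserver = ResizeObserverMock;
}
```

Assuming a Jest-style test runner, a helper like this would typically be called once from a shared setup file, so individual test files no longer carry their own copies of the mock.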

Why the baseline agent failed 

Claude Sonnet 4.5 over-engineered the solution: it created 3 new helper functions, a new enum (URL_WARNING_TYPE), a useMemo hook, an Alert component import, and ~50 lines of new code with ~10 edit operations. It never ran tests to verify the implementation. 

The irony: the component already had a getError() function that handled multiple error cases. Adding one more case would have been a single line. Instead, the agent built a parallel warning system alongside the existing error system. 

⚠ ROOT CAUSE: The coding agent over-engineered the solution by creating parallel systems (new enum, new functions, new hooks) instead of extending the existing getError() pattern. More code meant more potential bugs, and without test verification, those bugs remained hidden.
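The "one more case" idea can be illustrated with a small sketch. The real getError() is not reproduced in this article, so the function body, the messages, the isValidHttpUrl helper, and the ordering of checks below are all assumptions. What the sketch shows is the general pattern: an ordered list of checks where the first match wins, which also gives a clear warning priority when several issues apply at once.

```typescript
// Illustrative only: the actual getError() in the modal may differ.
const MAX_CALENDAR_URL_LENGTH = 2000; // limit stated in the task description

// Hypothetical helper standing in for whatever URL check already existed.
function isValidHttpUrl(value: string): boolean {
    return /^https?:\/\/\S+$/i.test(value);
}

function getError(url: string): string | undefined {
    if (!url) {
        return undefined; // nothing typed yet, nothing to report
    }
    // The single added case. Placing it first gives it the highest priority
    // (an assumption made for this sketch).
    if (url.length > MAX_CALENDAR_URL_LENGTH) {
        return 'URL is too long';
    }
    if (!isValidHttpUrl(url)) {
        return 'Invalid URL';
    }
    return undefined;
}
```

Because the checks run in order, an over-long string that is also malformed reports only the length error, so no separate enum or warning resolver is needed to decide which message to show.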

How Bito’s AI Architect solved it 

Bito’s AI Architect took a fundamentally different approach, thanks to its codebase knowledge graph, which provides system-wide context to the coding agent. 

Using its knowledge graph, Bito’s AI Architect mapped the relationships between: 

  • The Calendar subscription modal component 
  • The existing validation flow, including the getError() function 
  • The UI state logic controlling warnings and disabled submission 
  • The test suite and ResizeObserver mocks 
  • And other components in the codebase using the same validation patterns 

This knowledge graph is not just a file index. It captures how logic flows through the system, which functions act as validation authorities, which modules consume their output, and where changes would have the safest and most consistent effect. 

Bito’s AI Architect’s methodology favored extending existing patterns over creating new systems. The treatment agent made 6 focused edits: added one constant (CALENDAR_URL: 2000), one inline boolean (isTooLong), one line to the existing getError() function, one condition to the disabled check, and centralized the ResizeObserver mock. Total: ~15 lines. 

The six-edit, roughly 15-line solution was simpler, safer, and consistent with existing patterns in the codebase. 
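As a rough sketch of how the constant, the inline boolean, and the disabled check described above might fit together: only CALENDAR_URL: 2000 and isTooLong are named in this article. The MAX_LENGTHS object name, the getSubmitState helper, and the hasOtherError flag are hypothetical, and treating exactly 2000 characters as allowed is an assumption.

```typescript
// Assumed container for the new constant; only the CALENDAR_URL entry and
// its value come from the article.
const MAX_LENGTHS = {
    CALENDAR_URL: 2000,
} as const;

interface SubmitState {
    isTooLong: boolean;
    disabled: boolean;
}

// Hypothetical helper: in a component these would likely be two inline
// expressions rather than a separate function.
function getSubmitState(url: string, hasOtherError: boolean): SubmitState {
    // Inline boolean for the new rule (assumes the limit is inclusive).
    const isTooLong = url.length > MAX_LENGTHS.CALENDAR_URL;
    // One extra condition added to the existing disabled check.
    const disabled = !url || hasOtherError || isTooLong;
    return { isTooLong, disabled };
}
```

For example, getSubmitState('https://example.com/' + 'a'.repeat(1981), false) describes a 2001-character URL, so both isTooLong and disabled come back true and submission stays blocked.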

KEY ARCHITECTURAL INSIGHT 

The best code change is the smallest one that solves the problem. Bito’s AI Architect’s pattern-reuse philosophy recognized that getError() already handled multiple cases, so extending it by one line was simpler than building a parallel warning system.

Head-to-head comparison 

| | Claude Sonnet 4.5 (baseline agent) | Bito’s AI Architect |
|---|---|---|
| Code Exploration | Identified validation gap but missed existing getError() pattern | Recognized getError() as the right extension point immediately |
| Approach | Over-engineered: 3 helpers, 1 enum, useMemo, Alert component (~50 lines) | Minimal: extended existing getError() with 1 case (~15 lines) |
| New Abstractions | URL_WARNING_TYPE enum, hasValidExtension(), isGooglePublicLink(), getURLWarning() | Zero: reused existing patterns entirely |
| Verification | None: tests never run | All scenarios verified: submission blocked for invalid URLs |
| Task Outcome | FAILED: unverified, over-complex solution | PASSED: minimal, pattern-consistent fix |

Conclusion 

This ProtonMail case highlights why system-wide context is essential for AI-driven development. Without it, agents reinvent existing logic, producing bloated and error-prone code. 

The baseline agent ignored ProtonMail’s existing getError() validation and added a separate system, making the solution longer, harder to maintain, and risky to ship. 

Bito’s AI Architect maps the full codebase, giving the agent system-level context and full awareness of upstream and downstream dependencies across components. The result is minimal, safe, and verifiable changes. 

This matters because most engineering work happens in existing systems, where the goal is not to build new features from scratch, but to fix or improve code that already exists. 

Anand Das

Anand is Co-founder and CTO of Bito. He leads technical strategy and engineering, and is our biggest user! Formerly, Anand was CTO of Eyeota, a data company acquired by Dun & Bradstreet. He is co-founder of PubMatic, where he led the building of an ad exchange system that handles over 1 Trillion bids per day.

Amar Goel

Amar is the Co-founder and CEO of Bito. With a background in software engineering and economics, Amar is a serial entrepreneur and has founded multiple companies including the publicly traded PubMatic and Komli Media.

This article is brought to you by the Bito team.
