Taming Legacy Code with AI
Legacy code is simply code that works but lacks a safety net. Before AI, inheriting an undocumented system meant weeks of manual archaeology. With a structured AI workflow, you can often build a mental map of a codebase in hours, extract hidden business rules in an afternoon, and begin safe modernization by the next morning.
Understand Before You Touch: never ask AI to modify legacy code you have not first understood. AI can confidently refactor code while ignoring the subtle side effects or undocumented business rules that keep the system alive.
Phase 1: Comprehension (The Archaeology)
Build your mental map without changing a single line. Use these five prompts in sequence to extract the system's "DNA."
The Archaeology Prompts
"Explain in plain English: what problem does this code solve? What are the primary inputs and outputs? Who are the actors involved?"
"Identify all internal and external dependencies. Map the data flow from entry to persistent storage. Flag any hidden couplings."
"Identify every logic branch where a business decision is made. List them as: IF [condition] THEN [outcome]. Explain the 'why' if possible."
"Identify the 3 most dangerous sections of this code. What is most likely to break if I modify it? Are there any 'magic numbers'?"
"I need to add [feature]. Based on the logic above, which modules will be affected? What should I test before and after the change?"
Phase 2: The Documentation Stack
Document while the comprehension is fresh. This is the work that pays compounding dividends to your team.
- 1. FEATURE REGISTER: A list of all business rules extracted from the code.
- 2. DATA DICTIONARY: Explanation of every database field and state transition.
- 3. TEST SPECIFICATION: A list of 'Characterization Tests' that define how the code actually behaves today.
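To make the third item concrete, here is a minimal sketch of a characterization test table. The function `legacyTotalCents` and its quirks are invented for this example, and the sketch uses plain comparisons instead of a test runner so that it stays dependency-free; in Vitest, each recorded case would typically become one assertion.

```typescript
// Hypothetical legacy function, invented for this example.
function legacyTotalCents(unitPriceCents: number, quantity: number): number {
  let total = unitPriceCents * quantity;
  if (quantity >= 10) {
    total = Math.floor((total * 9) / 10); // undocumented 10% bulk discount
  }
  return total; // note: negative quantities are not rejected
}

// Each case records what the code does today, not what a spec says it should do.
interface CharacterizationCase {
  name: string;
  args: [number, number];
  observed: number;
}

const recordedCases: CharacterizationCase[] = [
  { name: "single item", args: [250, 1], observed: 250 },
  { name: "just below bulk threshold", args: [250, 9], observed: 2250 },
  { name: "bulk threshold applies discount", args: [250, 10], observed: 2250 },
  { name: "negative quantity is accepted (likely a bug)", args: [250, -2], observed: -500 },
];

// Returns the names of the cases whose behavior has changed.
function findRegressions(
  fn: (unitPriceCents: number, quantity: number) => number,
  cases: CharacterizationCase[],
): string[] {
  return cases.filter((c) => fn(...c.args) !== c.observed).map((c) => c.name);
}

console.log(findRegressions(legacyTotalCents, recordedCases)); // prints []
```

The negative-quantity case is deliberate: the table pins the bug in place, so any later fix shows up as an intentional, reviewed change to the table and not as a silent behavior change.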
Phase 3: Safe Refactoring (The Strangler Fig)
The most dangerous action is a big rewrite. The safest is the Strangler Fig pattern: replacing small pieces incrementally while the old and new systems run in parallel.
Modernization Strategies
- 1. Use AI to write tests that pin down the current behavior of the legacy code. Prompt: "Write Vitest characterization tests for this function. Capture its current behavior for all edge cases, including its bugs."
- 2. Create a new wrapper function that calls the old logic, and route callers through it. This is your "switch."
- 3. Re-implement one logic branch in the new function. Use AI to refactor the logic to modern standards while keeping the tests passing.
- 4. Repeat step 3 for each remaining branch. Once every branch has moved and all tests pass on the new implementation, delete the old code. The legacy code has been "strangled."
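The wrapper-as-switch idea can be sketched as follows. All names here (`legacyShippingCents`, `modernStandardShippingCents`, `migratedBranches`, `shippingCents`) and the pricing rules are hypothetical, chosen only to show one branch being moved while the other still runs on the old code.

```typescript
// Hypothetical legacy logic with magic numbers, left untouched.
function legacyShippingCents(weightKg: number, express: boolean): number {
  let cost = 500;
  if (weightKg > 20) cost += 1500;
  if (express) cost = cost * 2;
  return cost;
}

// New implementation of ONE branch only: standard (non-express) shipping.
function modernStandardShippingCents(weightKg: number): number {
  const BASE_CENTS = 500;
  const HEAVY_SURCHARGE_CENTS = 1500;
  const HEAVY_THRESHOLD_KG = 20;
  return weightKg > HEAVY_THRESHOLD_KG ? BASE_CENTS + HEAVY_SURCHARGE_CENTS : BASE_CENTS;
}

// Records which branches have been moved to the new code so far.
const migratedBranches = { standard: true, express: false };

// The wrapper is the "switch": callers use this function and never call
// the legacy function directly.
function shippingCents(weightKg: number, express: boolean): number {
  if (!express && migratedBranches.standard) {
    return modernStandardShippingCents(weightKg);
  }
  return legacyShippingCents(weightKg, express);
}

// While both implementations exist, they can be compared side by side.
for (const weightKg of [5, 20, 25]) {
  console.log(weightKg, shippingCents(weightKg, false) === legacyShippingCents(weightKg, false));
}
```

Setting `migratedBranches.standard` back to `false` restores the old path without touching callers, which is what makes each step reversible under this design.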
Where AI Fails Legacy Code (The Human Layer)
Static analysis has limits. AI reads the syntax but cannot feel the pulse of a running system.
Verification Checkpoints
From source code alone, AI cannot see race conditions, memory leaks, or environment-specific config issues that only appear at runtime or under load.
AI doesn't know why a specific hack was added in 2022. That 'inefficient' code might be a fix for a specific browser bug.
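As an illustration of the first checkpoint, here is a small sketch of load-dependent behavior. The `Inventory` class and `simulatedIo` helper are invented for this example. Calls made one at a time behave correctly, so a sequential test suite passes, while two overlapping calls both succeed and oversell the last item.

```typescript
// Stands in for a database or network round trip (hypothetical helper).
async function simulatedIo(): Promise<void> {
  await Promise.resolve();
}

class Inventory {
  private stock = 1;

  // Check-then-act across an await: another call can run between
  // the stock check and the decrement.
  async reserve(): Promise<boolean> {
    if (this.stock <= 0) return false;
    await simulatedIo();
    this.stock -= 1;
    return true;
  }

  get remaining(): number {
    return this.stock;
  }
}

async function checkConcurrency() {
  // One call at a time: the second reservation is correctly refused.
  const quiet = new Inventory();
  const sequential = [await quiet.reserve(), await quiet.reserve()];

  // Two overlapping calls: both pass the stock check before either decrements.
  const busy = new Inventory();
  const concurrent = await Promise.all([busy.reserve(), busy.reserve()]);

  return { sequential, concurrent, remainingAfterConcurrent: busy.remaining };
}

checkConcurrency().then((result) => console.log(result));
```

Whether production traffic actually overlaps in this way is a property of the running system, not of the source, which is why this checkpoint needs load testing or monitoring on top of code review.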
Key Takeaways
Legacy code is a system to be understood, not just a problem to be solved. Use AI to map the system before you break ground.
Characterization tests are your only true specification. If they pass, the behavior they cover is preserved; if you skip them, you are guessing.
The goal is a modernized system, not a new one. The Strangler Fig pattern is the safest way to modernize without losing the business logic buried in the legacy layers.