Part 14 — Taming Legacy Code with AI: From Archaeology to Modern Architecture


Legacy code is simply code that works but lacks a safety net. Before AI, inheriting an undocumented system meant weeks of manual archaeology. With a structured AI workflow, you can often build a working mental map of a module in hours, extract its hidden business rules in an afternoon, and begin safe modernization soon after.

Primary Objective

Comprehension Prompts | Documentation Stack | Strangler Fig Pattern

💡

The Golden Rule of Legacy

Understand Before You Touch. Never ask AI to modify legacy code you haven't first understood. AI will confidently refactor code while ignoring the subtle side effects or undocumented business rules that keep the system alive.

Phase 1: Comprehension (The Archaeology)

Build your mental map without changing a single line. Use these five prompts in sequence to extract the system's "DNA."

The Archaeology Prompts

👁️

01: OVERVIEW

"Explain in plain English: what problem does this code solve? What are the primary inputs and outputs? Who are the actors involved?"

🔗

02: DEPENDENCY

"Identify all internal and external dependencies. Map the data flow from entry to persistent storage. Flag any hidden couplings."

📜

03: BUSINESS RULES

"Identify every logic branch where a business decision is made. List them as: IF [condition] THEN [outcome]. Explain the 'why' if possible."

🛑

04: RISK ASSESSMENT

"Identify the 3 most dangerous sections of this code. What is most likely to break if I modify them? Are there any 'magic numbers'?"

📐

05: IMPACT ANALYSIS

"I need to add [feature]. Based on the logic above, which modules will be affected? What should I test before and after the change?"

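One way to cross-check the AI's answers to prompts 02 and 05 is to write the dependency map down and compute the affected modules yourself. The sketch below assumes a hypothetical five-module system. Every module name is invented for illustration, and the map would be filled in by hand from the AI's dependency answer and then verified against the code.

```typescript
// Hypothetical dependency map: each key depends on the modules in its array.
// All module names are invented for illustration.
const dependsOn: Record<string, string[]> = {
  invoiceController: ["pricing", "customerRepo"],
  pricing: ["taxRules"],
  customerRepo: [],
  taxRules: [],
  reportJob: ["pricing"],
};

// Returns every module that depends on `changed`, directly or transitively.
function affectedBy(changed: string, graph: Record<string, string[]>): string[] {
  const affected = new Set<string>();
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const moduleName of Object.keys(graph)) {
      if (graph[moduleName].includes(current) && !affected.has(moduleName)) {
        affected.add(moduleName);
        queue.push(moduleName); // its own dependents are affected too
      }
    }
  }
  return Array.from(affected).sort();
}

const impacted = affectedBy("taxRules", dependsOn);
console.log(impacted); // [ 'invoiceController', 'pricing', 'reportJob' ]
```

If the AI's impact analysis names fewer modules than this traversal does, that gap is worth investigating before any change is made.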
Phase 2: The Documentation Stack

Document while the comprehension is fresh. This is the work that pays compounding dividends to your team.

The Documentation Artifacts

1. FEATURE REGISTER: A list of all business rules extracted from the code.
2. DATA DICTIONARY: An explanation of every database field and state transition.
3. TEST SPECIFICATION: A list of 'Characterization Tests' that define how the code actually behaves today.
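A Feature Register is easier to keep current when each rule has a fixed shape. The following is a minimal sketch, assuming a hypothetical shipping module. The field names, rule IDs, and file names are all invented for illustration and do not come from any real system.

```typescript
// One row of a hypothetical Feature Register. Field names are illustrative.
interface BusinessRule {
  id: string;
  condition: string;        // the IF part, in plain language
  outcome: string;          // the THEN part
  sourceLocation: string;   // where the rule lives in the legacy code
  rationale: string | null; // null while nobody knows the "why"
}

// Invented example entries, as they might look after the BUSINESS RULES prompt.
const featureRegister: BusinessRule[] = [
  {
    id: "BR-001",
    condition: "country is US and weight is over 20 kg",
    outcome: "charge the heavy domestic rate",
    sourceLocation: "legacyShipping.ts (hypothetical)",
    rationale: "Carrier surcharge for heavy parcels",
  },
  {
    id: "BR-002",
    condition: "country is US and weight is 0 kg",
    outcome: "still charge the flat domestic rate",
    sourceLocation: "legacyShipping.ts (hypothetical)",
    rationale: null,
  },
];

// Renders a rule in the IF [condition] THEN [outcome] form.
function renderRule(rule: BusinessRule): string {
  return `${rule.id}: IF ${rule.condition} THEN ${rule.outcome}`;
}

// Rules with no recorded rationale are questions for the people who know the history.
const needsHistory = featureRegister
  .filter((rule) => rule.rationale === null)
  .map((rule) => rule.id);

console.log(featureRegister.map(renderRule).join("\n"));
console.log("Ask the team about:", needsHistory);
```

Keeping the rationale nullable makes the gaps explicit: a rule without a known "why" is exactly the kind of code that should not be "cleaned up" yet.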

Phase 3: Safe Refactoring (The Strangler Fig)

The most dangerous action is a big rewrite. The safest is the Strangler Fig pattern: replacing small pieces incrementally while the old and new systems run in parallel.

Modernization Strategies

🛡️

01: CHARACTERIZE

Use AI to write tests that match the current behavior of the legacy code. Prompt: "Write Vitest characterization tests for this function. Capture its current behavior for all edge cases, including its bugs."
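To make the idea concrete, here is a minimal sketch of a characterization table. It assumes a hypothetical legacy function, legacyShippingCost, invented for this example. The key point is that the expected values are recorded by running the code, not derived from what the code "should" do, so the quirks are pinned down too. The block uses plain TypeScript so it runs without a test runner.

```typescript
// Hypothetical legacy function, invented for illustration.
function legacyShippingCost(weightKg: number, country: string): number {
  if (country === "US") {
    if (weightKg > 20) return 25;
    return 10; // reached even for zero weight: a quirk worth capturing
  }
  if (weightKg <= 0) return 0;
  return 10 + weightKg * 2;
}

interface RecordedCase {
  weightKg: number;
  country: string;
  observed: number; // what the code returns today, right or wrong
  note: string;
}

// Observed values were recorded from the current implementation.
const recordedCases: RecordedCase[] = [
  { weightKg: 5, country: "US", observed: 10, note: "flat domestic rate" },
  { weightKg: 20, country: "US", observed: 10, note: "20 kg is still the flat rate" },
  { weightKg: 21, country: "US", observed: 25, note: "heavy domestic parcel" },
  { weightKg: 0, country: "US", observed: 10, note: "quirk: zero weight is still charged" },
  { weightKg: 0, country: "DE", observed: 0, note: "zero weight abroad is free" },
  { weightKg: 5, country: "DE", observed: 20, note: "10 plus 2 per kg" },
  { weightKg: 5, country: "us", observed: 20, note: "quirk: lowercase code counts as international" },
];

// Returns a description of every case where `fn` differs from the recorded behavior.
function findMismatches(
  fn: (weightKg: number, country: string) => number,
  cases: RecordedCase[],
): string[] {
  return cases
    .filter((c) => fn(c.weightKg, c.country) !== c.observed)
    .map((c) => `${c.note}: expected ${c.observed}, got ${fn(c.weightKg, c.country)}`);
}

console.log(findMismatches(legacyShippingCost, recordedCases)); // []
```

In a Vitest suite, the same table could be passed to test.each with an expect(...).toBe(observed) assertion per row. Any replacement implementation must produce an empty mismatch list before it is trusted.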

🎁

02: WRAP

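Create a new wrapper function that, for now, only calls the old logic, and point every caller at the wrapper. This is your "switch": later steps change what happens behind the wrapper without touching the callers.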
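A wrapper of this kind might look like the sketch below. The function names and the migration flag are hypothetical, chosen for illustration. At this stage the new function deliberately does nothing except delegate, so flipping the switch cannot change behavior.

```typescript
// Hypothetical legacy function, invented for illustration.
function legacyShippingCost(weightKg: number, country: string): number {
  if (country === "US") {
    if (weightKg > 20) return 25;
    return 10;
  }
  if (weightKg <= 0) return 0;
  return 10 + weightKg * 2;
}

// The switch. In a real system this could come from configuration (an assumption).
const migration = { useNewImplementation: false };

// New home for the logic. Nothing has been migrated yet, so it only delegates.
function newShippingCost(weightKg: number, country: string): number {
  return legacyShippingCost(weightKg, country);
}

// The wrapper: the only function callers should use from now on.
function shippingCost(weightKg: number, country: string): number {
  return migration.useNewImplementation
    ? newShippingCost(weightKg, country)
    : legacyShippingCost(weightKg, country);
}

console.log(shippingCost(5, "US")); // 10
```

Because both paths currently end in the same legacy code, this step can ship on its own. It changes where calls go, not what they return.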

🔄

03: REPLACE

Re-implement one logic branch in the new function. Use AI to refactor the logic to modern standards while keeping the tests passing.
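The following sketch shows one branch moved while the rest still falls through to the old code. The legacy function, the constant names, and the choice of which branch to migrate first are all assumptions made for illustration.

```typescript
// Hypothetical legacy function, invented for illustration.
function legacyShippingCost(weightKg: number, country: string): number {
  if (country === "US") {
    if (weightKg > 20) return 25;
    return 10;
  }
  if (weightKg <= 0) return 0;
  return 10 + weightKg * 2;
}

// Magic numbers from the legacy branch, now named.
const HEAVY_THRESHOLD_KG = 20;
const DOMESTIC_FLAT_RATE = 10;
const DOMESTIC_HEAVY_RATE = 25;

function shippingCost(weightKg: number, country: string): number {
  if (country === "US") {
    // Migrated branch. Behavior is kept identical, including the quirk that
    // zero-weight domestic parcels are still charged the flat rate.
    return weightKg > HEAVY_THRESHOLD_KG ? DOMESTIC_HEAVY_RATE : DOMESTIC_FLAT_RATE;
  }
  // Not migrated yet: everything else still runs the legacy code.
  return legacyShippingCost(weightKg, country);
}

console.log(shippingCost(21, "US")); // 25
console.log(shippingCost(5, "DE")); // 20
```

Fixing the zero-weight quirk is a separate decision. During replacement the goal is identical behavior, and any intentional behavior change deserves its own commit and its own updated test.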

✂️

04: STRANGLE

Once every branch has been re-implemented and all characterization tests pass against the new implementation, delete the old code. The legacy code has been "strangled."
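Before deleting anything, it can help to run both implementations side by side and record where they disagree. The sketch below is one possible way to do that. The names runInShadow, legacyCost, and modernCost are hypothetical, and the new implementation contains a deliberate "fix" so the comparison has something to catch.

```typescript
type CostFn = (weightKg: number, country: string) => number;

interface Divergence {
  weightKg: number;
  country: string;
  legacyResult: number;
  modernResult: number | null; // null when the new code threw
}

// Wraps both implementations: callers always get the legacy answer,
// while every disagreement is recorded for review.
function runInShadow(legacy: CostFn, modern: CostFn, log: Divergence[]): CostFn {
  return (weightKg, country) => {
    const legacyResult = legacy(weightKg, country);
    let modernResult: number | null = null;
    try {
      modernResult = modern(weightKg, country);
    } catch {
      // A crash in the new code is logged as a divergence, never shown to callers.
    }
    if (modernResult !== legacyResult) {
      log.push({ weightKg, country, legacyResult, modernResult });
    }
    return legacyResult;
  };
}

// Hypothetical implementations, invented for illustration.
const legacyCost: CostFn = (weightKg, country) => {
  if (country === "US") return weightKg > 20 ? 25 : 10;
  return weightKg <= 0 ? 0 : 10 + weightKg * 2;
};

// This rewrite "fixes" zero-weight domestic parcels, which changes behavior.
const modernCost: CostFn = (weightKg, country) => {
  if (weightKg <= 0) return 0;
  if (country === "US") return weightKg > 20 ? 25 : 10;
  return 10 + weightKg * 2;
};

const divergences: Divergence[] = [];
const shippingCost = runInShadow(legacyCost, modernCost, divergences);

shippingCost(5, "US");
shippingCost(5, "DE");
shippingCost(0, "US"); // legacy returns 10, the rewrite returns 0

console.log(divergences.length); // 1
```

An empty divergence log over a representative period of real traffic is stronger evidence than the tests alone, because it covers inputs nobody thought to write a test for.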

Where AI Fails Legacy Code (The Human Layer)

Static analysis has limits. AI reads the syntax but cannot feel the pulse of a running system.

Verification Checkpoints

⚡RUNTIME BEHAVIOR

Reading source code alone, AI cannot observe the race conditions, memory leaks, or environment-specific configuration issues that only appear in a running system under load.
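A small illustration of why runtime behavior needs its own check: in the sketch below, every line looks reasonable in isolation and a one-call-at-a-time test would pass, yet two overlapping calls overdraw the account. The withdraw function and the balance variable are hypothetical, and the awaited Promise stands in for a database or network call.

```typescript
// Hypothetical shared state, invented for illustration.
let balance = 100;

async function withdraw(amount: number): Promise<boolean> {
  const current = balance;         // read
  if (current < amount) return false;
  await Promise.resolve();         // stands in for a database or network round trip
  balance = current - amount;      // write based on a possibly stale read
  return true;
}

async function demo(): Promise<{ first: boolean; second: boolean; finalBalance: number }> {
  balance = 100;
  // Two requests arrive before either has finished.
  const [first, second] = await Promise.all([withdraw(80), withdraw(80)]);
  return { first, second, finalBalance: balance };
}

demo().then((result) => {
  // Both withdrawals report success, and 160 has left an account that held 100.
  console.log(result); // { first: true, second: true, finalBalance: 20 }
});
```

A characterization test that calls withdraw once at a time would never reveal this, which is why concurrent and load scenarios belong on the human verification list.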

🧠HISTORICAL CONTEXT

AI doesn't know why a specific hack was added in 2022. That 'inefficient' code might be a fix for a specific browser bug.

Key Takeaways

Archaeology first, Engineering second

Legacy code is a system to be understood, not just a problem to be solved. Use AI to map the system before you break ground.

Tests as Truth

Characterization tests are the closest thing you have to a specification. If they pass, the behavior they cover has been preserved; if you skip them, you are guessing.

Strangle, Don't Explode

The goal is a modernized system, not a new one. The Strangler Fig pattern is the safest way to modernize without losing the business logic buried in the legacy layers.