In 1952, the World Health Organization sent a team to Borneo with a clear mission: stop a malaria outbreak. Their tool was DDT. Their execution was flawless. Within weeks, mosquito populations crashed and malaria rates fell sharply. The intervention worked.
Then the roofs started falling in. Rats began spreading plague and typhus. The WHO found itself shipping live cats by parachute into the jungle to fix a catastrophe that had not existed before they arrived.
This is not a story about incompetence. The people involved were experts acting in good faith on the best available evidence. It is a story about what happens when you intervene in a system you have not mapped. The WHO understood the goal: reduce mosquitoes. They did not understand the system: what else DDT would kill, what those things were doing, and what would fill the vacuum they left behind.
In Part 1, we established what a system is — interconnected elements, working toward a purpose, inside a boundary. In Part 2, we build the instruments to see inside systems before we touch them. Five tools. That is the entire toolkit.
Tool 1: The System Map
Think of a system map the way you would think about a satellite view before a hiking trip. You are not learning the terrain at ground level — you are learning who is connected to whom, and through what paths. It is the first thing you build before you do anything else.
Building a system map takes two steps. Nothing more.
Step 1: Identify the elements. Elements are anything you can see, feel, count, or measure. They can be individual components or larger subsystems — departments in a company, species in an ecosystem, microservices in a platform. They can be internal (inside your boundary) or external (outside it, but connected to it). For the little convenience store in your neighborhood, the internal elements are the owner, the inventory, and the bank account. The external elements are customers, the local community, utility providers, and the tax department.
Step 2: Map the interconnections. Draw the flows between elements. Who sends what to whom? The owner manages the inventory. Customers buy from the inventory. That purchase increases the bank account. The bank account funds restocking. The tax department withdraws from the bank account. Each arrow is a claim about how a change in one element propagates to another.
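If it helps to make the two steps concrete, the store map fits in a few lines of code. This is only an illustrative sketch — the element names and flow labels are my own choices, not a standard notation:

```python
# The convenience store as a directed graph: nodes are elements,
# each edge is a claim about how a change in one element reaches another.
flows = {
    ("owner", "inventory"): "manages",
    ("customers", "inventory"): "purchases",
    ("inventory", "bank_account"): "sales revenue",
    ("bank_account", "inventory"): "restocking funds",
    ("bank_account", "tax_department"): "tax withdrawal",
}
internal = {"owner", "inventory", "bank_account"}  # inside the boundary

for (src, dst), label in flows.items():
    scope = "internal" if {src, dst} <= internal else "crosses boundary"
    print(f"{src} --[{label}]--> {dst}  ({scope})")
```

Even at this toy scale, writing the edges down forces the two questions that matter: which flows are you sure exist, and which are guesses?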
Two practical problems you will run into immediately.
The map never ends. You can always zoom in or zoom out. The convenience store expands to a supermarket chain, which expands to include logistics networks, which expands to include global trade agreements. At some point you are mapping everything on the planet. The solution is subsystems. Distribution is a subsystem. Finance is a subsystem. You draw a box around each one and treat it as a single node until you need to go deeper. Break complexity into scoped containers.
Not every element matters. In a map of the supermarket chain, you might include trade associations as an external element. But if trade associations have no realistic influence on your current question, they just add noise. Limit the map to players of influence — elements that will meaningfully change the system's behavior if they change. This is subjective. That is fine.
The goal is not a perfect picture. The goal is a picture that is more accurate than the one you had before you drew it.
Tool 2: Causal Loop Diagrams
A system map tells you who is connected. A causal loop diagram tells you how the connection behaves over time — specifically, whether elements amplify each other or constrain each other. This distinction is the engine of all system dynamics.
There are two types of relationships in any system.
Positive relationship (+): More of A produces more of B. More content produced by a social media influencer → more visibility. More visibility → more followers. More followers → more content. Every arrow in this chain carries a + sign. The loop is self-reinforcing. Call it R.
Negative relationship (−): More of A produces less of B. More negative reviews of an online course → fewer sales. That is a − link: the two variables move in opposite directions. Note that the sign describes direction, not desirability — fewer sales → less revenue is a + link, because the variables move together. A loop containing an odd number of − links feeds back against itself. Call it B — a balancing loop. It is the system's stabilizer, the mechanism that pulls runaway behavior back toward equilibrium.
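A quick numeric sketch makes the difference in behavior visible. The growth rate and target here are arbitrary values chosen for illustration:

```python
# R: a reinforcing loop amplifies itself each step.
# B: a balancing loop closes part of the gap to a target each step.
r_stock, b_stock, target = 10.0, 10.0, 100.0
for step in range(1, 9):
    r_stock *= 1.5                          # R: +50% per step, compounds
    b_stock += 0.5 * (target - b_stock)     # B: halves the remaining gap
    print(f"step {step}: R = {r_stock:7.1f}   B = {b_stock:6.1f}")
```

R runs away; B settles. Every system you will ever map is some combination of these two behaviors.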
Now put the pieces of the online course example together — sales, reviews, support demand, quality — and draw the causal loop diagram. Three things are happening at once.
First, there is a reinforcing loop: more sales generate more reviews, more positive reviews generate more sales. The system feeds itself. In a healthy range, this is a growth engine. It is also why successful course platforms compound over time in ways that new entrants cannot match.
Second, there is a balancing loop: more sales generate more support demand, more support demand increases staff pressure, more staff pressure degrades service quality, more quality problems generate negative reviews, negative reviews reduce sales. The system self-corrects. But notice: the correction is delayed. The pressure builds invisibly for weeks before the reviews arrive.
Third, there is a second balancing loop: negative reviews directly reduce sales. The first balancing loop punishes growth through quality degradation. This one punishes growth through reputation damage. In an extreme scenario — say, sales unexpectedly triple — this loop can dominate entirely, overriding the reinforcing growth loop.
This last observation is loop dominance: the loop that controls system behavior shifts depending on where the system is. A moderate, healthy organization is usually dominated by the reinforcing loop. The same organization at ten times scale may suddenly be dominated by the balancing loops, producing behavior that looks like success turning into failure from the inside.
If you only build and run the reinforcing loop — if you only think about growth — you will be surprised every time the balancing loops assert themselves. And they always do.
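Loop dominance is easy to see in a toy model. The parameters below are invented for illustration: word-of-mouth growth is linear in sales, while support strain grows with the square of sales, so the balancing loop is negligible early and decisive late:

```python
# R contributes +50% of sales each month; B (support strain) scales with sales squared.
sales = 10.0
for month in range(1, 11):
    reinforcing = 0.5 * sales           # growth engine
    balancing = 0.002 * sales ** 2      # quality/support brake
    sales += reinforcing - balancing
    # crude label: B "dominates" once the brake cancels most of the engine
    regime = "R dominates" if reinforcing > 2 * balancing else "B dominates"
    print(f"month {month:>2}: sales = {sales:6.1f}"
          f"  (+{reinforcing:5.1f}, -{balancing:5.1f})  {regime}")
```

The system that grew 50% a month at small scale stalls as it approaches equilibrium at large scale — same structure, different dominant loop.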
Tool 3: Stock and Flow Diagrams
The causal loop diagram shows you relationships. The stock and flow diagram adds the element that makes systems genuinely counterintuitive: time.
Two definitions.
A stock is anything you can measure at a single point in time. The number of employees. The water level in a reservoir. Your bank account balance. The inventory of books on a shelf. The active customer count. What makes something a stock is not what it is — it is that it can be measured right now, at this instant, and the number means something.
A flow is what fills or depletes a stock. Not the stock itself — the movement that changes it. Hiring is a flow (it increases employee headcount). Attrition is a flow (it decreases it). Sales are a flow (depleting inventory). Purchasing from a supplier is a flow (filling it). Flows have rates. Stocks have levels.
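The relationship between the two is mechanical: a stock is the running total of its flows. A minimal sketch:

```python
# The stock (headcount) integrates its flows (hiring in, attrition out).
headcount = 50                       # stock: a level, measurable right now
hires, departures = 6, 4             # flows: rates, in people per month
for month in range(1, 6):
    headcount += hires - departures  # the level accumulates the net rate
    print(f"month {month}: headcount = {headcount}")
```

This is why you cannot change a stock directly — you can only change its flows, and then wait for the level to respond.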
Model a bookstore's inventory the same way: the books on the shelf are the stock, purchasing from the supplier is the inflow, and sales are the outflow. Three things in this model deserve attention.
The delay. When the bookstore orders new stock from the supplier, books do not arrive instantly. There is a lag — days or weeks — before the inflow affects the inventory level. This delay is marked with two parallel lines crossing the flow arrow. It is not decoration. Delays are why systems oscillate. An inventory manager who sees low stock, orders a large batch, and then receives that batch after demand has already dropped will now have excess inventory — and will under-order for the next cycle. The lag between action and consequence creates the oscillation. Every supply chain shortage and glut in history traces back to a delay that the model did not include.
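The oscillation is easy to reproduce. In this sketch — invented numbers, deliberately naive ordering rule — demand is a steady 20 units a week, deliveries arrive three weeks after they are ordered, and the manager reacts to current inventory while ignoring orders already in flight:

```python
from collections import deque

target, inventory, demand = 100, 100, 20
pipeline = deque([0, 0, 0])                        # orders in flight: 3-week lag
for week in range(1, 16):
    inventory += pipeline.popleft() - demand       # delayed inflow, steady outflow
    order = max(0, (target - inventory) + demand)  # naive rule: ignores the pipeline
    pipeline.append(order)
    print(f"week {week:>2}: inventory = {inventory:>3}, ordered = {order:>3}")
```

Inventory swings from 40 to 200 and back, indefinitely, with demand perfectly constant. The oscillation comes entirely from the delay plus a decision rule that does not account for it.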
The stock limit. The shelf can only hold so many books. When inventory hits the limit, the inflow is constrained regardless of the purchase rate. Engineering equivalent: your cache has a maximum size. Your thread pool has a ceiling. Your database write throughput has a limit. Stocks have limits and ignoring them is how systems fail under load.
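The same discipline applies to limits. Another sketch with invented numbers — the shelf caps at 120 books, and the cap, not the order size, decides what actually arrives:

```python
capacity, shelf = 120, 100
for day in range(1, 6):
    shelf -= 30                                # sales outflow
    ordered = 60
    accepted = min(ordered, capacity - shelf)  # the stock limit throttles the inflow
    shelf += accepted
    print(f"day {day}: accepted {accepted} of {ordered}, shelf = {shelf}")
```

Order 60 all you like; the system accepts 30.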
The reinforcing feedback loop. The bookstore discovered something: local collaborations generated positive reviews, positive reviews drove sales, sales generated cash, cash funded more collaborations. This is a reinforcing loop operating on top of the stock-and-flow model. In practice, stock and flow diagrams and causal loop diagrams often merge — because stocks are what accumulate, and loops are what drive the rates that change them.
The deeper you build these models, the more questions emerge. What happens if you introduce a mobile app? What balancing loops will activate when collaborations scale? These are not rhetorical questions. They are the output of the modeling process — which is valuable even before you have the answers.
System Traps: How Good Systems Produce Bad Outcomes
Here is the uncomfortable truth that all five tools in this article eventually lead you to: most system failures are not caused by bad intentions. They are caused by predictable patterns that good intentions walk straight into.
These patterns have a name: system traps.
The Cobra Effect
British-ruled India. Cobras everywhere. Someone in the government had a reasonable idea: pay citizens for every dead cobra they turn in. Crowdsource the solution. It worked — immediately. Cobra carcasses flooded in.
Then enterprising citizens figured out they could breed cobras, kill them, and collect the reward. The incentive designed to reduce cobras was now subsidizing cobra production. When the government discovered this and cancelled the program, the breeders had no further use for their inventory. They released every cobra they had been raising.
Cobra population: higher than when the program started.
The Cobra Effect is a specific trap: an incentive designed to solve a problem ends up rewarding the behavior that makes it worse. You see this constantly in software. Velocity metrics rewarded → engineers close tickets without root-cause investigation. Code coverage metrics mandated → engineers write tests that cover lines without testing behavior. On-call escalation rates measured → engineers resolve incidents without post-mortems. Every metric that becomes a target stops measuring what it used to measure.
Addiction and Shifting the Burden
An organization introduces overtime to cover a staffing shortage. The overtime works. The deadline is met. The shortage is never fixed. Overtime becomes permanent. The organization is now addicted to the intervention that was supposed to be temporary.
This pattern appears identically with:
- External consultants hired for a specific capability gap who become structural dependencies
- Government subsidies introduced during a crisis that entire industries build their unit economics around
- Monitoring alerts set up for a specific incident that are never cleaned up and mask real signals for years
The mechanism is always the same. A short-term intervention relieves the symptom. Relieving the symptom reduces the pressure to fix the root cause. The root cause persists. The intervention becomes load-bearing. The system is now dependent on something that was never designed to be permanent.
Drift to Low Performance
This one is the quietest and most dangerous. Performance standards in a system can decline so gradually, and so continuously, that no single moment ever triggers an alarm. The product that used to ship features every two weeks now ships every six. The customer support response time that used to be two hours is now two days. The code review that used to catch architectural issues now approves anything that passes the linter.
Each individual step down was small enough to accept. Together they represent a fundamental collapse. By the time customers leave for competitors, or a new hire joins and is visibly shocked by the pace, the drift has been going on for years.
Side by side, the three traps and their countermeasures:

| Trap | Mechanism | Countermeasure |
| --- | --- | --- |
| Cobra Effect | Metrics reward output. Teams optimize for the metric. The metric diverges from the goal. | Track behavior, not output. A PR merged is not a problem solved. Measure root-cause resolution rate. |
| Addiction | A temporary fix becomes structural. The root problem is never addressed because the pain is masked. | Name the intervention as temporary at inception. Set a review date. Build a forcing function to address the root cause. |
| Drift | Each small decline is accepted. The cumulative decline is invisible until customers are already gone. | Set absolute performance floors, not relative ones. Compare to the baseline from two years ago, not last quarter. |
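The drift countermeasure is worth a sketch of its own, because the arithmetic is the whole point. With invented numbers: support response time degrades roughly 20% per quarter — each step looks tolerable against the last quarter and terrible against the baseline:

```python
response_hours = [2.0, 2.4, 2.9, 3.5, 4.2, 5.0, 6.1, 7.3]  # hypothetical history

baseline = response_hours[0]
for q in range(1, len(response_hours)):
    vs_last = response_hours[q] / response_hours[q - 1]  # relative: always small
    vs_base = response_hours[q] / baseline               # absolute: exposes drift
    flag = "DRIFT" if vs_base > 1.5 else "ok"
    print(f"Q{q}: {response_hours[q]:.1f}h  "
          f"(+{(vs_last - 1) * 100:.0f}% vs last quarter, "
          f"{vs_base:.1f}x baseline)  {flag}")
```

Every quarter-over-quarter step is around 20%. By Q7 the absolute number is 3.6× the floor. The relative comparison never fires; the absolute one fires at Q3.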
The traps are predictable because they are deeply human. The Cobra Effect exploits our instinct to optimize our own return. Addiction exploits our preference for easy, immediate relief over hard, deferred solutions. Drift exploits our tendency to anchor to recent experience rather than historical baselines.
Every system we build and run is guided by people. When you design an incentive structure, you are not designing for rational actors in a model. You are designing for humans with cognitive biases, competing priorities, and finite attention. The trap-aware engineer designs for that reality, not the idealized version.
Leverage Points: Where Small Shifts Create Big Changes
Every intervention in a system is an investment. Some investments return almost nothing. Others change everything. Leverage points are the places where a small shift produces disproportionate impact.
There is no algorithm for finding them. But there are three categories that consistently produce the highest return.
1. Change the Goal
The most powerful thing you can change in any system is what it is optimizing for. Not how it operates — what it is trying to achieve. This change propagates through every element, every incentive, every decision.
A customer support center measures success by tickets resolved per hour. Change the goal to: reduce the recurrence of tickets. Every single behavior in that organization shifts. Engineers are now incentivized to find root causes. Support staff are incentivized to document patterns. Product teams receive structured feedback instead of noise. The physical structure of the system — the people, the tools, the processes — can remain identical. The goal change rewrites everything.
A training organization shifts from "passing the test" to "developing problem-solving capability." Curriculum design changes. Assessment design changes. What instructors reward in student responses changes. The syllabus might look similar. The graduates are fundamentally different.
Goal changes are the highest-leverage intervention you have access to. They are also the most resisted, because they threaten every optimization that was built for the old goal.
2. Change the Rules
Rules — the structures that govern how elements can interact — shape behavior at a systemic level. The classic example: most large organizations operate under shareholder-primacy governance. Whoever holds enough equity has ultimate decision authority. Change that rule to stakeholder governance — where employees, customers, communities, and shareholders all have voice — and the entire organization behaves differently. Investment horizons lengthen. Cost-cutting decisions get weighed against employee and community impact. Long-term brand decisions are treated as financial decisions.
Rule changes are slower to implement than goal changes but equally durable. They work by reshaping the boundary conditions within which all elements operate.
3. Adjust or Add Feedback Loops
Reinforcing loops are growth engines — if you are a B2C business that benefits from word-of-mouth, deliberately designing a reinforcing loop to fuel that is a leverage point. You are not adding resources. You are structuring the system so its own dynamics accelerate what you want.
Balancing loops are circuit breakers — if a reinforcing loop is heading somewhere dangerous, adding a balancing loop contains it. Governments use laws against pyramid schemes as exactly this kind of balancing leverage point. The reinforcing loop (recruit more people → more people to recruit from → recruit more people) would otherwise compound indefinitely. The law does not eliminate the structure. It adds a counterforce.
Leverage points can be used to drive a system or to stabilize it. The same mechanism — a new feedback loop — can do either depending on whether it reinforces or balances. Before you push a leverage point, know which direction you are pushing and what will happen when the system responds.
To identify leverage points: map the system fully, trace all flows and rules, simulate proposed changes. A leverage point that seems obvious from inside the system often has unintended cascade effects that only appear when you model the whole.
Two Case Studies
Theory is a map. Case studies are the terrain.
The Borneo DDT Disaster: A System Without a Map
The facts, assembled in sequence.
The World Health Organization sprays DDT to kill mosquitoes. The mosquitoes die. Malaria rates fall. Win.
But DDT is broadly toxic to insects. The parasitic wasps that had been eating thatch-eating caterpillars also die. With their predator eliminated, the caterpillar population explodes. The caterpillars eat the thatched roofs of houses. Roofs begin collapsing.
Simultaneously, geckos eat the dying DDT-poisoned insects and absorb the toxin. The geckos become toxic. Cats eat the geckos. The cats die. With the cats gone, the rat population multiplies unchecked. The rats carry two diseases that were not present before: bubonic plague and typhus. The people of Borneo now face not one health crisis but three: the original malaria, collapsing houses, and two new epidemics.
The WHO's solution: Operation Cat Drop. Hundreds of live cats, parachuted into Borneo to reintroduce the missing balancing loop.
What the system map would have revealed, had anyone drawn it before spraying:
- DDT is not targeted. It kills broadly.
- Wasps are a balancing loop for caterpillars. Remove the wasps, and the caterpillar stock is unconstrained.
- Geckos eat insects. Cats eat geckos. Rats are controlled by cats. Remove the cats, and rats have no balancing loop.
- The boundary of "our intervention" was drawn around mosquitoes. The actual system boundary included every species in the local food web.
This is not hindsight criticism of people who lived in 1952. It is a demonstration of what systems tools make visible that intuition alone does not. A causal loop diagram of the Borneo ecosystem — drawn before the spraying began — would have shown the two balancing loops that DDT was about to break. The tool does not require perfect knowledge. It requires the discipline to ask: "What else is connected here, and what does it do?"
The Hero Food Waste Startup: Leverage Found in a Broken Flow
Consider a small bakery in Hong Kong with a simple stock and flow structure. Each morning, baked goods arrive from the supplier — an inflow. Throughout the day, sales deplete the stock — an outflow. At closing time, whatever remains cannot be sold the next day. It flows to the landfill. That is a permanent daily sink.
A startup called Hero mapped this system and found a leverage point.
The flow from the bakery's stock to the landfill was waste by design. Hero redirected it. They built a new stock — a digital platform — that captures end-of-day surplus and sells it to customers at a discount, before the goods reach the landfill. The bakery captures revenue it previously lost. Customers access food at reduced cost. Hero takes a commission. The flow to the landfill shrinks.
This is a textbook leverage point: change a flow, not a goal, not a rule — a single flow in an existing system. And it generated a new business.
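The leverage is visible in a back-of-envelope model. All numbers here are invented: a bakery bakes 100 items, sells 80 at full price, and can redirect 80% of the surplus through the platform at half price:

```python
def close_of_day(baked=100, sold=80, redirect=0.0, price=1.0, discount=0.5):
    surplus = baked - sold              # the stock left at closing time
    rescued = int(surplus * redirect)   # flow redirected to the platform
    landfill = surplus - rescued        # the shrunken waste flow
    revenue = sold * price + rescued * price * discount
    return landfill, revenue

print("without Hero:", close_of_day())              # (20, 80.0)
print("with Hero:   ", close_of_day(redirect=0.8))  # (4, 88.0)
```

One flow changed; waste drops 80% and revenue rises 10%, before Hero's commission. The same model is where the warning signs below come from: watch what happens to `baked` once `rescued` becomes reliable income.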
But Hero's story is not finished. As a systems thinker, consider the questions their model raises:
What feedback loops is Hero creating? As Hero scales, bakeries may start intentionally overproducing to have surplus for the platform. The outflow to landfill was the signal that kept production calibrated to demand. Removing that signal may break the calibration. Hero's reinforcing loop (more surplus → more discount customers → more bakery partners → more surplus) may inadvertently create a balancing loop that undermines it: more structural overproduction → worse unit economics for bakeries → bakeries leave the platform.
What are the system traps? Hero could drift into being a discounted food channel rather than a food waste solution — similar to addiction, where the temporary intervention becomes the product. The purpose (reduce waste) may gradually be replaced by the incentive (capture a commission per transaction). If that happens, the system that Hero inhabits — the food supply chain — has not changed at all. Just the flow of money.
The Hero case study is intentionally left open-ended — and that is the point. The value of these tools is not that they give you answers. It is that they force you to ask the questions that would never have occurred to you without the map.
Model It: The Cascade in Code
The Borneo disaster can be modeled as a stock and flow simulation in under thirty lines. Run it and the data does what the diagram cannot: it shows you the lag between the intervention and the crisis, and tells you exactly when the system crosses the threshold.
```python
# The goal (reduce mosquitoes) is achieved — and three crises emerge from the same action
def simulate_borneo(weeks=10, spray_week=2):
    """
    Five stocks, each connected to the next.
    DDT reduces mosquitoes (the goal) but collapses two balancing loops:
        Wasps → Caterpillars (roof collapse)
        Geckos → Cats → Rats (plague/typhus)
    """
    mosquitoes, wasps, caterpillars, cats, rats = 100, 100, 20, 100, 20
    print(f"{'Week':>5} | {'Mosquito':>9} | {'Wasps':>6} | {'Caterpil':>9} | {'Cats':>5} | {'Rats':>5} | Note")
    print("─" * 72)
    for week in range(1, weeks + 1):
        ddt = week >= spray_week
        note = ""
        if ddt:
            mosquitoes = max(0, int(mosquitoes * 0.50))  # DDT kills mosquitoes
            wasps = max(0, int(wasps * 0.55))            # collateral — also toxic
            cats = max(0, int(cats * 0.82))              # food chain poisoning
            if week == spray_week:
                note = "<-- DDT begins"
        # Caterpillars grow when wasp predator pressure drops
        wasp_pressure = min(1.0, wasps / 100)
        caterpillars = min(500, int(caterpillars * (1.50 - wasp_pressure * 0.55)))
        # Rats grow when cat predator pressure drops
        cat_pressure = min(1.0, cats / 100)
        rats = min(600, int(rats * (1.35 - cat_pressure * 0.45)))
        print(f"{week:>5} | {mosquitoes:>9} | {wasps:>6} | {caterpillars:>9} | {cats:>5} | {rats:>5} | {note}")

simulate_borneo()
```
```
 Week |  Mosquito |  Wasps |  Caterpil |  Cats |  Rats | Note
────────────────────────────────────────────────────────────────────────
    1 |       100 |    100 |        19 |   100 |    18 |
    2 |        50 |     55 |        22 |    82 |    17 | <-- DDT begins
    3 |        25 |     30 |        29 |    67 |    17 |
    4 |        12 |     16 |        40 |    54 |    18 |
    5 |         6 |      8 |        58 |    44 |    20 |
    6 |         3 |      4 |        85 |    36 |    23 |
    7 |         1 |      2 |       126 |    29 |    28 |
    8 |         0 |      1 |       188 |    23 |    34 |
    9 |         0 |      0 |       282 |    18 |    43 |
   10 |         0 |      0 |       423 |    14 |    55 |
```
Read the output as a timeline. Week 2: DDT begins, mosquitoes halve — the intervention works. Weeks 3–5: wasps collapse, caterpillars start rising, but slowly — the crisis is invisible. Weeks 6–8: wasps near zero, caterpillars passing 85, cats at 36, rats beginning to climb — warning territory, but still easy to rationalize as temporary. Week 9: wasps gone, caterpillars at 282 (14× the starting level), rats at 43. Week 10: caterpillars at 423, cats at 14, rats at 55.
The mosquito goal was achieved by week 3. The cascades became serious by week 7. The lag between "this is working" and "we have three new crises" was five weeks. In real time that was months — long enough for the intervention to feel like a success before the consequences arrived.
The model is not precise. It is not trying to be. It is trying to make a structural argument visible: when you remove a balancing loop, the stock it was constraining grows without limit. When you do it in two places simultaneously, you get two simultaneous crises. That argument is invisible in prose. It is unmistakable in the data.
Applying the Toolkit
Understanding systems and having the tools to map them is not enough on its own. Here is what distinguishes the people who use these tools habitually from the people who reach for them only occasionally.
Complex systems are unpredictable. You cannot know everything about the system you are working in. Acknowledge the limits of your map — and keep updating it when reality surprises you.
Map before you move. Identify boundaries, elements, relationships, behaviors, patterns. Use every tool in this article. A thorough analysis before intervention is not overhead — it is what separates the DDT team from the team that maps the wasps and the cats first.
Resilience is a system's ability to absorb shocks and adapt. Build it with delays and balancing feedback loops — not just speed and efficiency. The system that breaks fastest is often the one optimized hardest for a single scenario.
It is always easier to treat the symptom. The painkiller, the overtime, the discount — all of these relieve pressure immediately. But now you have the tools to simulate your intervention over time before you make it. Use them. Sustainability is a systems property, not a values statement.
Systems thinking does not replace design thinking, agile, or lean — it enriches them. Design thinking builds empathy for elements. Agile creates feedback loops by structure. Lean eliminates waste flows. Use all of them together. The mental model is the architecture; the other frameworks are the implementation patterns.
Everything in the world around you is a system. You are already playing a part in multiple systems and likely have the power to influence some of them. When you design an architecture, structure a team, set a metric, or propose an intervention — you are engineering a system. The only question is whether you are doing it with a map or without one.
Pro Tips for Systems Engineers
Build the map before the meeting. When you are about to propose a significant architectural or organizational change, spend 30 minutes drawing a system map first. The questions it surfaces are worth more than the first hour of any stakeholder conversation.
Label every loop R or B. When you draw a causal loop diagram, force yourself to classify each loop. If you cannot clearly label it reinforcing or balancing, you do not yet understand the relationship. That uncertainty is important data.
Always ask: which balancing loop am I breaking? Every intervention that removes a constraint is breaking a balancing loop. Before removing it, ask what it was constraining and what will grow unchecked in its absence. This one question would have saved Borneo.
Find the delays before you set expectations. Every delay in your system is a place where someone will be surprised. Map the delay from a code change to customer impact, from a hiring decision to team productivity, from a pricing change to churn. Knowing the lag is how you set accurate expectations.
Changing a goal is the highest leverage move you have. If you have organizational influence and you are trying to fix a systemic problem, do not add a new process. Change the goal. What the system optimizes for determines everything downstream. A change at the goal level beats implementation changes at the element level, every time.
The WHO did not fail because of bad science. They failed because their map ended at the mosquito.
Every system failure you will encounter in your career has this same shape: a boundary drawn too narrow, a balancing loop not mapped, a delay not modeled. The toolkit in this article does not prevent failure. It makes failure visible — before it happens.