
Build a Local AI Agent: The 4-Step Agentic Loop That Runs Everything Offline

I built an AI agent that detects my location, checks the weather, finds national parks, and gives me opinionated hiking recommendations — all running locally with zero cloud APIs. This is the 4-step agentic loop that powers every modern AI agent, and how to build one from scratch.

March 14, 2026
15 min read
#AI Agent · #Ollama · #Local LLM · #Agentic Loop · #Python · #ReAct · #Tool Use · #Open Source
THE MISUNDERSTOOD DIFFERENCE
A chatbot responds to your question. An agent decides what to do next, executes actions across external systems, evaluates the result, and adjusts its strategy — all without you asking for each step.
The conceptual leap is small. The practical difference is enormous. This article builds a complete, production-quality hiking recommendation agent that runs entirely on your local machine — no cloud APIs, no monthly fees, no data leaving your computer.

I was planning a hike. I opened Google Maps to find my location, then a weather app, then searched for national parks, then read trail reviews across three different websites. Five apps, fifteen minutes, forty browser tabs. The cognitive overhead was exhausting for a task this simple.

So I built an agent. I typed: "Find me a place to hike today."

The agent detected my location, checked the weather, decided conditions were acceptable, found eight national parks within range, analyzed trails, and gave me three ranked recommendations with specific reasoning — in about 20 seconds. Running entirely on a 20-billion parameter open-source model on my laptop.

This article shows you exactly how to build that — and more importantly, how to understand the underlying architecture so you can build any agent you can imagine.


Chatbots vs. Agents: The Fundamental Difference

Before writing a line of code, understand the architectural distinction.

Chatbot Architecture
→ User sends a message
→ Model generates a response
→ Done
All knowledge comes from training data. Cannot interact with the real world. Cannot take actions. Output is always text.
Agent Architecture
→ User states a goal
→ Agent plans: what tools do I need?
→ Agent calls tools, collects data
→ Agent reflects: did I achieve the goal?
→ If not, agent adjusts and retries
→ Agent synthesizes and responds
Can query APIs, read files, call services, make decisions based on real-time data, and take multi-step actions.

The difference is not intelligence — it's architecture. An agent has tools, a planning loop, and the ability to reflect on whether it accomplished its goal.


The 4-Step Agentic Loop

Every AI agent — regardless of framework, language model, or use case — follows the same fundamental loop. Understanding this loop means you can build any agent.

The Universal Agentic Loop
1. Understand Intent
Parse the user's goal into a structured task. Extract: what to do, under what constraints, what success looks like. LLMs are excellent at this — they can turn "find me a place to hike today" into a structured set of requirements.
2. Make a Plan
Based on available tools, decide the sequence of actions. "I need location → then weather → then if conditions good → search parks → analyze trails → rank options." The plan can be implicit (the model decides) or explicit (you structure it in the system prompt).
3. Use Tools
Execute the plan by calling tools — functions that interact with the real world. Location API, weather API, parks database, file system, web search. Tools are just Python functions with clear descriptions. The LLM calls them, you execute them.
4. Reflect
"Did I accomplish the goal? Is this the best answer? Did any tool fail?" Reflection is not just error checking — it's quality evaluation. A reflecting agent asks whether its output is actually useful, not just technically complete. If not satisfied, it replans and retries.

The loop can run once (simple tasks) or iterate multiple times (complex tasks where each step changes what's possible next). The sophistication of the agent comes not from the model, but from how well you design this loop.
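The loop above can be sketched as a plain Python skeleton. This is illustrative only — the function and prompt names are mine, not from the project files we build below, and `llm` stands in for any text-in/text-out model call:

```python
def run_agent(goal, tools, llm, max_iterations=3):
    """Minimal sketch of the 4-step agentic loop.

    `llm` is any callable that takes a prompt string and returns text;
    `tools` maps tool names to zero-argument callables. Illustrative only.
    """
    # 1. Understand intent
    task = llm(f"Restate this goal as a concrete task: {goal}")
    for _ in range(max_iterations):
        # 2. Make a plan
        plan = llm(f"Plan steps for '{task}' using tools {list(tools)}")
        # 3. Use tools
        results = {name: fn() for name, fn in tools.items()}
        # 4. Reflect: is the goal met, given the plan and the data?
        verdict = llm(f"Plan: {plan}. Goal: {task}. Data: {results}. Done? yes/no")
        if verdict.strip().lower().startswith("yes"):
            return llm(f"Answer '{task}' using: {results}")
    return "Gave up after max iterations"
```

The hiking agent we build next hardcodes the plan (location → weather → parks → trails), which is the right call when the tool sequence is known in advance.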


Setting Up Your Local LLM Environment

The agent in this guide runs entirely on your machine using Ollama — the most widely used runtime for local open-source models.

Step 1: Install Ollama

Ollama is available for macOS, Windows, and Linux. Download from ollama.com and install normally.

Step 2: Download a Model

Different models have different strengths for agentic tasks:

Model       | RAM Required | Speed     | Best For
llama3.2:3b | 4GB          | Very fast | Simple agents, rapid iteration
llama3.1:8b | 8GB          | Fast      | Balanced agents, good instruction following
mistral:7b  | 8GB          | Fast      | Tool use, structured outputs
qwen2.5:14b | 16GB         | Moderate  | Complex reasoning, multi-step agents
# Download the recommended model for this tutorial
ollama pull llama3.2

# Test it works
ollama run llama3.2 "Say hello in 5 words"

# List installed models
ollama list

Step 3: Project Setup

# Create project directory
mkdir hiking-agent && cd hiking-agent

# Set up Python virtual environment
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies
pip install ollama requests geocoder python-dotenv

# Create environment file
echo "NPS_API_KEY=your-key-here" > .env

Get a free National Park Service API key at nps.gov/subjects/developer/get-started.htm — it's free and instant.


Building the Agent: Layer by Layer

We'll build this agent in five files, each with a clear single responsibility:

main.py: Orchestrates the full agentic loop
location.py: IP-based location detection
weather.py: Weather forecast from Open-Meteo (free, no API key)
parks.py: National Park Service API — parks and trails
config.py: Loads environment variables

Tool 1: Location Detection (location.py)

import geocoder
from typing import Optional, Tuple

def get_current_location() -> Tuple[Optional[float], Optional[float], Optional[str]]:
    """
    Detect user location via IP geolocation.
    Returns: (latitude, longitude, state_abbreviation)
    Falls back to None values on failure — agent handles this gracefully.
    """
    try:
        g = geocoder.ip('me')
        if not g.ok:
            return None, None, None

        lat, lng = g.latlng
        state = g.state

        # NPS API needs 2-letter abbreviations like "CA", not "California"
        state_map = {
            "California": "CA", "Texas": "TX", "Florida": "FL",
            "New York": "NY", "Washington": "WA", "Oregon": "OR",
            "Colorado": "CO", "Utah": "UT", "Arizona": "AZ",
            # Add more as needed, or use a library like us-states
        }
        # Naive fallback: "Nevada"[:2] gives "NE" (Nebraska); a full mapping is safer
        state_abbr = state_map.get(state, state[:2].upper() if state else None)

        return lat, lng, state_abbr

    except Exception as e:
        print(f"Location detection failed: {e}")
        return None, None, None

Why plain English output matters: Notice this tool returns structured data (latitude, longitude, state). Later, when we ask the LLM to reason about which parks to visit, we'll convert this to natural language — because LLMs reason better over text than raw coordinates.
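That conversion can live in a tiny helper. This one is hypothetical — the tutorial files do it inline when building prompts — but it shows the shape of the idea:

```python
def location_to_text(lat, lng, state):
    """Render structured location data as plain English for LLM prompts.

    Hypothetical helper; the tutorial builds these strings inline instead.
    """
    if lat is None or lng is None:
        return "Location unknown."
    state_part = f" in {state}" if state else ""
    return f"The user is{state_part}, near latitude {lat:.2f} and longitude {lng:.2f}."
```

Feeding "The user is in CA, near latitude 37.77..." to the model produces noticeably better reasoning than handing it a raw `(37.77, -122.42, "CA")` tuple.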

Tool 2: Weather Forecast (weather.py)

import requests
from datetime import datetime
from typing import Optional

def get_weather_summary(latitude: float, longitude: float) -> Optional[str]:
    """
    Fetch current day weather from Open-Meteo (free API, no key required).
    Returns a plain English summary optimized for LLM reasoning.
    """
    try:
        url = "https://api.open-meteo.com/v1/forecast"
        params = {
            "latitude": latitude,
            "longitude": longitude,
            # Only the daily summary is used below, so skip the hourly fields
            "daily": "weathercode,temperature_2m_max,temperature_2m_min,precipitation_sum",
            "timezone": "auto",
            "forecast_days": 1
        }
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()

        # Extract the daily summary values
        daily = data.get("daily", {})
        temp_max = daily.get("temperature_2m_max", [None])[0]
        temp_min = daily.get("temperature_2m_min", [None])[0]
        precip = daily.get("precipitation_sum", [None])[0]

        # Convert weather code to description
        weather_codes = {
            0: "clear sky", 1: "mainly clear", 2: "partly cloudy",
            3: "overcast", 45: "foggy", 51: "light drizzle",
            61: "light rain", 63: "moderate rain", 71: "light snow",
            80: "rain showers", 95: "thunderstorm"
        }
        code = daily.get("weathercode", [0])[0]
        description = weather_codes.get(code, f"weather code {code}")

        # Guard against missing values before formatting
        if temp_max is None or temp_min is None or precip is None:
            return None

        # Return plain English — better for LLM reasoning than raw JSON
        summary = (
            f"Today's weather: {description}. "
            f"Temperature range: {temp_min:.0f}°C to {temp_max:.0f}°C. "
            f"Total precipitation: {precip:.1f}mm. "
            f"{'Good conditions for outdoor activities.' if precip < 5 and temp_max > 8 else 'Marginal conditions — consider indoor alternatives.'}"
        )
        return summary

    except Exception as e:
        print(f"Weather fetch failed: {e}")
        return None

Tool 3: National Parks Search (parks.py)

import requests
import os
from typing import List, Dict, Optional

def get_parks(state_code: str, api_key: str, limit: int = 20) -> List[Dict]:
    """
    Fetch national parks by state using the NPS API.

    Args:
        state_code: Two-letter state abbreviation (e.g., 'CA', 'CO')
        api_key: NPS API key (free from nps.gov/subjects/developer)
        limit: Maximum number of parks to return

    Returns:
        List of park dictionaries with name, parkCode, description, state
    """
    try:
        url = "https://developer.nps.gov/api/v1/parks"
        params = {
            "stateCode": state_code,
            "limit": limit,
            "api_key": api_key
        }
        response = requests.get(url, params=params, timeout=15)
        response.raise_for_status()

        parks = response.json().get("data", [])
        return [
            {
                "name": p.get("fullName", p.get("name", "Unknown")),
                "code": p.get("parkCode", ""),
                "description": p.get("description", "")[:300],  # Truncate for context
                "state": p.get("states", state_code)
            }
            for p in parks
        ]

    except requests.exceptions.RequestException as e:
        print(f"NPS API request failed: {e}")
        return []


def get_trails(park_code: str, api_key: str) -> List[Dict]:
    """
    Fetch trails/activities for a specific park.
    Falls back gracefully if the park has no trail data.
    """
    try:
        url = "https://developer.nps.gov/api/v1/thingstodo"
        params = {"parkCode": park_code, "limit": 10, "api_key": api_key}
        response = requests.get(url, params=params, timeout=15)
        response.raise_for_status()

        items = response.json().get("data", [])
        return [
            {
                "title": item.get("title", "Unknown"),
                "duration": item.get("duration", ""),
                "difficulty": item.get("difficulty", ""),
                "description": item.get("shortDescription", "")[:200]
            }
            for item in items
            if "hik" in item.get("title", "").lower() or  # Filter for hiking
               "trail" in item.get("title", "").lower()
        ]

    except Exception as e:
        print(f"Trail fetch failed for {park_code}: {e}")
        return []

Main Orchestrator: The Agentic Loop (main.py)

This is where the four steps come together. Note the two reflection points — they're what make this an agent rather than a script.

import ollama
import os
from dotenv import load_dotenv
from location import get_current_location
from weather import get_weather_summary
from parks import get_parks, get_trails

load_dotenv()
NPS_API_KEY = os.getenv("NPS_API_KEY")
MODEL = "llama3.2"

# ─── Conversation memory ──────────────────────────────────────────────────────
# Maintaining history transforms this from one-shot to conversational
conversation_history = []

def chat(system_prompt: str, user_prompt: str, remember: bool = True) -> str:
    """
    Query the local LLM with optional conversation memory.

    Args:
        system_prompt: Role and behavioral instructions for the model
        user_prompt: The actual content/question for this turn
        remember: If True, appends to conversation_history (for follow-ups)
    """
    messages = [{"role": "system", "content": system_prompt}]

    if remember:
        messages += conversation_history

    messages.append({"role": "user", "content": user_prompt})

    response = ollama.chat(model=MODEL, messages=messages)
    reply = response["message"]["content"]

    if remember:
        conversation_history.append({"role": "user", "content": user_prompt})
        conversation_history.append({"role": "assistant", "content": reply})

    return reply


def run_hiking_agent():
    print("🥾 Hiking Agent — Running on local LLM\n" + "="*50)

    # ── STEP 1: Gather data via tools ─────────────────────────────────────────
    print("📍 Detecting location...")
    lat, lng, state = get_current_location()

    if not all([lat, lng, state]):
        print("❌ Could not detect location. Check internet connection.")
        return

    print(f"   Found: {state} ({lat:.2f}, {lng:.2f})")

    print("🌤  Fetching weather...")
    weather = get_weather_summary(lat, lng)
    if not weather:
        print("⚠️  Weather data unavailable. Proceeding without it.")
        weather = "Weather data unavailable."

    print(f"   {weather}\n")

    # ── STEP 2: REFLECTION 1 — Weather suitability decision ──────────────────
    # This is where the LLM uses judgment, not just script logic
    hiking_decision = chat(
        system_prompt=(
            "You are an outdoor safety advisor. Given weather conditions, "
            "respond with ONLY 'yes' if hiking is advisable today, or 'no' if not. "
            "No explanation. Single word answer."
        ),
        user_prompt=f"Should someone go hiking today? Weather: {weather}",
        remember=False  # One-shot decision, no memory needed
    ).strip().lower()

    # startswith avoids false matches like the "no" inside "know"
    if hiking_decision.startswith("no"):
        print("🚫 Agent decision: Conditions not suitable for hiking today.")
        print(f"   Reason: {weather}")
        return

    print("✅ Agent decision: Good hiking conditions. Searching for parks...\n")

    # ── STEP 3: Search for parks and trails ───────────────────────────────────
    print(f"🏕  Searching parks in {state}...")
    all_parks = get_parks(state, NPS_API_KEY, limit=25)

    if not all_parks:
        print(f"❌ No parks found in {state}. Try setting your state manually.")
        return

    # ── STEP 4: REFLECTION 2 — Park selection (intelligent filtering) ─────────
    # 25 parks → 8 diverse, interesting options
    # This avoids sending too much data to a small local model
    parks_list = "\n".join([f"- {p['name']}: {p['description'][:150]}" for p in all_parks])

    selected_names = chat(
        system_prompt=(
            "You are an expert travel guide. Analyze this list of national parks "
            "and select the 8 most interesting and geographically diverse options "
            "for a day hike. Consider variety: different park types, different terrains, "
            "different distances from a central point. Return ONLY a comma-separated list "
            "of park names. No explanation, no numbering."
        ),
        user_prompt=f"Select 8 diverse parks from this list:\n{parks_list}",
        remember=False
    )

    # Filter to selected parks
    selected_parks = [
        p for p in all_parks
        if any(name.strip().lower() in p['name'].lower()
               for name in selected_names.split(',')
               if name.strip())  # skip empty fragments, which match every park
    ][:8]  # Safety limit

    print(f"   Selected {len(selected_parks)} parks for trail analysis.\n")

    # Gather trail data for selected parks
    parks_with_trails = []
    for park in selected_parks:
        trails = get_trails(park['code'], NPS_API_KEY)
        parks_with_trails.append({**park, "trails": trails})

    # ── STEP 5: REFLECTION 3 — Final recommendations ──────────────────────────
    parks_summary = ""
    for park in parks_with_trails:
        trail_text = "\n".join(
            [f"  Trail: {t['title']} | Difficulty: {t['difficulty']} | {t['description']}"
             for t in park['trails'][:3]]
        ) or "  No specific trail data available."

        parks_summary += f"\n\n{park['name']}:\n{park['description'][:200]}\n{trail_text}"

    recommendations = chat(
        system_prompt=(
            "You are an expert hiking guide and outdoor adventurer. You know every "
            "national park in the US intimately. Analyze the parks and trails provided "
            "and recommend the top 2-3 options for a day hiker today. "
            "For each recommendation include: (1) why it's suited for today, "
            "(2) which specific trail to start with, (3) one thing to watch out for. "
            "Be specific and opinionated — give your honest recommendation, not a generic list."
        ),
        user_prompt=(
            f"Location: {state}\n"
            f"Today's conditions: {weather}\n"
            f"Available parks and trails:\n{parks_summary}\n\n"
            "Give your top 2-3 hiking recommendations for today."
        ),
        remember=True  # This goes into memory for follow-up questions
    )

    print("\n" + "="*50)
    print("🏔  HIKING RECOMMENDATIONS")
    print("="*50)
    print(recommendations)

    # ── STEP 6: Conversational follow-up ─────────────────────────────────────
    print("\n" + "="*50)
    print("💬 Ask me anything about these recommendations:")
    print("   (Type 'quit' to exit)\n")

    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in ('quit', 'exit', 'q'):
            break
        if not user_input:
            continue

        response = chat(
            system_prompt="You are an expert hiking guide. You know all the parks we just discussed. Be helpful, specific, and honest.",
            user_prompt=user_input,
            remember=True
        )
        print(f"\nAgent: {response}\n")


if __name__ == "__main__":
    run_hiking_agent()

Why Reflection Is the Most Important Concept

Most agent tutorials skip past reflection or treat it as an error-checking step. It's not. Reflection is what separates a useful agent from a brittle script.

Three Types of Reflection in Our Hiking Agent
Reflection 1: Binary Decision
"Should we proceed at all?" — The agent uses judgment to decide if hiking is appropriate given today's weather. This prevents the agent from wasting time searching for parks on a day nobody should be hiking.
Reflection 2: Data Filtering
"Which 8 of 25 parks are actually worth analyzing?" — Without this, a small local model gets overwhelmed with too much context, producing unfocused recommendations. The reflection step improves quality by reducing noise.
Reflection 3: Synthesis and Opinion
"What are the BEST options and why?" — This transforms a data dump into an opinionated recommendation. The agent isn't just listing parks — it's making a judgment call based on conditions, trail characteristics, and suitability.
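All three reflections share one shape: generate, evaluate, and if the evaluation fails, try again with the critique folded in. A generic sketch of that pattern (my abstraction, not code from the project; `generate` and `evaluate` stand in for LLM calls):

```python
def reflect_and_retry(generate, evaluate, max_attempts=3):
    """Generate output, ask an evaluator whether it's good enough, and
    retry with the critique as feedback. A sketch of the reflection
    pattern; both callables are stand-ins for LLM calls.
    """
    feedback = None
    output = None
    for _ in range(max_attempts):
        output = generate(feedback)
        verdict = evaluate(output)      # "ok", or a critique string
        if verdict == "ok":
            return output
        feedback = verdict              # fold the critique into the next attempt
    return output                       # best effort after max_attempts
```

The hiking agent runs each reflection once rather than looping, but the retry version is what you want once outputs start varying in quality.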

Making Your Agent Robust: Error Handling

Production agents fail gracefully. Every external API call should be wrapped so that one tool failure doesn't crash the entire agent.

Fragile — One failure crashes everything

def get_weather(lat, lng):
    response = requests.get(url)
    data = response.json()
    return data["daily"]["temperature_2m_max"][0]

Robust — Fails gracefully, agent adapts

def get_weather(lat, lng):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        data = response.json()
        return data["daily"]["temperature_2m_max"][0]
    except Exception as e:
        print(f"Weather failed: {e}")
        return None  # Agent handles None

Key patterns:

  • Always set timeout on requests — hung requests kill agents
  • Use raise_for_status() to catch HTTP errors explicitly
  • Return None (not raise) from tools — let the agent's logic handle missing data
  • Log failures but don't stop execution if the data isn't critical
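These patterns can also be bundled into one wrapper that any tool call goes through. This is a sketch of my own, with an added retry for transient failures; the tutorial's tools each carry their own try/except instead:

```python
import time

def call_tool_safely(fn, *args, retries=2, delay=0.5, **kwargs):
    """Call a tool function, retrying transient failures, and return None
    instead of raising so the agent loop can adapt.

    Sketch only; retries/delay defaults are arbitrary choices.
    """
    for attempt in range(retries + 1):
        try:
            return fn(*args, **kwargs)
        except Exception as e:
            print(f"{fn.__name__} failed (attempt {attempt + 1}): {e}")
            if attempt < retries:
                time.sleep(delay)   # brief pause before retrying
    return None  # caller decides whether missing data is fatal
```

Usage would look like `weather = call_tool_safely(get_weather_summary, lat, lng)`, keeping the orchestrator free of per-tool error plumbing.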

Advanced Pattern: ReAct (Reasoning + Acting)

The hiking agent uses a fixed sequence of tool calls. For more complex agents where the sequence of tools isn't known in advance, use the ReAct pattern: the model reasons about what action to take next at each step.

def react_agent(task: str, tools: dict, max_iterations: int = 10) -> str:
    """
    ReAct pattern: Let the model decide which tool to use at each step.

    tools: dict of {tool_name: callable}
    """
    history = []
    iterations = 0

    while iterations < max_iterations:
        iterations += 1

        # Give the model current context and available tools
        tool_descriptions = "\n".join([
            f"- {name}: {fn.__doc__}" for name, fn in tools.items()
        ])

        prompt = f"""Task: {task}

Available tools:
{tool_descriptions}

Previous steps:
{chr(10).join(history) if history else "None yet."}

What is your next action? Format your response as:
THOUGHT: [your reasoning]
ACTION: [tool_name OR "ANSWER"]
INPUT: [tool input OR final answer]"""

        response = chat(
            system_prompt="You are an AI agent. Reason step by step. Use tools to gather information.",
            user_prompt=prompt,
            remember=False
        )

        # Parse the response
        lines = response.strip().split('\n')
        action_line = next((l for l in lines if l.startswith('ACTION:')), None)
        input_line = next((l for l in lines if l.startswith('INPUT:')), None)

        if not action_line or not input_line:
            break

        action = action_line.replace('ACTION:', '').strip()
        tool_input = input_line.replace('INPUT:', '').strip()

        # If the model is done, return the answer
        if action.upper() == "ANSWER":
            return tool_input

        # Execute the tool
        if action in tools:
            result = tools[action](tool_input)
            history.append(f"Used {action}({tool_input}) → {result}")
        else:
            history.append(f"Unknown tool: {action}")

    return "Max iterations reached. Final state: " + str(history)
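The fragile part of ReAct is parsing the model's THOUGHT/ACTION/INPUT reply, so it's worth pulling into a testable helper. This is a hypothetical refactor mirroring the parsing inside react_agent above:

```python
def parse_react_reply(reply: str):
    """Extract (action, tool_input) from a THOUGHT/ACTION/INPUT reply.

    Returns (None, None) when the model drifts off-format, so the caller
    can bail out of the loop. Hypothetical helper mirroring react_agent.
    """
    action, tool_input = None, None
    for line in reply.strip().splitlines():
        if line.startswith("ACTION:"):
            action = line[len("ACTION:"):].strip()
        elif line.startswith("INPUT:"):
            tool_input = line[len("INPUT:"):].strip()
    if action is None or tool_input is None:
        return None, None
    return action, tool_input
```

Small local models drift off-format more often than hosted ones, so expect to handle the `(None, None)` case; some people add one "reply in the exact format" retry before giving up.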

Performance Benchmarks

Running on a MacBook Pro M3 with Llama 3.2 (3B parameters via Ollama):

Location detection: <1 second
Weather fetch: 1–2 seconds
Weather reflection (LLM): 2–3 seconds
Parks search + trails: 3–5 seconds
Park selection reflection: 3–6 seconds
Final recommendations: 5–12 seconds
Total (full run): ~15–30 seconds
Park filtering improvement: ~50% faster final LLM call (8 parks vs 25)

Common Mistakes

Mistake | Why It's Bad | Fix
No error handling in tools | One API failure crashes the whole agent | Wrap every external call in try/except; return None on failure
No conversation memory | Follow-up questions don't make sense | Maintain conversation_history, append each turn
Dumping all data to the LLM | Context overflow, unfocused output | Add a reflection step to filter/reduce before the final LLM call
Vague system prompts | Inconsistent, unfocused responses | "Respond ONLY with yes or no" beats "tell me if hiking is good"
Skipping the Hello World step | Start too complex, hard to debug | Build: Hello World → one tool → two tools → loop → reflection
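The conversation-memory fix is a few lines of bookkeeping. Here's a sketch mirroring the conversation_history pattern from main.py, with one addition the tutorial omits: trimming old turns so the context stays bounded (the max_turns value is an arbitrary assumption of mine):

```python
history = []

def remember_turn(user_msg: str, assistant_msg: str, max_turns: int = 20):
    """Append one exchange and trim old turns to keep the context bounded.

    Sketch of the conversation_history pattern; max_turns is an assumption,
    not a value from the tutorial.
    """
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": assistant_msg})
    del history[:-2 * max_turns]   # keep the most recent max_turns exchanges
```

Without the trim, a long follow-up session eventually overflows a small local model's context window, which shows up as the agent "forgetting" the earliest recommendations.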

Key Takeaways

The 4-step loop is universal. Intent → Plan → Tools → Reflect. Every agent you'll ever build follows this pattern. The sophistication comes from how you design each step, not from the model.
Reflection is not optional. Without it, you have a script, not an agent. The difference between a useful agent and a brittle one is whether it evaluates its own output and makes quality judgments.
Local LLMs are genuinely capable for agents. A 3B–7B parameter model running on consumer hardware is sufficient for most agentic tasks — location, weather, search, analysis. You don't need GPT-4 for everything.
Plain English beats raw JSON for LLM reasoning. Convert tool output to natural language before feeding it to the model. "Partly cloudy, 13°C, low precipitation" outperforms a weather JSON blob for generating coherent recommendations.
Build incrementally. Hello World → one tool → two tools → loop → reflection → memory. Each step is testable. Never skip to autonomous behavior before the simpler steps work reliably.

What's Next in the Series

Build an Autonomous Research Agent with the Claude Agent SDK
The hiking agent uses a fixed tool sequence. The next step is a truly autonomous agent — one that searches the web for research papers, downloads PDFs, reads them, and answers questions conversationally, all with zero human intervention after the initial prompt. Built with the Anthropic SDK and real tool use.
✦ Real web search via Tavily API
✦ PDF download and analysis
✦ Conversational memory across turns
✦ Streamlit production UI

Mohamed Hamed

20 years building production systems — the last several deep in AI integration, LLMs, and full-stack architecture. I write what I've actually built and broken. If this was useful, the next one goes to LinkedIn first.

Follow on LinkedIn →