From Data Engineer to AI Engineer — Part 5: AI Agents — Making AI Take Actions

Series: From Data/Software Engineer to AI Engineer — Part 5 of 7 (← Part 4: RAG)


The Limitation of RAG

RAG answers questions. It is brilliant at "what does our policy say about X?"

But what if the question is: "Check our inventory system, find products below reorder threshold, and draft purchase orders for the three most critical ones"?

That requires:

  1. Calling an inventory API
  2. Analysing the results
  3. Making a decision (which are most critical?)
  4. Generating documents

A single LLM call cannot do this. You need an agent.


What an Agent Actually Is

An agent is just an LLM in a loop.

Goal: "Find products below reorder threshold and draft purchase orders"
Loop:
[Think] What do I need to do first?
→ I need to check the inventory system
[Act] call_tool("get_inventory", {"threshold": "reorder"})
[Observe] {"items": [{"sku": "A123", "stock": 5, "reorder_at": 10}, ...]}
[Think] I have the inventory data. Which are most critical?
→ Sort by (reorder_at - stock) descending: SKU A123, B456, C789
[Act] call_tool("get_supplier_info", {"skus": ["A123", "B456", "C789"]})
[Observe] {"A123": {"supplier": "Acme", "lead_time": "3 days"}, ...}
[Think] Now I have all information to draft the orders
[Act] draft_purchase_orders(...)
[Observe] "3 purchase orders created"
[Think] Goal is complete
[Final Answer] "I've identified 3 products below reorder threshold and created purchase orders..."

This Think → Act → Observe loop is called ReAct (Reasoning + Acting). It is the foundation of every production agent.
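The "sort by shortfall" step in the trace is simple enough to pin down in code. A minimal sketch of that heuristic (the helper name is illustrative; the fields mirror the trace):

```python
def rank_by_criticality(items: list[dict]) -> list[dict]:
    """Rank low-stock items by shortfall (reorder_at - stock), largest first."""
    return sorted(items, key=lambda i: i["reorder_at"] - i["stock"], reverse=True)

items = [
    {"sku": "C789", "stock": 9, "reorder_at": 12},   # shortfall 3
    {"sku": "A123", "stock": 5, "reorder_at": 10},   # shortfall 5
]
print([i["sku"] for i in rank_by_criticality(items)])  # → ['A123', 'C789']
```

The agent does not need to "reason" about arithmetic like this; giving it a deterministic tool for the ranking is usually more reliable than asking the model to sort.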


Tool Calling: The Key Mechanism

Tools are functions that the agent can call. The LLM decides:

  • Which tool to call
  • With what arguments
  • Whether to call another tool based on the result

import anthropic
import json

client = anthropic.Anthropic()

# ── Define your tools ─────────────────────────────────────
# Claude uses a JSON schema to understand what each tool does and expects
tools = [
    {
        "name": "get_stock_level",
        "description": "Get the current stock level for a product SKU from the inventory system",
        "input_schema": {
            "type": "object",
            "properties": {
                "sku": {
                    "type": "string",
                    "description": "The product SKU code, e.g. 'MNS-SHIRT-001'"
                }
            },
            "required": ["sku"]
        }
    },
    {
        "name": "search_products",
        "description": "Search for products by name or category",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "category": {"type": "string", "description": "Optional category filter"}
            },
            "required": ["query"]
        }
    }
]

# ── Implement your tools ──────────────────────────────────
# These are just normal Python functions — the agent calls them via JSON
def get_stock_level(sku: str) -> dict:
    # In production: call your actual inventory API
    fake_inventory = {
        "MNS-SHIRT-001": {"stock": 45, "reorder_at": 20, "location": "Warehouse A"},
        "MNS-JEANS-002": {"stock": 8, "reorder_at": 25, "location": "Warehouse B"},
    }
    return fake_inventory.get(sku, {"error": f"SKU {sku} not found"})

def search_products(query: str, category: str | None = None) -> list:
    # In production: query your product database
    return [
        {"sku": "MNS-SHIRT-001", "name": "Oxford Cotton Shirt", "category": "menswear"},
        {"sku": "MNS-JEANS-002", "name": "Slim Fit Jeans", "category": "menswear"},
    ]

# Tool dispatcher — maps tool names to functions
TOOL_MAP = {
    "get_stock_level": get_stock_level,
    "search_products": search_products,
}

def execute_tool(tool_name: str, tool_input: dict) -> str:
    if tool_name not in TOOL_MAP:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    result = TOOL_MAP[tool_name](**tool_input)
    return json.dumps(result)

Building the Agent Loop

def run_agent(user_message: str, max_iterations: int = 10) -> str:
    """
    The core agent loop: Think → Act → Observe → repeat until done.
    max_iterations prevents infinite loops.
    """
    messages = [{"role": "user", "content": user_message}]

    for iteration in range(max_iterations):
        print(f"\n── Iteration {iteration + 1} ──")

        # ── Think: ask the LLM what to do next ──────────
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages,
            system="""
            You are a helpful retail operations assistant.
            Use the available tools to answer questions accurately.
            Only use information from tool results — do not guess.
            """
        )
        print(f"Stop reason: {response.stop_reason}")

        # ── Check if agent is done ────────────────────────
        if response.stop_reason == "end_turn":
            # Agent has finished — return its final answer
            final_text = next(
                block.text for block in response.content
                if hasattr(block, "text")
            )
            print(f"Final answer: {final_text}")
            return final_text

        # ── Act: agent wants to use tools ─────────────────
        if response.stop_reason == "tool_use":
            # Add assistant's response (including tool calls) to history
            messages.append({"role": "assistant", "content": response.content})

            # Execute each tool the agent called
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    print(f"Calling tool: {block.name}({block.input})")
                    # ── Observe: run the tool and capture result ──
                    result = execute_tool(block.name, block.input)
                    print(f"Tool result: {result}")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })

            # Feed tool results back to the agent
            messages.append({"role": "user", "content": tool_results})

    return "Agent reached maximum iterations without completing the task."

# ── Run it ────────────────────────────────────────────────
result = run_agent("What is the stock level for MNS-JEANS-002 and is it below reorder threshold?")

Output:

── Iteration 1 ──
Stop reason: tool_use
Calling tool: get_stock_level({'sku': 'MNS-JEANS-002'})
Tool result: {"stock": 8, "reorder_at": 25, "location": "Warehouse B"}
── Iteration 2 ──
Stop reason: end_turn
Final answer: MNS-JEANS-002 (Slim Fit Jeans) currently has 8 units in stock at
Warehouse B. This is below the reorder threshold of 25 units — it should be
reordered as soon as possible.

Real-World Tool Patterns

Pattern 1: API Call Tool

import httpx

def get_weather(city: str, country_code: str = "GB") -> dict:
    """Call a real weather API."""
    response = httpx.get(
        "https://api.openweathermap.org/data/2.5/weather",
        params={"q": f"{city},{country_code}", "appid": "YOUR_KEY", "units": "metric"}
    )
    response.raise_for_status()  # fail loudly on HTTP errors instead of KeyError-ing below
    data = response.json()
    return {
        "city": city,
        "temperature": data["main"]["temp"],
        "description": data["weather"][0]["description"]
    }

Pattern 2: Database Query Tool

import psycopg2

def query_database(sql: str) -> list[dict]:
    """
    Allow the agent to query a database.
    CRITICAL: Never pass raw user input as SQL — always use parameterised queries
    or restrict to a whitelist of allowed queries.
    """
    # Safety: only allow SELECT statements
    if not sql.strip().upper().startswith("SELECT"):
        return [{"error": "Only SELECT queries are permitted"}]
    conn = psycopg2.connect("your_connection_string")
    try:
        with conn.cursor() as cursor:
            cursor.execute(sql)
            columns = [desc[0] for desc in cursor.description]
            rows = cursor.fetchall()
        return [dict(zip(columns, row)) for row in rows]
    finally:
        conn.close()  # always release the connection, even if the query fails
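The docstring's stricter option, a whitelist of allowed queries, can be sketched as a map of named, parameterised statements: the agent picks a name and supplies values, and never writes SQL itself (the query names and SQL below are illustrative):

```python
# Whitelist of named queries. The agent chooses a name; it never writes SQL.
ALLOWED_QUERIES = {
    "low_stock": "SELECT sku, stock FROM inventory WHERE stock < reorder_at",
    "stock_by_sku": "SELECT sku, stock FROM inventory WHERE sku = %s",
}

def build_query(name: str, params: tuple = ()) -> tuple[str, tuple]:
    """Resolve a query name to (sql, params), rejecting anything off the whitelist."""
    if name not in ALLOWED_QUERIES:
        raise ValueError(f"Query '{name}' is not on the whitelist")
    return ALLOWED_QUERIES[name], params
```

Usage would be `cursor.execute(*build_query("stock_by_sku", ("MNS-JEANS-002",)))`, which keeps user input confined to the parameter slot.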

Pattern 3: RAG Tool (Agent + RAG Combined)

def search_knowledge_base(question: str) -> str:
    """
    Allow the agent to query your RAG system as a tool.
    This combines agents with RAG — the agent decides when to search.
    """
    result = rag_query(question)  # from Part 4
    return f"Answer: {result['answer']}\nSources: {', '.join(result['sources'])}"

Pattern 4: Code Execution Tool

import subprocess

def run_python(code: str) -> str:
    """
    Allow the agent to run Python code.
    DANGER: Sandbox this properly in production — never run arbitrary code unsandboxed.
    """
    try:
        result = subprocess.run(
            ["python", "-c", code],
            capture_output=True,
            text=True,
            timeout=10  # prevent runaway execution
        )
    except subprocess.TimeoutExpired:
        return "Error: code execution timed out after 10 seconds"
    return result.stdout or result.stderr

Multi-Agent Systems

For complex tasks, you split work across specialist agents coordinated by an orchestrator agent.

# Orchestrator: decides which specialist to use
# Specialist agents: each has specific tools and expertise
orchestrator_system = """
You are an operations orchestrator. Based on the user's request,
determine which specialist to route to:
- "inventory_agent": for stock levels, orders, warehouse questions
- "customer_agent": for customer accounts, orders, complaints
- "analytics_agent": for reports, trends, performance data
Respond with JSON: {"route_to": "agent_name", "task": "refined task description"}
"""

def route_request(user_request: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=100,
        system=orchestrator_system,
        messages=[{"role": "user", "content": user_request}]
    )
    routing = json.loads(response.content[0].text)

    # Route to the appropriate specialist agent
    specialist = routing["route_to"]
    task = routing["task"]
    if specialist == "inventory_agent":
        return run_agent(task)  # with inventory tools
    elif specialist == "customer_agent":
        return run_customer_agent(task)  # with customer tools
    # ...
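One practical wrinkle with this router: `json.loads(response.content[0].text)` throws if the model wraps its JSON in prose. A small defensive parser (a sketch, not part of any SDK) is cheap insurance:

```python
import json
import re

def parse_routing(text: str) -> dict:
    """Extract the first {...} JSON object from model output, tolerating surrounding prose."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        raise ValueError(f"No JSON object found in: {text!r}")
    return json.loads(match.group(0))
```

Constraining the orchestrator's output with a tool schema (as in the agent loop above) is another way to get the same guarantee.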

This is the pattern behind Sparky AI — an orchestrator understands the business user's intent (who should receive this promotion?) and routes to specialist agents that query audience data, apply business rules, and validate outputs.


Guardrails: The Non-Negotiable

Agents are powerful. That power requires guardrails.

class SafeAgent:
    def __init__(self, max_iterations=10, max_cost_usd=0.50):
        self.max_iterations = max_iterations
        self.max_cost_usd = max_cost_usd
        self.total_cost = 0.0  # running spend in USD, checked on every call

    def run(self, user_message: str) -> str:
        # Input validation
        if len(user_message) > 2000:
            return "Request too long. Please be more concise."

        # Dangerous action detection
        dangerous_keywords = ["delete all", "drop table", "rm -rf", "format"]
        if any(kw in user_message.lower() for kw in dangerous_keywords):
            return "This request requires human approval before proceeding."

        return self._run_loop(user_message)

    def _check_budget(self, input_tokens: int, output_tokens: int):
        cost = (input_tokens * 0.000003) + (output_tokens * 0.000015)
        self.total_cost += cost
        if self.total_cost > self.max_cost_usd:
            raise Exception(f"Cost limit exceeded: ${self.total_cost:.4f}")

# Require human approval for irreversible actions
IRREVERSIBLE_TOOLS = {"delete_order", "cancel_shipment", "send_email"}

def execute_tool_safe(tool_name: str, tool_input: dict) -> str:
    if tool_name in IRREVERSIBLE_TOOLS:
        # In production: send approval request to Slack/email
        print(f"⚠️ APPROVAL REQUIRED: {tool_name}({tool_input})")
        approval = input("Approve? (yes/no): ")
        if approval.lower() != "yes":
            return json.dumps({"status": "cancelled", "reason": "user declined"})
    return execute_tool(tool_name, tool_input)
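The `input()` prompt above only works at a terminal. For a service, a variant that accepts an approver callback is easier to wire to Slack or a ticket queue, and to unit test (the callback signature is an assumption, and the executed branch stands in for the real `execute_tool` call):

```python
import json
from typing import Callable

IRREVERSIBLE_TOOLS = {"delete_order", "cancel_shipment", "send_email"}

def execute_tool_gated(tool_name: str, tool_input: dict,
                       approve: Callable[[str, dict], bool]) -> str:
    """Run a tool, routing irreversible ones through an approval callback first."""
    if tool_name in IRREVERSIBLE_TOOLS and not approve(tool_name, tool_input):
        return json.dumps({"status": "cancelled", "reason": "approval denied"})
    # Stand-in for execute_tool(tool_name, tool_input) from earlier
    return json.dumps({"status": "executed", "tool": tool_name})
```

Non-irreversible tools skip the gate entirely, so the happy path stays fast.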

When to Use Agents (and When Not To)

This is the judgement call that separates senior AI engineers from the rest.

Use an agent when:

  • The task requires multiple steps with decision-making between them
  • You need to call external APIs or databases
  • The path to the answer is not known in advance
  • Different inputs require different tool sequences

Do NOT use an agent when:

  • A single LLM call or RAG query is sufficient (YAGNI)
  • The task is deterministic — write a Python function instead
  • Latency matters — agents are 3–10x slower than single calls
  • The action is irreversible and high-stakes — agents can make mistakes

Simple question → Single LLM call (Part 3)
Knowledge question → RAG (Part 4)
Multi-step action → Agent (Part 5)

The most common mistake is reaching for agents when a structured prompt chain (Part 3) would do the job with 10x less complexity and cost.
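For contrast, a structured prompt chain hard-codes the sequence of steps: no loop, no tool selection, predictable cost. A sketch with a stand-in `llm` callable (an assumption; swap in a real model call):

```python
def summarise_then_classify(text: str, llm) -> dict:
    """A fixed two-step chain: the control flow is plain Python, not model-decided."""
    summary = llm(f"Summarise in one sentence: {text}")
    label = llm(f"Classify as 'inventory', 'customer' or 'other': {summary}")
    return {"summary": summary, "label": label}
```

Because the sequence is fixed, you pay for exactly two calls and can reason about latency up front, which is the trade an agent gives away.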


Frameworks at a Glance

Framework                        Best for                           Maturity
Raw API loop (what we built)     Learning, maximum control          Stable
LangChain Agents                 General purpose, huge ecosystem    Stable
LangGraph                        Complex stateful workflows         Growing
Databricks AI Agents             Databricks-native, Unity Catalog   Databricks
AutoGen                          Multi-agent conversations          Microsoft
CrewAI                           Role-based agent teams             Growing

Start with the raw loop. Once you understand it, any framework is just an abstraction on top of what you already know.


Summary

Concept        What it is
Agent          LLM + tools + loop until goal achieved
Tool           A Python function the agent can call
ReAct          Think → Act → Observe loop pattern
Multi-agent    Orchestrator + specialist agents
Guardrails     Budget limits, input validation, human approval

The agent pattern unlocks the full power of AI for operational tasks. But with that power comes responsibility: every tool call can have real-world consequences. Build in guardrails from day one.


Next: Part 6 — Production AI: From Notebook to Live System

In Part 6, we take your AI system into production — deployment, monitoring, drift detection, and the operational patterns that keep it reliable.