Build Your First AI Agent Tool in 10 Lines of Python

Someone I follow spent three hours on a live stream breaking this concept down. After the stream, I scrolled the comments: “Still confused.” “When do we use BaseTool vs @tool?” “My agent keeps calling the wrong function.” Three hours. Still confused.

Here’s what I’ve learned teaching hundreds of developers over the past two years: the confusion almost never comes from the code. It comes from the mental model. Get the mental model right, and the code writes itself in ten lines. Everything else is just error handling and polish.

Let me give you that mental model first. Then we’ll build something that would actually impress a hiring manager.

What a Tool Actually Is (And Why This Matters)

Ask most developers what an AI agent tool is, and they’ll say “a function the AI can call.” Technically correct. Completely useless as a mental model.

Here’s the better version: a tool is a function with a resume.

Your Python function has code. The tool has code, a name, a description, and a type signature. The LLM never reads your code. It reads the resume. The name, the docstring, the parameter types. That’s everything it gets. Based on that, it decides whether to call the function, when to call it, and with what arguments.

This changes everything about how you write tools. You stop thinking “how do I expose this function?” and start thinking “how do I describe this function so an LLM knows exactly when to use it?”

Your First Tool: Stock Price Lookup

Enough abstract conceptual discussion, and no multiply-two-numbers demo. Let’s build something real. We’ll use yfinance, an unofficial Python library for Yahoo Finance data. It’s free, it requires no API key, and it pulls live market data.

Before we start: everything in this post goes into a single Python file. We’ll build it piece by piece so I can explain each part, but at the end you’ll have one complete script you can copy, save as agent.py, and run. I’ve included the full combined version at the bottom if you want to skip ahead.

Install what you need:

pip install langchain langchain-openai yfinance

Create a new file called agent.py and start adding the imports and your first tool:

from langchain_core.tools import tool
import yfinance as yf

@tool
def get_stock_price(ticker: str) -> str:
    """Get the current stock price for a given ticker symbol.

    Use this when a user asks about the current price, value, or quote
    for a publicly traded company. Accepts standard ticker symbols like
    AAPL (Apple), MSFT (Microsoft), TSLA (Tesla), NVDA (Nvidia).
    Returns the current price in USD.
    """
    try:
        stock = yf.Ticker(ticker.upper())
        info = stock.info
        price = info.get("currentPrice") or info.get("regularMarketPrice", 0)
        name = info.get("longName", ticker)
        return f"{name} ({ticker.upper()}): ${price:.2f}"
    except Exception as e:
        return f"Could not fetch price for {ticker}: {str(e)}"

That’s it. Ten lines of code, plus the docstring.

Notice that the docstring isn’t just documentation for you. It’s the tool’s entire communication to the LLM. When an agent has ten tools available and someone asks “what’s Apple trading at?”, the LLM reads all ten docstrings and picks this one. Write a vague docstring and the agent picks the wrong tool, or picks nothing. Write a clear one and it works like magic.
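
Don’t take my word for it. The decorated function carries its resume around with it, and you can print it. A quick optional sanity check, using the tool we just defined (exact output varies a bit by LangChain version):

# Peek at what the LLM actually receives
print(get_stock_price.name)         # "get_stock_price"
print(get_stock_price.description)  # the docstring text, which is what the model reads
print(get_stock_price.args)         # roughly {"ticker": {"title": "Ticker", "type": "string"}}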

Adding a Second Tool (This Is Where It Gets Interesting)

One tool alone isn’t much of an agent. Add a second tool right below the first one in the same agent.py file:

@tool
def get_company_summary(ticker: str) -> str:
    """Get a business description and overview for a company by ticker symbol.

    Use this when a user wants to know what a company does, its industry,
    business model, or general background. NOT for stock prices. Use
    get_stock_price for current price data. Accepts ticker symbols like
    AAPL, MSFT, AMZN, NVDA.
    """
    try:
        stock = yf.Ticker(ticker.upper())
        info = stock.info
        name = info.get("longName", ticker)
        summary = info.get("longBusinessSummary", "No description available.")
        sector = info.get("sector", "Unknown sector")
        industry = info.get("industry", "Unknown industry")
        return f"{name} | {sector} > {industry}\n\n{summary[:500]}..."
    except Exception as e:
        return f"Could not fetch company info for {ticker}: {str(e)}"

See what I did in that docstring? “NOT for stock prices. Use get_stock_price for current price data.” You’re giving the LLM explicit routing instructions. When tools have overlapping territory, you need that disambiguation right inside the description. Most tutorials skip this entirely, then wonder why the agent behaves unpredictably in production.

The Moment Everything Clicks

Now, still in the same agent.py file, add the code that wires these tools to an LLM and creates the agent loop:

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, ToolMessage
import os

os.environ["OPENAI_API_KEY"] = "your-key-here"

# Gather tools and wire them to the LLM
tools = [get_stock_price, get_company_summary]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
llm_with_tools = llm.bind_tools(tools)

def run_agent(question: str) -> str:
    messages = [HumanMessage(question)]

    # First pass: LLM reads the question and decides which tool to call
    response = llm_with_tools.invoke(messages)

    # If no tools were needed, return the direct answer
    if not response.tool_calls:
        return response.content

    # Execute each tool the LLM chose
    messages.append(response)
    tool_map = {t.name: t for t in tools}

    for call in response.tool_calls:
        result = tool_map[call["name"]].invoke(call["args"])
        messages.append(ToolMessage(result, tool_call_id=call["id"]))

    # Second pass: LLM turns tool results into a human-readable answer
    return llm_with_tools.invoke(messages).content

Now test it:

print(run_agent("What's Microsoft's stock price right now?"))
# Routes to: get_stock_price("MSFT")

print(run_agent("What does Nvidia actually do as a company?"))
# Routes to: get_company_summary("NVDA")

print(run_agent("Is Apple's stock expensive for what the company actually does?"))
# Routes to: BOTH tools, then synthesizes an answer

That last one is the one I want you to sit with. The question doesn’t explicitly ask for a price or a description. The LLM reads the question, determines it needs both the financial data and the business context, calls both tools in sequence, and then writes a coherent answer combining both results.

You didn’t write any routing logic. No if/else for which tool to call. You described the tools clearly, and the model figured out the rest.

That’s the mental model shift. You’re not programming the decisions. You’re creating the options and describing them well enough that the LLM can make decisions itself.
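
If you want to watch that decision happen before any tool runs, you can inspect the tool calls directly. A small debugging sketch, nothing the agent needs to work, and the exact output will vary:

# Peek at the routing decision without executing any tools
response = llm_with_tools.invoke([HumanMessage("Is Apple's stock expensive for what the company actually does?")])
for call in response.tool_calls:
    print(call["name"], call["args"])
# Prints something like:
# get_stock_price {'ticker': 'AAPL'}
# get_company_summary {'ticker': 'AAPL'}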

What Most AI Agent Tool Tutorials Get Wrong

I’ve read a lot of these. A few patterns bother me.

The first is toy examples. Calculating factorial, converting Celsius to Fahrenheit. Those are fine for testing whether the decorator works. They’re useless for understanding why any of this is valuable, because the LLM already knows the answer without calling a tool. You never see the decision-making in action. You just see a function get called.

The second is skipping error handling entirely. Your tool will receive bad input. The ticker symbol will be misspelled. The API will time out. The data will be missing. If your tool throws an uncaught exception, the agent crashes. Returning a clean error string, as in the try/except blocks above, lets the agent recover gracefully. It might even retry with a corrected ticker, ask the user to clarify, or move on to a different approach. That flexibility matters enormously in production.
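
One pattern I like for this, shown here as a sketch rather than something the agent above requires, is validating input at the top of the tool and returning a message the LLM can act on instead of raising (the helper name is mine, not part of any library):

def _validate_ticker(ticker: str) -> str | None:
    """Return an error message the LLM can act on, or None if the ticker looks plausible."""
    cleaned = ticker.strip().upper()
    if not cleaned.isalpha() or not (1 <= len(cleaned) <= 5):
        return f"'{ticker}' doesn't look like a valid ticker symbol. Please ask the user to confirm it."
    return None

# At the top of get_stock_price's body, before calling yfinance:
# if (err := _validate_ticker(ticker)) is not None:
#     return err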

The third, and most frustrating, is skipping the docstring conversation. Tutorials show the decorator working but don’t explain that the docstring is the contract between you and the model. Every word affects tool selection behavior. “Get the current stock price” is a worse description than “Get the current stock price. Use when a user asks for a stock quote, current value, or how much a share costs.” The second version handles more edge cases in how real users phrase real questions.

The Pydantic Upgrade (When You Need It)

For simple tools with one or two string parameters, the @tool decorator is everything you need. When you’re dealing with complex inputs, date ranges, multiple optional filters, or nested data, Pydantic validation adds structure that helps both you and the model:

from pydantic import BaseModel, Field

class StockHistoryInput(BaseModel):
    ticker: str = Field(description="Stock ticker symbol, e.g. AAPL, MSFT, NVDA")
    period: str = Field(
        default="1mo",
        description="Time period: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, ytd, max"
    )

@tool(args_schema=StockHistoryInput)
def get_stock_history(ticker: str, period: str = "1mo") -> str:
    """Get historical stock price data for trend analysis.

    Use when a user asks about stock performance over time, historical
    highs and lows, or how a stock has moved over a specific period.
    """
    stock = yf.Ticker(ticker.upper())
    hist = stock.history(period=period)

    if hist.empty:
        return f"No historical data found for {ticker}"

    start_price = hist["Close"].iloc[0]
    end_price = hist["Close"].iloc[-1]
    change_pct = ((end_price - start_price) / start_price) * 100
    high = hist["High"].max()
    low = hist["Low"].min()

    return (
        f"{ticker.upper()} over {period}: "
        f"${start_price:.2f} to ${end_price:.2f} ({change_pct:+.1f}%) | "
        f"High: ${high:.2f} | Low: ${low:.2f}"
    )

Note: This Pydantic version is an optional upgrade. You don’t need it for the basic agent we’re building. If you want to add it, drop it into the same agent.py file alongside the other tools and include it in the tools list. For your first run, the two tools above are plenty.

The Pydantic schema does two things. It gives the LLM richer parameter guidance, because the field descriptions end up in the schema the model sees. And it validates inputs before your function body ever runs. If the model sends the wrong type, or a value your schema doesn’t allow, Pydantic catches it before you hit the API. That’s one less failure mode you have to debug at 11 PM.
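
If you want the period check to be strict rather than just well documented, one optional tweak (not needed for anything above) is to constrain the field with a Literal type, so Pydantic rejects anything outside the list and the model sees it as an enum in the schema:

from typing import Literal
from pydantic import BaseModel, Field

class StockHistoryInput(BaseModel):
    ticker: str = Field(description="Stock ticker symbol, e.g. AAPL, MSFT, NVDA")
    period: Literal["1d", "5d", "1mo", "3mo", "6mo", "1y", "2y", "5y", "ytd", "max"] = Field(
        default="1mo",
        description="Time period for the price history"
    )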

Now Scale It Up: The Enterprise Version

Here’s where my background kicks in. Thirty years in enterprise IT across Fortune 100 companies. Data platforms that handle millions of records. Systems entire divisions depend on daily.

The tools above connect to Yahoo Finance. Interesting, and free. But the same pattern, almost unchanged, connects to your company’s Oracle data warehouse:

@tool
def query_sales_pipeline(region: str, quarter: str) -> str:
    """Query the enterprise sales pipeline for a specific region and quarter.

    Use when users ask about sales performance, deal counts, pipeline value,
    or revenue forecast for a specific business region (AMER, EMEA, APAC)
    and fiscal quarter (e.g. 'Q1 2025', 'Q4 2024').
    """
    conn = get_db_connection()  # your actual connection pool
    result = conn.execute(
        "SELECT region, quarter, SUM(deal_value), COUNT(*) "
        "FROM sales_pipeline WHERE region = ? AND quarter = ?",
        (region, quarter)
    ).fetchone()

    return f"Pipeline for {region} {quarter}: ${result[2]:,.0f} across {result[3]} deals"

Same decorator. Same pattern. Now your agent can answer “How’s the APAC pipeline looking for Q1 2025?” by querying your actual data warehouse. Extend this to Snowflake, BigQuery, your internal REST APIs, your Salesforce CRM, your ServiceNow tickets, and suddenly you’re not building demos anymore.

That’s the thing about this pattern. It doesn’t care what’s on the other side of the function. Stock data, SQL queries, REST calls, file system reads. The interface between the LLM and the outside world is always the same: a clear description, a clean input schema, a useful return value.

A Note on Production Reality

What’s above is enough to get something working today. Before you put it in front of actual users, add these three things.

Timeouts. External API calls fail. Wrap them with a timeout so your agent doesn’t hang indefinitely waiting for a response that isn’t coming.
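
For tools that call HTTP APIs through requests, that’s usually a one-argument change. A minimal sketch, assuming a hypothetical internal endpoint (the URL and response fields here are made up for illustration):

import requests

@tool
def get_fx_rate(pair: str) -> str:
    """Get the current exchange rate for a currency pair like 'EURUSD'."""
    try:
        # timeout=(connect, read) in seconds; the call raises instead of hanging forever
        resp = requests.get(
            f"https://internal-api.example.com/fx/{pair}",  # hypothetical endpoint
            timeout=(3, 10),
        )
        resp.raise_for_status()
        return f"{pair}: {resp.json()['rate']}"
    except requests.exceptions.Timeout:
        return f"Rate lookup for {pair} timed out. Try again in a moment."
    except Exception as e:
        return f"Could not fetch the rate for {pair}: {e}"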

Logging. Log every tool call: what was requested, what was returned, how long it took. This is how you debug agent behavior in production. LangSmith, LangChain’s observability platform, handles this automatically if you’re already in that ecosystem.
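
If you’re not in that ecosystem, a small wrapper around the tool invocation in run_agent gets you most of the value. A minimal sketch using only the standard library:

import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent.tools")

def invoke_tool_logged(tool_obj, name: str, args: dict) -> str:
    """Invoke a tool and log what was requested, how big the result was, and how long it took."""
    start = time.perf_counter()
    result = tool_obj.invoke(args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    logger.info("tool=%s args=%s result_chars=%d duration_ms=%.0f",
                name, args, len(str(result)), elapsed_ms)
    return result

# Inside the tool loop, instead of tool_map[call["name"]].invoke(call["args"]):
# result = invoke_tool_logged(tool_map[call["name"]], call["name"], call["args"])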

Rate limiting. If the LLM decides to call your data warehouse tool 50 times in a complex workflow, your infrastructure team will find you. A simple counter or token bucket at the tool level prevents the worst scenarios.
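
Even a simple sliding-window counter at the tool layer goes a long way. A sketch, with the limits as placeholders you’d tune for your own infrastructure:

import time

class ToolRateLimiter:
    """Allow at most max_calls tool invocations per window_seconds."""
    def __init__(self, max_calls: int = 20, window_seconds: float = 60.0):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.calls: list[float] = []

    def allow(self) -> bool:
        now = time.time()
        # Drop timestamps that have aged out of the window, then check the budget
        self.calls = [t for t in self.calls if now - t < self.window_seconds]
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True

limiter = ToolRateLimiter(max_calls=20, window_seconds=60)

# Inside the tool loop, before invoking the tool:
# if not limiter.allow():
#     messages.append(ToolMessage("Rate limit reached, skipping this call.", tool_call_id=call["id"]))
#     continue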

The BaseTool class, rather than the @tool decorator, gives you more control over initialization and async behavior. That matters when your tools connect to connection pools or shared resources. For what we’ve built here, the decorator is exactly right.

The Full Working Example

Here’s everything end to end, ready to run:

# pip install langchain langchain-openai yfinance
import os
import yfinance as yf
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, ToolMessage

os.environ["OPENAI_API_KEY"] = "your-key-here"

@tool
def get_stock_price(ticker: str) -> str:
    """Get the current stock price for a publicly traded company.
    Use for questions about current price, value, or quote.
    Examples: AAPL, MSFT, TSLA, NVDA, AMZN.
    """
    try:
        info = yf.Ticker(ticker.upper()).info
        price = info.get("currentPrice") or info.get("regularMarketPrice", 0)
        return f"{info.get('longName', ticker)}: ${price:.2f}"
    except Exception as e:
        return f"Error fetching {ticker}: {e}"

@tool
def get_company_summary(ticker: str) -> str:
    """Get a business description for a company. What they do, their sector,
    their industry. NOT for price information. Use ticker like AAPL, MSFT.
    """
    try:
        info = yf.Ticker(ticker.upper()).info
        summary = info.get("longBusinessSummary", "No description available.")
        return f"{info.get('longName')} | {info.get('sector')} | {summary[:400]}..."
    except Exception as e:
        return f"Error fetching info for {ticker}: {e}"

tools = [get_stock_price, get_company_summary]
llm_with_tools = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools(tools)
tool_map = {t.name: t for t in tools}

def ask(question: str) -> str:
    msgs = [HumanMessage(question)]
    response = llm_with_tools.invoke(msgs)
    if not response.tool_calls:
        return response.content
    msgs.append(response)
    for call in response.tool_calls:
        result = tool_map[call["name"]].invoke(call["args"])
        msgs.append(ToolMessage(result, tool_call_id=call["id"]))
    return llm_with_tools.invoke(msgs).content

# Ask anything. The agent routes automatically.
print(ask("What's Tesla's current stock price?"))
print(ask("Explain what Nvidia does in simple terms."))
print(ask("Is Amazon a good investment based on its business model?"))

To run it:

Open your terminal, navigate to the folder where you saved agent.py, and run:

python agent.py

You should see three responses printed to your terminal, one for each question. The first will show Tesla’s current stock price. The second will explain what Nvidia does. The third is the interesting one. The agent will call both tools on its own, pull Amazon’s price and business summary, and give you a combined answer about whether the stock looks like a good investment given what the company actually does.

If you get an authentication error, double-check that you replaced “your-key-here” with your actual OpenAI API key. If you don’t have one, sign up at platform.openai.com (https://platform.openai.com/), add a few dollars of credit, and generate a key under API Keys.

If you want to play with it interactively instead of running the three pre-written questions, replace the three print(ask(…)) lines at the bottom with this:

while True:
    question = input("\nAsk something (or 'quit'): ")
    if question.lower() == "quit":
        break
    print(ask(question))

Now you can type any question and watch the agent decide which tools to use in real time.

One More Thing

The title of this post promises 10 lines. And it’s true. The core of a working tool is under 10 lines. But that was never really the point.

The point is that once you understand the pattern, you stop seeing “AI agent tools” as a magical or complicated concept. You see them for what they are: functions with good descriptions, connected to a model that can read those descriptions and make routing decisions.

The real skill isn’t writing the decorator. It’s knowing what to describe, how to describe it, and what your tool should return so the LLM can do something useful with the result. Write that well, and you can replace “stock price lookup” with any function your organization needs an AI to interact with.

The pattern scales. From a weekend side project to production systems handling real enterprise queries with real business consequences.

That’s the part no three-hour livestream can shortcut. But now at least you have the mental model.
