How to Use LangChain Agents for Powerful Automated Tasks

Name: Lynn Mikami

Published on 4/30/2024

In the fascinating world of language models and automation, LangChain Agents stand out as a beacon of innovation, enabling developers and tech enthusiasts to create sophisticated, automated tasks that seem straight out of a sci-fi novel. If you're looking to dive into the realm of LangChain Agents but don't know where to start, you're in the right place. This guide will demystify the process, making it accessible and straightforward. So, grab a cup of coffee, and let's embark on this exciting journey together.

How to Use LangChain Agents: Quick Start

To best understand the agent framework, let’s build an agent that has two tools: one to look things up online, and one to look up specific data that we’ve loaded into an index.

This will assume knowledge of LLMs and retrieval so if you haven’t already explored those sections, it is recommended you do so.

Setup: LangSmith
By definition, agents take a self-determined, input-dependent sequence of steps before returning a user-facing output. This makes debugging these systems particularly tricky, and observability particularly important. LangSmith is especially useful for such cases.

When building with LangChain, all steps will automatically be traced in LangSmith. To set up LangSmith, we just need to set the following environment variables:

export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="<your-api-key>"
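A quick way to confirm that these variables are visible to your Python process is a small pre-flight check. The sketch below uses only the standard library; the helper name missing_env_vars is hypothetical and is not part of LangChain or LangSmith.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(
    names: Iterable[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# The two variable names come from the export lines above.
REQUIRED = ["LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY"]

if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("LangSmith environment variables are set.")
```

Running this before starting the agent makes it obvious when the exports were done in a different shell session.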

Define tools
We first need to create the tools we want to use. We will use two tools: Tavily (to search online) and then a retriever over a local index we will create.

Tavily
LangChain has a built-in tool that makes it easy to use the Tavily search engine as a tool. Note that this requires an API key. Tavily has a free tier, but if you don’t have a key or don’t want to create one, you can skip this step.

Once you create your API key, you will need to export that as:

export TAVILY_API_KEY="..."

from langchain_community.tools.tavily_search import TavilySearchResults
 
search = TavilySearchResults()
 
search.invoke("what is the weather in SF")

Retriever
We will also create a retriever over some data of our own. For a deeper explanation of each step here, see this section.

from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
 
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()
 
retriever.get_relevant_documents("how to upload a dataset")[0]
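To see what chunk_size and chunk_overlap control, here is a simplified, standard-library-only sketch of fixed-width chunking with overlap. It is not how RecursiveCharacterTextSplitter works internally (that splitter also tries to break on separators such as paragraph and line boundaries); the function name sliding_chunks is hypothetical and only illustrates the two parameters.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

In this simplified model, the settings used above (chunk_size=1000, chunk_overlap=200) would mean each chunk starts with the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.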

Now that we have populated the index that we will be doing retrieval over, we can easily turn it into a tool (the format an agent needs in order to use it properly).

from langchain.tools.retriever import create_retriever_tool
 
retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)

Tools
Now that we have created both, we can create a list of tools that we will use downstream.

tools = [search, retriever_tool]

How to Create a LangChain Agent

Now that we have defined the tools, we can create the agent. We will be using an OpenAI Functions agent - for more information on this type of agent, as well as other options, see this guide.

First, we choose the LLM we want to be guiding the agent.

from langchain_openai import ChatOpenAI
 
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

Next, we choose the prompt we want to use to guide the agent.

from langchain import hub
 
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages

Now, we can initialize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our conceptual guide.

from langchain.agents import create_openai_functions_agent
 
agent = create_openai_functions_agent(llm, tools, prompt)

Finally, we combine the agent (the brains) with the tools inside the AgentExecutor, which will repeatedly call the agent and execute the tools it selects.

from langchain.agents import AgentExecutor
 
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
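The split between "the agent decides" and "the executor acts" can be sketched as a plain loop. The code below is a toy illustration using only the standard library, not LangChain's actual implementation; Action, Finish, run_executor, toy_agent, and toy_tools are all hypothetical names, and toy_agent stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Action:
    tool: str
    tool_input: str


@dataclass
class Finish:
    output: str


# (action, observation) pairs collected so far
Steps = List[Tuple[Action, str]]
Decide = Callable[[str, Steps], Union[Action, Finish]]


def run_executor(
    decide: Decide,
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_iterations: int = 5,
) -> str:
    """Repeatedly ask the agent what to do and run the tool it picks."""
    steps: Steps = []
    for _ in range(max_iterations):
        decision = decide(user_input, steps)  # the agent only decides
        if isinstance(decision, Finish):
            return decision.output
        observation = tools[decision.tool](decision.tool_input)  # the executor acts
        steps.append((decision, observation))
    return "Stopped: iteration limit reached."


def toy_agent(user_input: str, steps: Steps) -> Union[Action, Finish]:
    """Stand-in for an LLM: search once, then answer from the observation."""
    if not steps:
        return Action(tool="echo_search", tool_input=user_input)
    return Finish(output=f"Based on the search: {steps[-1][1]}")


toy_tools = {"echo_search": lambda query: f"results for '{query}'"}

if __name__ == "__main__":
    print(run_executor(toy_agent, toy_tools, "what is LangChain?"))
```

The iteration cap is there so that an agent that never decides to finish cannot loop forever.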

Run the agent
We can now run the agent on a few queries! Note that for now, these are all stateless queries (it won’t remember previous interactions).

agent_executor.invoke({"input": "hi!"})

agent_executor.invoke({"input": "how can langsmith help with testing?"})

agent_executor.invoke({"input": "what's the weather in sf?"})

Adding in memory
As mentioned earlier, this agent is stateless. This means it does not remember previous interactions. To give it memory we need to pass in previous chat_history. Note: it needs to be called chat_history because of the prompt we are using. If we use a different prompt, we could change the variable name.

# Here we pass in an empty list of messages for chat_history because it is the first message in the chat
agent_executor.invoke({"input": "hi! my name is bob", "chat_history": []})

from langchain_core.messages import AIMessage, HumanMessage
 
agent_executor.invoke(
    {
        "chat_history": [
            HumanMessage(content="hi! my name is bob"),
            AIMessage(content="Hello Bob! How can I assist you today?"),
        ],
        "input": "what's my name?",
    }
)

If we want to keep track of these messages automatically, we can wrap this in a RunnableWithMessageHistory. For more information on how to use this, see this guide.

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
 
message_history = ChatMessageHistory()
 
agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    lambda session_id: message_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
 
agent_with_chat_history.invoke(
    {"input": "hi! I'm bob"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "<foo>"}},
)
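Note that the lambda above returns the same message_history for every session ID, so all sessions would share one conversation. A common pattern is to keep one history object per session ID. The sketch below shows that pattern with a hypothetical helper, make_session_history_getter; it is written against a generic factory so that it runs without LangChain installed. In real use you would presumably pass ChatMessageHistory as the factory and hand the returned function to RunnableWithMessageHistory in place of the lambda.

```python
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def make_session_history_getter(history_factory: Callable[[], T]) -> Callable[[str], T]:
    """Return a function that lazily creates one history object per session ID."""
    store: Dict[str, T] = {}

    def get_session_history(session_id: str) -> T:
        if session_id not in store:
            store[session_id] = history_factory()
        return store[session_id]

    return get_session_history


if __name__ == "__main__":
    # A plain list stands in for a chat history object in this demo.
    get_history = make_session_history_getter(list)
    get_history("alice").append("hi! I'm alice")
    print(get_history("alice"))  # alice's history keeps her message
    print(get_history("bob"))    # bob's history starts out empty
```

The store here lives in process memory, so histories are lost on restart, just like the in-memory ChatMessageHistory used above.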

Types of LangChain Agents

Agent Types

In the realm of LangChain Agents, diversity is the name of the game. These agents come in various flavors, each suited to different tasks and capabilities. Let's explore the nuanced world of Agent Types, breaking down their intended model types, support for chat history, multi-input tools, parallel function calling, and required model parameters. Understanding these categories will help you select the perfect agent for your needs.

OpenAI Tools: A cutting-edge agent optimized for recent OpenAI models (1106 onwards). It's designed to handle chat interactions, support multi-input tools, and execute parallel function calls, requiring the 'tools' model parameters. Ideal for those leveraging the latest in OpenAI's offerings.
OpenAI Functions: Tailored for OpenAI or finetuned open-source models that mimic OpenAI's function-calling capabilities. It excels in chat environments, handles multi-input tools, and requires 'functions' model parameters. A solid choice for models adept at function calling.
XML: This agent is a match for models skilled in XML, such as Anthropic’s. It's built for LLMs (not chat models), supports chat history, and is best used with unstructured tools requiring a single string input. Choose this when working with models proficient in XML.
Structured Chat: A versatile agent that shines in chat-based applications and supports tools with multiple inputs. It doesn't need additional model parameters, making it a great option for projects requiring complex tool interactions.
JSON Chat: Geared towards models that excel in JSON, this agent is suitable for chat applications. It simplifies working with JSON-friendly models, streamlining the development of chat-based applications.
ReAct: Simplistic yet effective, this agent is tailored for basic LLM applications. It's an excellent starting point for simple models needing straightforward implementation.
Self Ask With Search: Ideal for simple models with a single search tool, this agent is perfect for projects where a concise, focused toolset is necessary.
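One way to work with the list above is to encode it as a small lookup table and filter by the features you need. The sketch below records only what the list states; a missing tag means "not stated above", not "unsupported". The tag names and the agents_with helper are hypothetical.

```python
from typing import Dict, List, Set

# Feature tags taken from the descriptions in the list above.
AGENT_FEATURES: Dict[str, Set[str]] = {
    "OpenAI Tools": {"chat", "multi_input_tools", "parallel_function_calling"},
    "OpenAI Functions": {"chat", "multi_input_tools"},
    "XML": {"llm", "chat_history"},
    "Structured Chat": {"chat", "multi_input_tools"},
    "JSON Chat": {"chat"},
    "ReAct": {"llm"},
    "Self Ask With Search": {"single_search_tool"},
}


def agents_with(*features: str) -> List[str]:
    """Return agent types whose recorded tags include every requested feature."""
    wanted = set(features)
    return [name for name, have in AGENT_FEATURES.items() if wanted <= have]


if __name__ == "__main__":
    print(agents_with("chat", "multi_input_tools"))
    print(agents_with("parallel_function_calling"))
```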

Each agent type brings its own set of strengths, making it crucial to match the agent with the specific requirements of your project. Whether you need support for chat history, multi-input tools, or parallel function calling, there's an agent tailored to meet those needs. Consider the complexity of your tasks, the capabilities of your chosen language model, and the nature of the interactions your application will handle. By aligning these factors with the right agent type, you can unlock the full potential of LangChain Agents in your projects, paving the way for innovative solutions and streamlined workflows.

Expanding on the intricacies of LangChain Agents, this guide aims to provide a deeper understanding and practical applications of different agent types. By exploring detailed sample codes and scenarios, you'll gain insights into selecting and implementing the most suitable agent for your project's needs. Let's dive into the core of LangChain Agents, highlighting their unique features and capabilities through examples.

LangChain Agents #1: OpenAI Tools Agent

The OpenAI Tools agent is designed to work seamlessly with the most recent OpenAI models, facilitating the execution of multiple functions or "tools" simultaneously. This agent is particularly useful when your application requires interacting with several external APIs or performing multiple tasks in parallel, thereby reducing the overall processing time.

Sample Code:

# Import necessary libraries and tools
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
 
# Initialize the Tavily Search tool
tools = [TavilySearchResults(max_results=1)]
 
# Retrieve the prompt and set up the LLM
prompt = hub.pull("hwchase17/openai-tools-agent")
llm = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0)
 
# Create the OpenAI Tools agent
agent = create_openai_tools_agent(llm, tools, prompt)
 
# Execute the agent with a sample input
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
response = agent_executor.invoke({"input": "what is LangChain?"})
print(response)

In this example, the create_openai_tools_agent function constructs an agent that can utilize the OpenAI model to intelligently decide when to invoke one or more tools based on the input. The Tavily Search tool is used here to demonstrate web search capabilities.
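To make "parallel" concrete: with tool calling, a single model turn can request several tool calls at once, and the caller can then run them side by side. The sketch below is a toy illustration with the standard library only. Whether and how LangChain runs such calls concurrently is not covered here, so the thread pool is an illustrative choice, and run_tool_calls is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# One requested call: (tool name, input string)
ToolCall = Tuple[str, str]


def run_tool_calls(
    calls: List[ToolCall], tools: Dict[str, Callable[[str], str]]
) -> List[str]:
    """Run every requested tool call and return the observations in request order."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(tools[name], arg) for name, arg in calls]
        return [future.result() for future in futures]


if __name__ == "__main__":
    demo_tools = {
        "upper": str.upper,
        "reverse": lambda text: text[::-1],
    }
    # Pretend the model asked for both tools in a single turn.
    print(run_tool_calls([("upper", "langchain"), ("reverse", "langchain")], demo_tools))
```

Collecting results in request order keeps each observation matched to the call that produced it, even if the tools finish at different times.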

LangChain Agents #2: OpenAI Functions Agent

The OpenAI Functions agent is suited to tasks where the model must decide whether to call a function, and which one, based on the input. It is similar to the Tools agent but is built on the older function-calling API. OpenAI has deprecated that API in favor of tools for newer models, so the Tools agent is generally the better choice when working with recent OpenAI models.

Sample Code:

# Setup for OpenAI Functions agent
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
 
# Initialize tools and choose the LLM
tools = [TavilySearchResults(max_results=1)]
prompt = hub.pull("hwchase17/openai-functions-agent")
llm = ChatOpenAI(model="gpt-3.5-turbo-1106")
 
# Construct and run the OpenAI Functions agent
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
response = agent_executor.invoke({"input": "what is LangChain?"})
print(response)

This example highlights how to set up and use the OpenAI Functions agent, utilizing Tavily Search as a tool for demonstrating the agent's capability to invoke specific functions based on user queries.

LangChain Agents #3: XML Agent

The XML Agent is optimized for models that are proficient in generating and interpreting XML structures. This agent type is advantageous when working with structured data or when the interaction with the model benefits from the structured format of XML.

Sample Code:

# Initialize the XML Agent
from langchain import hub
from langchain.agents import AgentExecutor, create_xml_agent
from langchain_community.chat_models import ChatAnthropic
from langchain_community.tools.tavily_search import TavilySearchResults
 
tools = [TavilySearchResults(max_results=1)]
prompt = hub.pull("hwchase17/xml-agent-convo")
llm = ChatAnthropic(model="claude-2")
 
# Create and run the XML agent
agent = create_xml_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
response = agent_executor.invoke({"input": "what is LangChain?"})
print(response)

This setup showcases the use of an XML agent with Claude-2, an Anthropic model known for its proficiency with XML. The XML format provides a structured way to communicate between the agent and the model, facilitating complex data handling.

LangChain Agents #4: JSON Chat Agent

The JSON Chat Agent leverages JSON formatting for its outputs, making it suitable for applications that require structured response data. This agent is ideal for chat models that excel in processing and generating JSON structures.

Sample Code:

# Setup for JSON Chat Agent
from langchain import hub
from langchain.agents import AgentExecutor, create_json_chat_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
 
tools = [TavilySearchResults(max_results=1)]
prompt = hub.pull("hwchase17/react-chat-json")
llm = ChatOpenAI()
 
# Create and execute the JSON Chat Agent
agent = create_json_chat_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
response = agent_executor.invoke({"input": "what is LangChain?"})
print(response)

Here, the JSON Chat Agent processes an input by having the model express its actions as JSON, with the Tavily Search tool providing web search. Because the model's output has to parse as valid JSON, handle_parsing_errors=True is passed to the AgentExecutor so that a malformed reply is sent back to the model as an observation instead of raising an exception. This agent is particularly useful with chat models that follow structured output formats reliably.
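To illustrate why parsing can fail, here is a minimal sketch of pulling an action out of a model reply. It assumes the reply contains a JSON object with the keys "action" and "action_input"; that key layout is an assumption about the prompt, and parse_json_action is a hypothetical helper, not LangChain's own parser.

```python
import json
from typing import Any, Tuple


def parse_json_action(text: str) -> Tuple[str, Any]:
    """Extract (action, action_input) from a model reply, or raise ValueError."""
    # Take the outermost {...} span so surrounding chatter is ignored.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    try:
        blob = json.loads(text[start : end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON in model output: {exc}") from exc
    if not isinstance(blob, dict) or "action" not in blob or "action_input" not in blob:
        raise ValueError("expected keys 'action' and 'action_input'")
    return blob["action"], blob["action_input"]


if __name__ == "__main__":
    reply = 'Sure: {"action": "search", "action_input": "what is LangChain?"}'
    print(parse_json_action(reply))
```

A reply with a missing brace or a stray trailing comma would raise ValueError here, which is the kind of failure an error-handling option has to recover from.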

LangChain Agents #5: Structured Chat Agent

The Structured Chat Agent excels in scenarios that involve multi-input tools, enabling complex interactions that require more than just a simple string input. This agent is designed to facilitate complex workflows where multiple parameters need to be considered for each tool invocation.

Sample Code:

# Initialize the Structured Chat Agent
from langchain import hub
from langchain.agents import AgentExecutor, create_structured_chat_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
 
tools = [TavilySearchResults(max_results=1)]
prompt = hub.pull("hwchase17/structured-chat-agent")
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-1106")
 
# Construct and run the Structured Chat Agent
agent = create_structured_chat_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
response = agent_executor.invoke({"input": "what is LangChain?"})
print(response)

This example illustrates the use of a Structured Chat Agent to interact with multi-input tools effectively. By specifying a detailed prompt and selecting an appropriate language model, this agent can navigate complex queries and tool interactions seamlessly.
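The Tavily tool in the example takes a single query, so it does not show the multi-input case on its own. As a rough illustration of what "multi-input" means, the sketch below calls a tool with a dictionary of named arguments rather than one string. It uses only the standard library; convert_units, TOOLS, and call_tool are hypothetical names and do not come from LangChain.

```python
from typing import Any, Callable, Dict


def convert_units(value: float, factor: float, unit: str) -> str:
    """Hypothetical multi-input tool: scale a value and label the result."""
    return f"{value * factor:g} {unit}"


TOOLS: Dict[str, Callable[..., str]] = {"convert_units": convert_units}


def call_tool(name: str, tool_input: Dict[str, Any]) -> str:
    """Invoke a tool with named arguments, as a multi-input agent would supply them."""
    return TOOLS[name](**tool_input)


if __name__ == "__main__":
    # A structured action carries several named inputs instead of one string.
    print(call_tool("convert_units", {"value": 2, "factor": 1.5, "unit": "km"}))
```

A single-string agent would have to pack all three values into one string and rely on the tool to parse them, which is the limitation that structured inputs avoid.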

By understanding the distinctions and capabilities of these LangChain Agents, developers can better tailor their applications to leverage the full potential of language models and automated tasks. Whether your application requires simple function calls, structured data processing, or complex multi-tool interactions, there's a LangChain Agent that fits the bill. With these examples as a foundation, you're well-equipped to embark on your journey of building more intelligent, efficient, and responsive applications.

Conclusion

In conclusion, LangChain Agents offer a versatile and powerful toolkit for developers looking to integrate advanced language model capabilities into their applications. By understanding the nuances of each agent type—from OpenAI Tools and Functions Agents to XML, JSON Chat, and Structured Chat Agents—you can choose the right tool for your project's specific needs. These agents can handle a wide range of tasks, from simple function calls to complex interactions involving multiple inputs and structured data formats.
