Build with Llama Stack and Haystack Agent


Last Updated: July 21, 2025

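This notebook demonstrates how to use the LlamaStackChatGenerator component with a Haystack Agent to enable function calling. We'll create a simple weather tool that the Agent can call to provide dynamic, up-to-date information.

We start by installing the integration package.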

%%bash

pip install llama-stack-haystack

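Setup

Before running this example, you need to:

Set up a Llama Stack server with an inference provider
Have a model available (e.g., llama3.2:3b)

For a quick start on setting up the server with Ollama, see the Llama Stack documentation.

Once the server is running, it is typically reachable at https://:8321/v1/openai/v1.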

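Defining a Tool

A Tool in Haystack lets a model call a function to fetch real-time information or perform an action. Let's create a simple weather tool that the model can use to provide weather information.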

from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

# Define a tool that models can call
def weather(city: str):
    """Return mock weather info for the given city."""
    return f"The weather in {city} is sunny and 32°C"

# Define the tool parameters schema
tool_parameters = {
    "type": "object", 
    "properties": {
        "city": {"type": "string"}
    }, 
    "required": ["city"]
}

# Create the weather tool
weather_tool = Tool(
    name="weather",
    description="Useful for getting the weather in a specific city",
    parameters=tool_parameters,
    function=weather,
)
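
To make the mechanics concrete, here is a rough, library-free sketch of what conceptually happens once a model emits a tool call: the tool name is looked up and the JSON arguments are passed to the matching function. The names `TOOL_REGISTRY` and `dispatch_tool_call` are hypothetical and exist only for illustration; they are not part of Haystack, whose Tool and Agent classes handle this step for you.

```python
import json


def weather(city: str) -> str:
    """Return mock weather info for the given city."""
    return f"The weather in {city} is sunny and 32°C"


# Hypothetical registry mapping tool names to plain Python callables.
TOOL_REGISTRY = {"weather": weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Look up a tool by name and call it with JSON-decoded keyword arguments."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**arguments)


# Simulate the tool call a model might produce for a weather question.
result = dispatch_tool_call("weather", '{"city": "Tokyo"}')
print(result)
```

The arguments string has the same shape as the `required: ["city"]` schema above, which is why the model needs that schema to produce a valid call.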

Setting Up the Agent

Now, let's create a LlamaStackChatGenerator and pass it to the Agent. The Agent component uses the model served through the LlamaStackChatGenerator to reason and make decisions.

from haystack.components.agents import Agent
from haystack_integrations.components.generators.llama_stack import LlamaStackChatGenerator
from haystack.components.generators.utils import print_streaming_chunk

# Create the LlamaStackChatGenerator
chat_generator = LlamaStackChatGenerator(
    model="ollama/llama3.2:3b",  # model name varies depending on the inference provider used for the Llama Stack Server
    api_base_url="https://:8321/v1/openai/v1",
)
# Agent Setup
agent = Agent(
    chat_generator=chat_generator,
    tools=[weather_tool],
)

# Warm up the Agent so it is ready to run
agent.warm_up()

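Using Tools with the Agent

Now, when we ask a question, the Agent uses the provided tool and the LlamaStackChatGenerator to generate an answer. We enable streaming in the Agent through streaming_callback, so you can watch the tool calls and results in real time.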

# Create a message asking about the weather
messages = [ChatMessage.from_user("What's the weather in Tokyo?")]

# Generate a response from the model with access to tools
response = agent.run(
    messages=messages,
    tools=[weather_tool],
    streaming_callback=print_streaming_chunk,
)

[TOOL CALL]
Tool: weather 
Arguments: {"city":"Tokyo"}

[TOOL RESULT]
The weather in Tokyo is sunny and 32°C

[ASSISTANT]
In Tokyo, the current weather conditions are mostly sunny with a temperature of 32°C. Would you like to know more about Tokyo's climate or weather forecast for a specific date?
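
If you want to keep the streamed text instead of only printing it, a custom callback is one option. The sketch below assumes the callback receives one object per chunk with a string `content` attribute, as Haystack's StreamingChunk provides. `ChunkCollector` is a hypothetical helper, and `SimpleNamespace` stands in for real chunks so that the example runs without a server.

```python
from types import SimpleNamespace


class ChunkCollector:
    """Callable that prints each chunk's text and keeps it for later use."""

    def __init__(self):
        self.parts = []

    def __call__(self, chunk):
        # Chunks without text (for example, pure tool-call deltas) are treated as empty.
        text = getattr(chunk, "content", "") or ""
        self.parts.append(text)
        print(text, end="", flush=True)

    @property
    def text(self):
        """Return everything streamed so far as one string."""
        return "".join(self.parts)


collector = ChunkCollector()

# Stand-in chunks; in a real run the Agent would call the collector instead.
for piece in ["In Tokyo, ", "it is sunny."]:
    collector(SimpleNamespace(content=piece))
print()
```

Passing `collector` in place of `print_streaming_chunk` should work under the assumption above, though that has not been verified here against a live server.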

Simple Chat with ChatGenerator

For simpler use cases, you can also create a lightweight loop that chats directly with the LlamaStackChatGenerator.

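# Conversation history shared across turns
messages = []

while True:
    msg = input("Enter your message or Q to exit\n🧑 ")
    if msg == "Q":
        break
    # Add the user turn, then ask the model for a reply
    messages.append(ChatMessage.from_user(msg))
    response = chat_generator.run(messages=messages)
    assistant_resp = response["replies"][0]
    print("🤖 " + assistant_resp.text)
    # Keep the reply so the next turn has the full context
    messages.append(assistant_resp)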

🤖 The main character in The Witcher series, also known as the eponymous figure, is Geralt of Rivia, a monster hunter with supernatural abilities and mutations that allow him to control the elements. He was created by Polish author_and_polish_video_game_development_company_(CD Projekt).
🤖 One of the most fascinating aspects of dolphin behavior is their ability to produce complex, context-dependent vocalizations that are unique to each individual, similar to human language. They also exhibit advanced social behaviors, such as cooperation, empathy, and self-awareness.

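If you want to switch model providers, you can reuse the same LlamaStackChatGenerator code with a different provider. Just run the desired inference provider on the Llama Stack server and update the model name when initializing the LlamaStackChatGenerator.

For more details on the available inference providers, see the Llama Stack documentation.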
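
One way to make such a switch without editing code is to read the settings from the environment. This is a minimal sketch: the variable names `LLAMA_STACK_MODEL` and `LLAMA_STACK_BASE_URL` are hypothetical, and the default URL assumes a server on `localhost`, which may not match your deployment.

```python
import os


def load_generator_settings(env=None):
    """Build keyword arguments for the chat generator from environment-style settings."""
    env = os.environ if env is None else env
    return {
        # Hypothetical variable names; the defaults are assumptions for a local Ollama setup.
        "model": env.get("LLAMA_STACK_MODEL", "ollama/llama3.2:3b"),
        "api_base_url": env.get(
            "LLAMA_STACK_BASE_URL", "http://localhost:8321/v1/openai/v1"
        ),
    }


# A plain dict stands in for os.environ; the model name here is a placeholder.
settings = load_generator_settings({"LLAMA_STACK_MODEL": "my-provider/my-model"})
print(settings)
```

The resulting dictionary could then be unpacked as `LlamaStackChatGenerator(**settings)`, since `model` and `api_base_url` are the two arguments used earlier in this notebook.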