Integration: langfuse
Monitor and trace your Haystack requests.
Overview
langfuse-haystack integrates tracing into Haystack pipelines via Langfuse. The package improves visibility into pipeline runs by capturing comprehensive details of execution traces, including API calls, context data, prompts, and more. Whether you are monitoring model performance, pinpointing areas for improvement, or creating datasets for fine-tuning and testing from your pipeline executions, langfuse-haystack is the right tool for you.
Features
- Easy integration with Haystack pipelines
- Captures the full context of execution
- Tracks model usage and cost
- Collects user feedback
- Identifies low-quality outputs
- Builds datasets for fine-tuning and testing
To use this integration, sign up for a Langfuse account. See the Langfuse documentation for the latest feature and pricing information.
Installation
pip install langfuse-haystack
Usage
Components
This integration introduces one component:
- LangfuseConnector: connects the Haystack LLM framework with Langfuse so that operations and data flow across the pipeline's components can be traced. Simply add this component to your pipeline, but do not connect it to any other component; LangfuseConnector will automatically trace the operations and data flow in the pipeline. Note that you must set the LANGFUSE_SECRET_KEY and LANGFUSE_PUBLIC_KEY environment variables to use this component. These are the keys provided by Langfuse; you can obtain them by signing up for an account on the Langfuse website. You must also set the HAYSTACK_CONTENT_TRACING_ENABLED environment variable to true to enable Haystack tracing in your pipeline. The code examples below additionally require the OPENAI_API_KEY environment variable. Haystack is model-agnostic, and you can use any supported model provider by swapping out the generator in the code examples below.
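For example, before running any of the pipelines below, the required environment variables can be exported in your shell. The key values shown here are hypothetical placeholders; substitute the keys from your own Langfuse project and OpenAI account:

```shell
# Keys from your Langfuse project settings (placeholders shown)
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
# Enable Haystack content tracing so traces include pipeline inputs/outputs
export HAYSTACK_CONTENT_TRACING_ENABLED="true"
# Required by the OpenAI generators used in the examples below
export OPENAI_API_KEY="sk-..."
```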
Using LangfuseConnector in a RAG pipeline
First, install a few additional dependencies.
pip install sentence-transformers datasets
from datasets import load_dataset
from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.connectors.langfuse import LangfuseConnector
def get_pipeline(document_store: InMemoryDocumentStore):
    retriever = InMemoryEmbeddingRetriever(document_store=document_store, top_k=2)

    template = """
    Given the following information, answer the question.

    Context:
    {% for document in documents %}
        {{ document.content }}
    {% endfor %}

    Question: {{question}}
    Answer:
    """

    prompt_builder = PromptBuilder(template=template)

    basic_rag_pipeline = Pipeline()
    # Add components to your pipeline
    basic_rag_pipeline.add_component("tracer", LangfuseConnector("Basic RAG Pipeline"))
    basic_rag_pipeline.add_component(
        "text_embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
    )
    basic_rag_pipeline.add_component("retriever", retriever)
    basic_rag_pipeline.add_component("prompt_builder", prompt_builder)
    basic_rag_pipeline.add_component("llm", OpenAIGenerator(generation_kwargs={"n": 2}))

    # Now, connect the components to each other
    # NOTE: the tracer component doesn't need to be connected to anything in order to work
    basic_rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
    basic_rag_pipeline.connect("retriever", "prompt_builder.documents")
    basic_rag_pipeline.connect("prompt_builder", "llm")

    return basic_rag_pipeline
document_store = InMemoryDocumentStore()
dataset = load_dataset("bilgeyucel/seven-wonders", split="train")
embedder = SentenceTransformersDocumentEmbedder("sentence-transformers/all-MiniLM-L6-v2")
embedder.warm_up()
docs_with_embeddings = embedder.run([Document(**ds) for ds in dataset]).get("documents") or [] # type: ignore
document_store.write_documents(docs_with_embeddings)
pipeline = get_pipeline(document_store)
question = "What does Rhodes Statue look like?"
response = pipeline.run({"text_embedder": {"text": question}, "prompt_builder": {"question": question}})
# {'tracer': {'name': 'Basic RAG Pipeline', 'trace_url': 'https://cloud.langfuse.com/trace/3d52b8cc-87b6-4977-8927-5e9f3ff5b1cb'}, 'llm': {'replies': ['The Rhodes Statue was described as being about 105 feet tall, with iron tie bars and brass plates forming the skin. It was built on a white marble pedestal near the Rhodes harbour entrance. The statue was filled with stone blocks as construction progressed.', 'The Rhodes Statue was described as being about 32 meters (105 feet) tall, built with iron tie bars, brass plates for skin, and filled with stone blocks. It stood on a 15-meter-high white marble pedestal near the Rhodes harbor entrance.'], 'meta': [{'model': 'gpt-4o-mini', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 100, 'prompt_tokens': 453, 'total_tokens': 553}}, {'model': 'gpt-4o-mini', 'index': 1, 'finish_reason': 'stop', 'usage': {'completion_tokens': 100, 'prompt_tokens': 453, 'total_tokens': 553}}]}}
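The `response` dictionary above can be unpacked programmatically, for example to log the trace URL or aggregate token usage for cost accounting. The sketch below uses a stubbed response with the same shape as the output shown above; the trace URL, reply texts, and token counts are illustrative placeholders:

```python
# Stubbed response mirroring the pipeline output shape shown above;
# the trace URL, replies, and token counts are illustrative placeholders.
response = {
    "tracer": {"name": "Basic RAG Pipeline", "trace_url": "https://cloud.langfuse.com/trace/EXAMPLE"},
    "llm": {
        "replies": ["First candidate answer.", "Second candidate answer."],
        "meta": [
            {"model": "gpt-4o-mini", "usage": {"completion_tokens": 100, "prompt_tokens": 453, "total_tokens": 553}},
            {"model": "gpt-4o-mini", "usage": {"completion_tokens": 100, "prompt_tokens": 453, "total_tokens": 553}},
        ],
    },
}

# The Langfuse trace URL lets you jump straight to this run in the dashboard.
trace_url = response["tracer"]["trace_url"]
replies = response["llm"]["replies"]
# Sum token usage across all candidate replies (n=2 in the example above).
total_tokens = sum(m["usage"]["total_tokens"] for m in response["llm"]["meta"])

print(trace_url)
print(len(replies), total_tokens)  # 2 1106
```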
After running these code examples, you can also view and interact with the traces in the Langfuse dashboard.
Using LangfuseConnector in a pipeline with OpenAIChatGenerator and ChatPromptBuilder
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.connectors.langfuse import LangfuseConnector
pipe = Pipeline()
pipe.add_component("tracer", LangfuseConnector("Chat example"))
pipe.add_component("prompt_builder", ChatPromptBuilder())
pipe.add_component("llm", OpenAIChatGenerator())
pipe.connect("prompt_builder.prompt", "llm.messages")
messages = [
    ChatMessage.from_system("Always respond in German even if some input data is in other languages."),
    ChatMessage.from_user("Tell me about {{location}}"),
]

response = pipe.run(
    data={"prompt_builder": {"template_variables": {"location": "Berlin"}, "template": messages}}
)
print(response["llm"]["replies"][0])
print(response["tracer"]["trace_url"])
# ChatMessage(content='Berlin ist die Hauptstadt von Deutschland und zugleich eines der bekanntesten kulturellen Zentren Europas. Die Stadt hat eine faszinierende Geschichte, die bis in die Zeiten des Zweiten Weltkriegs und des Kalten Krieges zurückreicht. Heute ist Berlin für seine vielfältige Kunst- und Musikszene, seine historischen Stätten wie das Brandenburger Tor und die Berliner Mauer sowie seine lebendige Street-Food-Kultur bekannt. Berlin ist auch für seine grünen Parks und Seen beliebt, die den Bewohnern und Besuchern Raum für Erholung bieten.', role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'gpt-4o-mini', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 137, 'prompt_tokens': 29, 'total_tokens': 166}})
# https://cloud.langfuse.com/trace/YOUR_UNIQUE_IDENTIFYING_STRING
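ChatPromptBuilder fills Jinja-style placeholders such as {{location}} in the message templates with the values passed as template_variables. A minimal, dependency-free sketch of that substitution step (an illustration only, not the builder's actual implementation) looks like this:

```python
import re

def render(template: str, variables: dict) -> str:
    # Replace {{ name }} placeholders with matching values from `variables`;
    # placeholders with no matching variable are left untouched.
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables.get(m.group(1), m.group(0))),
        template,
    )

print(render("Tell me about {{location}}", {"location": "Berlin"}))
# Tell me about Berlin
```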
License
langfuse-haystack is distributed under the terms of the Apache-2.0 license.
