Maintained by deepset

Integration: NVIDIA

Use NVIDIA models with Haystack.

Authors
deepset


Overview

NVIDIA AI Foundation Models and NVIDIA Inference Microservices let you achieve the best performance on NVIDIA accelerated infrastructure. With pretrained generative AI models, enterprises can create custom models faster and take advantage of the latest training and inference techniques.

This integration lets you use NVIDIA AI Foundation Models and NVIDIA Inference Microservices in your Haystack pipelines.

To use this integration, you need an NVIDIA API key. Set it as the environment variable `NVIDIA_API_KEY`.
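
For example, you can export the key in your shell before starting Python, or set it at the top of a script. A minimal sketch (the key value is a placeholder):

import os

# Set the key before constructing any NVIDIA components.
# In a shell you would use: export NVIDIA_API_KEY="nvapi-..."
os.environ["NVIDIA_API_KEY"] = "nvapi-..."  # placeholder value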

Installation

pip install nvidia-haystack

Usage

Components

This integration introduces the following components:

  • NvidiaTextEmbedder: A component for embedding strings, using NVIDIA AI Foundation and NVIDIA Inference Microservices embedding models.

    For models that distinguish between query and document inputs, this component embeds the input string as a query.

  • NvidiaDocumentEmbedder: A component for embedding documents, using NVIDIA AI Foundation and NVIDIA Inference Microservices embedding models.

  • NvidiaGenerator: A component for generating text, using generative models provided by NVIDIA AI Foundation Endpoints and NVIDIA Inference Microservices.

  • NvidiaRanker: A component for ranking documents, using NVIDIA NIMs.

Using these components on their own

NvidiaTextEmbedder:

from haystack_integrations.components.embedders.nvidia import NvidiaTextEmbedder

text_to_embed = "I love pizza!"

text_embedder = NvidiaTextEmbedder(model="nvolveqa_40k")
text_embedder.warm_up()

print(text_embedder.run(text_to_embed))
# {'embedding': [-0.02264290489256382, -0.03457780182361603, ...}
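
The run method returns a plain dictionary, so you can unpack the vector directly. A small usage sketch:

result = text_embedder.run(text_to_embed)
embedding = result["embedding"]  # list of floats
print(len(embedding))            # e.g. 1024 for nvolveqa_40k (see the document embedder output below)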

NvidiaDocumentEmbedder:

from haystack.dataclasses import Document
from haystack_integrations.components.embedders.nvidia import NvidiaDocumentEmbedder

documents = [Document(content="Pizza is made with dough and cheese"),
             Document(content="Cake is made with floud and sugar"),
             Document(content="Omlette is made with eggs")]



document_embedder = NvidiaDocumentEmbedder(model="nvolveqa_40k")
document_embedder.warm_up()
document_embedder.run(documents=documents)
#{'documents': [Document(id=2136941caed9b4667d83f906a80d9a2fad1ce34861392889016830ac8738e6c4, content: 'Pizza is made with dough and cheese', embedding: vector of size 1024), ... 'meta': {'usage': {'prompt_tokens': 36, 'total_tokens': 36}}}
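
For larger corpora, the embedder sends documents to the API in batches. A sketch, assuming NvidiaDocumentEmbedder exposes a batch_size parameter like Haystack's other document embedders:

document_embedder = NvidiaDocumentEmbedder(
    model="nvolveqa_40k",
    batch_size=32,  # assumed parameter: number of documents sent per API request
)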

NvidiaGenerator:

from haystack_integrations.components.generators.nvidia import NvidiaGenerator

generator = NvidiaGenerator(
    model="nv_llama2_rlhf_70b",
    model_arguments={
        "temperature": 0.2,
        "top_p": 0.7,
        "max_tokens": 1024,
        "seed": None,
        "bad": None,
        "stop": None,
    },
)
generator.warm_up()

result = generator.run(prompt="When was the Golden Gate Bridge built?")
print(result["replies"])
print(result["meta"])
# ['The Golden Gate Bridge was built in 1937 and was completed and opened to the public on May 29, 1937....']
# [{'role': 'assistant', 'finish_reason': 'stop'}]

NvidiaRanker:

from haystack_integrations.components.rankers.nvidia import NvidiaRanker
from haystack import Document
from haystack.utils import Secret

ranker = NvidiaRanker(
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)
ranker.warm_up()

query = "What is the capital of Germany?"
documents = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="The capital of Germany is Berlin."),
    Document(content="Germany's capital is Berlin."),
]

result = ranker.run(query, documents, top_k=1)
print(result["documents"][0].content)
# The capital of Germany is Berlin.
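
No model is specified above, so the ranker falls back to its default reranking NIM after warm_up. You can also pin the model and a default top_k at construction time; a sketch (the model name and parameters here are assumptions, not confirmed by this page):

ranker = NvidiaRanker(
    model="nvidia/nv-rerankqa-mistral-4b-v3",  # assumed model name, for illustration only
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
    top_k=2,  # assumed default for the number of returned documents; run() can still override it
)
ranker.warm_up()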

Using NVIDIA components in a Haystack pipeline

Indexing pipeline

from haystack_integrations.components.embedders.nvidia import NvidiaDocumentEmbedder
from haystack import Pipeline
from haystack.dataclasses import Document
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore

documents = [Document(content="Tilde lives in San Francisco"),
             Document(content="Tuana lives in Amsterdam"),
             Document(content="Bilge lives in Istanbul")]

document_store = InMemoryDocumentStore()

document_embedder = NvidiaDocumentEmbedder(model="nvolveqa_40k")
writer = DocumentWriter(document_store=document_store)

indexing_pipeline = Pipeline()
indexing_pipeline.add_component(instance=document_embedder, name="document_embedder")
indexing_pipeline.add_component(instance=writer, name="writer")

indexing_pipeline.connect("document_embedder.documents", "writer.documents")
indexing_pipeline.run(data={"document_embedder":{"documents": documents}})

# Passing an empty filter returns every document in the store
print(document_store.filter_documents({}))
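
To verify the write, you can also check how many documents the store now holds:

print(document_store.count_documents())  # 3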

RAG query pipeline

from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.generators.nvidia import NvidiaGenerator
from haystack_integrations.components.embedders.nvidia import NvidiaTextEmbedder

prompt = """ Answer the query, based on the
content in the documents.
If you can't answer based on the given documents, say so.

Documents:
{% for doc in documents %}
  {{doc.content}}
{% endfor %}

Query: {{query}}
"""

text_embedder = NvidiaTextEmbedder(model="nvolveqa_40k")
retriever = InMemoryEmbeddingRetriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt)
generator = NvidiaGenerator(model="nv_llama2_rlhf_70b")
generator.warm_up()

rag_pipeline = Pipeline()

rag_pipeline.add_component(instance=text_embedder, name="text_embedder")
rag_pipeline.add_component(instance=retriever, name="retriever")
rag_pipeline.add_component(instance=prompt_builder, name="prompt_builder")
rag_pipeline.add_component(instance=generator, name="generator")

rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")

question = "Who lives in San Francisco?"
result = rag_pipeline.run(data={"text_embedder":{"text": question},
                                "prompt_builder":{"query": question}})
print(result)
# {'text_embedder': {'meta': {'usage': {'prompt_tokens': 10, 'total_tokens': 10}}}, 'generator': {'replies': ['Tilde'], 'meta': [{'role': 'assistant', 'finish_reason': 'stop'}], 'usage': {'completion_tokens': 3, 'prompt_tokens': 101, 'total_tokens': 104}}}
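
To extract just the generated answer from the nested result, index by component name and field:

print(result["generator"]["replies"][0])  # 'Tilde'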

License

nvidia-haystack is distributed under the terms of the Apache-2.0 license.