Maintained by deepset
Integration: NVIDIA
Use NVIDIA models with Haystack.
Overview
NVIDIA AI Foundation Models and NVIDIA Inference Microservices (NIM) let you reach optimal performance on NVIDIA accelerated infrastructure. Using pretrained generative AI models, enterprises can build custom models faster and take advantage of the latest training and inference techniques.
This integration lets you use NVIDIA AI Foundation Models and NVIDIA Inference Microservices in your Haystack pipelines.
To use this integration, you need an NVIDIA API key. Set it as the environment variable `NVIDIA_API_KEY`.
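For example, in a shell session (the key value shown is a placeholder):

export NVIDIA_API_KEY="nvapi-..."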
Installation
pip install nvidia-haystack
Usage
Components
This integration introduces the following components:
- NvidiaTextEmbedder: A component for embedding strings, using NVIDIA AI Foundation and NVIDIA Inference Microservices embedding models. For models that differentiate between query and document inputs, this component embeds the input string as a query.
- NvidiaDocumentEmbedder: A component for embedding documents, using NVIDIA AI Foundation and NVIDIA Inference Microservices embedding models.
- NvidiaGenerator: A component for generating text, using generative models provided by NVIDIA AI Foundation Endpoints and NVIDIA Inference Microservices.
- NvidiaRanker: A component for ranking documents, using NVIDIA NIMs.
Using the components on their own
NvidiaTextEmbedder:
from haystack_integrations.components.embedders.nvidia import NvidiaTextEmbedder
text_to_embed = "I love pizza!"
text_embedder = NvidiaTextEmbedder(model="nvolveqa_40k")
text_embedder.warm_up()
print(text_embedder.run(text_to_embed))
# {'embedding': [-0.02264290489256382, -0.03457780182361603, ...}
NvidiaDocumentEmbedder:
from haystack.dataclasses import Document
from haystack_integrations.components.embedders.nvidia import NvidiaDocumentEmbedder
documents = [Document(content="Pizza is made with dough and cheese"),
Document(content="Cake is made with floud and sugar"),
Document(content="Omlette is made with eggs")]
document_embedder = NvidiaDocumentEmbedder(model="nvolveqa_40k")
document_embedder.warm_up()
document_embedder.run(documents=documents)
#{'documents': [Document(id=2136941caed9b4667d83f906a80d9a2fad1ce34861392889016830ac8738e6c4, content: 'Pizza is made with dough and cheese', embedding: vector of size 1024), ... 'meta': {'usage': {'prompt_tokens': 36, 'total_tokens': 36}}}
NvidiaGenerator:
from haystack_integrations.components.generators.nvidia import NvidiaGenerator
generator = NvidiaGenerator(
    model="nv_llama2_rlhf_70b",
    model_arguments={
        "temperature": 0.2,
        "top_p": 0.7,
        "max_tokens": 1024,
        "seed": None,
        "bad": None,
        "stop": None,
    },
)
generator.warm_up()
result = generator.run(prompt="When was the Golden Gate Bridge built?")
print(result["replies"])
print(result["meta"])
# ['The Golden Gate Bridge was built in 1937 and was completed and opened to the public on May 29, 1937....']
# [{'role': 'assistant', 'finish_reason': 'stop'}]
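These components can also target a self-hosted NVIDIA NIM instead of the hosted API catalog through their api_url parameter. A minimal sketch, assuming a NIM serving a chat model is reachable at http://localhost:8000/v1 (both the URL and the model name are illustrative placeholders for your own deployment):

from haystack_integrations.components.generators.nvidia import NvidiaGenerator

# Point the generator at a locally deployed NIM instead of the hosted endpoint.
generator = NvidiaGenerator(
    model="meta/llama3-8b-instruct",     # illustrative; use the model your NIM serves
    api_url="http://localhost:8000/v1",  # illustrative local NIM URL
)
generator.warm_up()
print(generator.run(prompt="When was the Golden Gate Bridge built?")["replies"])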
NvidiaRanker:
from haystack_integrations.components.rankers.nvidia import NvidiaRanker
from haystack import Document
from haystack.utils import Secret
ranker = NvidiaRanker(
    api_key=Secret.from_env_var("NVIDIA_API_KEY"),
)
ranker.warm_up()
query = "What is the capital of Germany?"
documents = [
    Document(content="Berlin is the capital of Germany."),
    Document(content="The capital of Germany is Berlin."),
    Document(content="Germany's capital is Berlin."),
]
result = ranker.run(query, documents, top_k=1)
print(result["documents"][0].content)
# The capital of Germany is Berlin.
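NvidiaRanker also fits into a pipeline between a retriever and downstream components. A minimal sketch, assuming an InMemoryBM25Retriever as the first stage (the documents and query are illustrative):

from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack_integrations.components.rankers.nvidia import NvidiaRanker

# Index a few documents for keyword retrieval; the ranker then reorders the hits.
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="Berlin is the capital of Germany."),
    Document(content="Paris is the capital of France."),
    Document(content="Rome is the capital of Italy."),
])

pipeline = Pipeline()
pipeline.add_component(instance=InMemoryBM25Retriever(document_store=document_store), name="retriever")
pipeline.add_component(instance=NvidiaRanker(), name="ranker")
pipeline.connect("retriever.documents", "ranker.documents")

query = "What is the capital of Germany?"
result = pipeline.run(data={"retriever": {"query": query}, "ranker": {"query": query, "top_k": 1}})
print(result["ranker"]["documents"][0].content)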
Using NVIDIA components in Haystack pipelines
Indexing pipeline
from haystack_integrations.components.embedders.nvidia import NvidiaDocumentEmbedder
from haystack import Pipeline
from haystack.dataclasses import Document
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
documents = [Document(content="Tilde lives in San Francisco"),
Document(content="Tuana lives in Amsterdam"),
Document(content="Bilge lives in Istanbul")]
document_store = InMemoryDocumentStore()
document_embedder = NvidiaDocumentEmbedder(model="nvolveqa_40k")
writer = DocumentWriter(document_store=document_store)
indexing_pipeline = Pipeline()
indexing_pipeline.add_component(instance=document_embedder, name="document_embedder")
indexing_pipeline.add_component(instance=writer, name="writer")
indexing_pipeline.connect("document_embedder.documents", "writer.documents")
indexing_pipeline.run(data={"document_embedder":{"documents": documents}})
# Filtering with an empty filter returns all documents in the store
document_store.filter_documents({})
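As a quick sanity check, each stored document should now carry an embedding (1024-dimensional for nvolveqa_40k, per the embedder output shown earlier):

print(document_store.count_documents())                        # 3
print(len(document_store.filter_documents({})[0].embedding))   # 1024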
RAG query pipeline
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.components.builders import PromptBuilder
from haystack_integrations.components.generators.nvidia import NvidiaGenerator
from haystack_integrations.components.embedders.nvidia import NvidiaTextEmbedder
prompt = """ Answer the query, based on the
content in the documents.
If you can't answer based on the given documents, say so.
Documents:
{% for doc in documents %}
{{doc.content}}
{% endfor %}
Query: {{query}}
"""
text_embedder = NvidiaTextEmbedder(model="nvolveqa_40k")
retriever = InMemoryEmbeddingRetriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt)
generator = NvidiaGenerator(model="nv_llama2_rlhf_70b")
generator.warm_up()
rag_pipeline = Pipeline()
rag_pipeline.add_component(instance=text_embedder, name="text_embedder")
rag_pipeline.add_component(instance=retriever, name="retriever")
rag_pipeline.add_component(instance=prompt_builder, name="prompt_builder")
rag_pipeline.add_component(instance=generator, name="generator")
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "generator")
question = "Who lives in San Francisco?"
result = rag_pipeline.run(data={"text_embedder": {"text": question},
                                "prompt_builder": {"query": question}})
print(result)
# {'text_embedder': {'meta': {'usage': {'prompt_tokens': 10, 'total_tokens': 10}}}, 'generator': {'replies': ['Tilde'], 'meta': [{'role': 'assistant', 'finish_reason': 'stop'}], 'usage': {'completion_tokens': 3, 'prompt_tokens': 101, 'total_tokens': 104}}}
License
nvidia-haystack is distributed under the terms of the Apache-2.0 license.
