Maintained by deepset

Integration: Mistral

Use the Mistral API for embedding and text generation models.

Authors

deepset

GitHub Repo PyPI Package

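Overview

Mistral AI currently provides two ways to access its large language models:

An API that offers pay-as-you-go access to the latest Mistral models, such as mistral-embed and mistral-small.
Open-source models released under the Apache 2.0 license and available on Hugging Face, which you can use with the HuggingFaceTGIGenerator.

For more information on the models available through the Mistral API, see the Mistral docs.

To follow along with this guide, you need a Mistral API key. Add it as an environment variable named MISTRAL_API_KEY.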
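
Because the examples in this guide expect the key in the environment, it can help to fail early when the variable is missing. The following is a minimal sketch using only the standard library; the helper name `require_api_key` is hypothetical and not part of the integration.

```python
import os


def require_api_key(name: str = "MISTRAL_API_KEY") -> str:
    """Return the API key stored in the environment variable `name`.

    Hypothetical helper for illustration; not part of mistral-haystack.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        # Stop before any component tries to call the API without credentials.
        raise RuntimeError(f"Set the {name} environment variable before running the examples.")
    return key
```

Calling `require_api_key()` at the top of a script would surface a missing key before any component is constructed.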

Installation

pip install mistral-haystack

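Usage

Components

This integration introduces 3 components:

MistralDocumentEmbedder: creates embeddings for Haystack Documents using Mistral embedding models (currently only mistral-embed).
MistralTextEmbedder: creates embeddings for text (such as a query) using Mistral embedding models (currently only mistral-embed).
MistralChatGenerator: uses Mistral chat completion models, such as mistral-tiny (the default).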

Use Mistral Generative Models

import os
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.mistral import MistralChatGenerator

os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"
model = "mistral-medium"

client = MistralChatGenerator(model=model)


response = client.run(
    messages=[ChatMessage.from_user("What is the best French cheese?")]
)
print(response)

{'replies': [ChatMessage(content='The "best" French cheese is subjective and depends on personal taste...', role=<ChatRole.ASSISTANT: 'assistant'>, name=None, meta={'model': 'mistral-medium', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 231, 'prompt_tokens': 16, 'total_tokens': 247}})]}
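
The `meta` dictionary of each reply carries token usage, which can be useful for tracking cost. The sketch below assumes the `meta` layout shown in the sample output above; `summarize_usage` is a hypothetical helper that works on a plain dictionary, so it runs without an API call.

```python
from typing import Any, Dict


def summarize_usage(meta: Dict[str, Any]) -> str:
    """Format the token counts found in a reply's `meta` dictionary.

    Hypothetical helper; assumes the `usage` keys shown in the sample output.
    """
    usage = meta.get("usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    # Fall back to the sum if the API did not report a total.
    total = usage.get("total_tokens", prompt + completion)
    model = meta.get("model", "unknown")
    return f"{model}: {prompt} prompt + {completion} completion = {total} tokens"


# Values copied from the sample output above.
sample_meta = {
    "model": "mistral-medium",
    "index": 0,
    "finish_reason": "stop",
    "usage": {"completion_tokens": 231, "prompt_tokens": 16, "total_tokens": 247},
}
print(summarize_usage(sample_meta))
# prints: mistral-medium: 16 prompt + 231 completion = 247 tokens
```

With a real response, the equivalent call would be `summarize_usage(response["replies"][0].meta)`.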

Mistral LLMs also support streaming responses if you pass a callback to the MistralChatGenerator, like so:

import os

from haystack.components.generators.utils import print_streaming_chunk
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.mistral import MistralChatGenerator

os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"
model = "mistral-medium"

client = MistralChatGenerator(
    model=model,
    streaming_callback=print_streaming_chunk
)

response = client.run(
    messages=[ChatMessage.from_user("What is the best French cheese?")]
)
print(response)
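
`print_streaming_chunk` writes each chunk to standard output. If you would rather accumulate the streamed text, any callable that accepts a chunk can in principle serve as the callback. The sketch below assumes each chunk exposes its text through a `content` attribute, as Haystack's `StreamingChunk` does; the `ChunkCollector` class is hypothetical, and `SimpleNamespace` objects stand in for real chunks so the example runs offline.

```python
from types import SimpleNamespace
from typing import Any, List


class ChunkCollector:
    """Hypothetical streaming callback that stores chunk text instead of printing it."""

    def __init__(self) -> None:
        self.parts: List[str] = []

    def __call__(self, chunk: Any) -> None:
        # Assumption: the chunk carries its text in a `content` attribute.
        self.parts.append(chunk.content)

    @property
    def text(self) -> str:
        """All text received so far, joined in arrival order."""
        return "".join(self.parts)


collector = ChunkCollector()
# Stand-in chunks; in a real run the generator would supply them.
for piece in ("Brie ", "is ", "popular."):
    collector(SimpleNamespace(content=piece))
print(collector.text)
```

Passing `streaming_callback=collector` in place of `print_streaming_chunk` would be the corresponding change in the example above.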

Use a Mistral Embedding Model

Use the MistralDocumentEmbedder in an indexing pipeline:

import os

from haystack import Document, Pipeline
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import MistralDocumentEmbedder

os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"

document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

documents = [Document(content="My name is Wolfgang and I live in Berlin"),
             Document(content="I saw a black horse running"),
             Document(content="Germany has many big cities")]

embedder = MistralDocumentEmbedder()
writer = DocumentWriter(document_store=document_store)

indexing_pipeline = Pipeline()
indexing_pipeline.add_component(name="embedder", instance=embedder)
indexing_pipeline.add_component(name="writer", instance=writer)

indexing_pipeline.connect("embedder", "writer")

indexing_pipeline.run(data={"embedder": {"documents": documents}})
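
The document store above is configured with `embedding_similarity_function="cosine"`. For intuition, cosine similarity compares the direction of two vectors and ignores their length. The following is a minimal standard-library sketch with made-up 3-dimensional vectors (real embedding vectors are much longer); it is an illustration of the metric, not the document store's own implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # longer, but points the same way as `query`
orthogonal = [0.0, 3.0, 0.0]      # shares no direction with `query`

print(cosine_similarity(query, same_direction))  # close to 1.0
print(cosine_similarity(query, orthogonal))      # 0.0
```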

Use the MistralTextEmbedder in a RAG pipeline:

import os

from haystack import Document, Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.utils import print_streaming_chunk
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.mistral.document_embedder import MistralDocumentEmbedder
from haystack_integrations.components.embedders.mistral.text_embedder import MistralTextEmbedder
from haystack_integrations.components.generators.mistral import MistralChatGenerator

os.environ["MISTRAL_API_KEY"] = "YOUR_MISTRAL_API_KEY"

document_store = InMemoryDocumentStore()

documents = [Document(content="My name is Wolfgang and I live in Berlin"),
             Document(content="I saw a black horse running"),
             Document(content="Germany has many big cities")]

document_embedder = MistralDocumentEmbedder()
documents_with_embeddings = document_embedder.run(documents)['documents']
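# Store the embedded copies; the original documents carry no embeddings for the retriever to search
document_store.write_documents(documents_with_embeddings)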

text_embedder = MistralTextEmbedder()
retriever = InMemoryEmbeddingRetriever(document_store=document_store)
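# Declare `documents` as an input so the retriever output can be connected to the builder
prompt_builder = ChatPromptBuilder(variables=["documents"])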
llm = MistralChatGenerator(streaming_callback=print_streaming_chunk)

messages = [ChatMessage.from_user("Here are some of the documents: {{documents}} \n Answer: {{query}}")]

rag_pipeline = Pipeline()
rag_pipeline.add_component("text_embedder", text_embedder)
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)


rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "llm.messages")

question = "Who lives in Berlin?"

result = rag_pipeline.run(
    {
        "text_embedder": {"text": question},
        "prompt_builder": {"template_variables": {"query": question}, "template": messages},
        "llm": {"generation_kwargs": {"max_tokens": 165}},
    }
)
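
The pipeline returns a dictionary keyed by component name, so the generated answer sits under `"llm"`. The sketch below pulls out the text of the first reply. `first_reply_text` is a hypothetical helper; because the message text has been exposed as `content` in some Haystack versions and as `text` in others, it checks both attributes. A stand-in object replaces the real result so the snippet runs offline.

```python
from types import SimpleNamespace
from typing import Any, Dict


def first_reply_text(result: Dict[str, Any], component: str = "llm") -> str:
    """Return the text of the first reply produced by `component`.

    Hypothetical helper; assumes the reply exposes `text` or `content`.
    """
    replies = result.get(component, {}).get("replies", [])
    if not replies:
        raise ValueError(f"no replies found for component {component!r}")
    reply = replies[0]
    # Try the newer attribute name first, then the older one.
    text = getattr(reply, "text", None)
    if text is None:
        text = getattr(reply, "content", None)
    if text is None:
        raise ValueError("reply has neither a `text` nor a `content` attribute")
    return text


# Stand-in for the dictionary returned by `rag_pipeline.run(...)`.
fake_result = {"llm": {"replies": [SimpleNamespace(content="Wolfgang lives in Berlin.")]}}
print(first_reply_text(fake_result))
# prints: Wolfgang lives in Berlin.
```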

License

mistral-haystack is distributed under the terms of the Apache-2.0 license.
