Web QA with Mixtral-8x7B-Instruct-v0.1
Last updated: July 8, 2025
Colab by Tuana Celik (LI & Twitter)
A quick guide on building with Mistral AI's newly released Mixtral-8x7B-Instruct-v0.1 model.
What makes Mixtral different?
This is a very cool new model. It's the first open-source model of its kind: a mixture of 8 expert models of 7B parameters each (hence the "Mix" in Mixtral). This article on Hugging Face explains it far better, and in far more detail, than I can. The idea is to combine 8 different "experts", with a router that sends each query to one of the experts (not entirely accurate, but a useful simplification). This means not all 8 models run at inference time, which makes the model very fast!
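To build intuition for the routing idea, here is a toy NumPy sketch of a mixture-of-experts layer. Everything here is a made-up stand-in (random weights, tiny dimensions, top-2 routing), not Mixtral's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, top_k, d = 8, 2, 16                 # 8 experts, route each input to 2
router_w = rng.normal(size=(d, n_experts))     # toy router weights
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # toy expert layers

def moe_layer(x):
    # The router scores every expert, but only the top-k experts actually run,
    # so compute scales with top_k rather than n_experts.
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d))
print(out.shape)  # (16,)
```

Only 2 of the 8 expert matrices are multiplied per input, which is the source of the speedup the paragraph above describes.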
In this example, we'll:
- Query the model on its own with HuggingFaceAPIChatGenerator
- Add the generator to a full RAG pipeline (on the Web)
Install dependencies
!uv pip install haystack-ai trafilatura sentence_transformers "huggingface_hub>=0.22.0"
Using Python 3.12.6 environment at: /Users/dsbatista/haystack-cookbook/.venv
Resolved 70 packages in 813ms
Prepared 2 packages in 444ms
Uninstalled 1 package in 147ms
Installed 10 packages in 111ms
 + courlan==1.3.2
 + dateparser==1.2.2
 + htmldate==1.9.3
 + justext==3.0.2
 + lxml==5.4.0
 + lxml-html-clean==0.4.2
 - sympy==1.14.0
 + sympy==1.13.1
 + tld==0.13.1
 + trafilatura==2.0.0
 + tzlocal==5.3.1
Prompting the model on its own
We're using the Hugging Face Serverless Inference API.
- This requires an API key: https://huggingface.co/settings/tokens
- You should also accept Mistral's conditions here: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
import os
from getpass import getpass
os.environ["HF_API_TOKEN"] = getpass("Enter Hugging Face token: ")
Enter Hugging Face token: ········
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
generator = HuggingFaceAPIChatGenerator(
api_type="serverless_inference_api",
api_params={"model": "mistralai/Mixtral-8x7B-Instruct-v0.1"}
)
from haystack.dataclasses import ChatMessage
messages = [
ChatMessage.from_system("You are a helpful, respectful and honest assistant"),
ChatMessage.from_user("What's Natural Language Processing?")
]
result = generator.run(messages)
print(result["replies"][0].text)
Natural Language Processing, often abbreviated as NLP, is a branch of artificial intelligence that focuses on the interaction between computers and humans through natural language. The ultimate objective of NLP is to read, decipher, understand, and make sense of the human language in a valuable way.
NLP involves several complex tasks such as language understanding, language generation, translation, and speech recognition. It's used in many applications we use daily, including search engines, voice-activated assistants, and automated customer service bots.
By analyzing and interpreting human language, NLP enables machines to understand and respond to text or voice inputs in a way that's similar to how humans communicate. However, it's important to note that NLP technology still has limitations and is not perfect, but it's continually improving with advancements in machine learning and artificial intelligence.
Using the model in a full RAG pipeline (on the Web)
Here, we'll use the same generator component as above, but in a full RAG pipeline. You can change this pipeline to use your own data sources (e.g. a vector database, Notion, documents) instead of the LinkContentFetcher we use here.
from haystack.components.fetchers.link_content import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.preprocessors import DocumentSplitter
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.builders.chat_prompt_builder import ChatPromptBuilder
from haystack import Pipeline
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
document_splitter = DocumentSplitter(split_by="word", split_length=50)
similarity_ranker = TransformersSimilarityRanker(top_k=3)
prompt_template = """
According to these documents:
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Answer the given question: {{question}}
Answer:
"""
prompt_template = [ChatMessage.from_user(prompt_template)]
prompt_builder = ChatPromptBuilder(template=prompt_template)
pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("splitter", document_splitter)
pipeline.add_component("ranker", similarity_ranker)
pipeline.add_component("prompt_builder", prompt_builder)
pipeline.add_component("llm", generator)
pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "splitter.documents")
pipeline.connect("splitter.documents", "ranker.documents")
pipeline.connect("ranker.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm")
TransformersSimilarityRanker is considered legacy and will no longer receive updates. It may be deprecated in a future release, with removal following after a deprecation period. Consider using SentenceTransformersSimilarityRanker instead, which provides the same functionality along with additional features.
ChatPromptBuilder has 2 prompt variables, but `required_variables` is not set. By default, all prompt variables are treated as optional, which may lead to unintended behavior in multi-branch pipelines. To avoid unexpected execution, ensure that variables intended to be required are explicitly set in `required_variables`.
<haystack.core.pipeline.pipeline.Pipeline object at 0x3082614c0>
🚅 Components
- fetcher: LinkContentFetcher
- converter: HTMLToDocument
- splitter: DocumentSplitter
- ranker: TransformersSimilarityRanker
- prompt_builder: ChatPromptBuilder
- llm: HuggingFaceAPIChatGenerator
🛤️ Connections
- fetcher.streams -> converter.sources (List[ByteStream])
- converter.documents -> splitter.documents (List[Document])
- splitter.documents -> ranker.documents (List[Document])
- ranker.documents -> prompt_builder.documents (List[Document])
- prompt_builder.prompt -> llm.messages (List[ChatMessage])
question = "What do graphs have to do with Haystack?"
result = pipeline.run({"prompt_builder": {"question": question},
"ranker": {"query": question},
"fetcher": {"urls": ["https://haystack.deepset.ai/blog/introducing-haystack-2-beta-and-advent"]},
"llm":{}})
print(result['llm']['replies'][0])
ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=' Based on the information provided, graphs, specifically directed acyclic graphs (DAGs), are relevant to the earlier version of Haystack, version 1.x, as the pipeline components were organized in a DAG structure, which meant that the pipeline had to be acyclic and directed, and could not branch out, join, or cycle back to another component. However, with Haystack 2.0, the requirement for the pipeline to be acyclic is being removed, allowing for more complex and flexible pipeline configurations, such as pipelines that can retry, loop back, and potentially cycle back to another component. This change will make the framework better suited to a wider range of use cases and make the code more explicit and self-explanatory.')], _name=None, _meta={'model': 'mistralai/Mixtral-8x7B-Instruct-v0.1', 'finish_reason': 'stop', 'index': 0, 'usage': {'prompt_tokens': 268, 'completion_tokens': 156}})
