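Integration: fastRAG

fastRAG is a research framework for efficient and optimized retrieval-augmented generation pipelines

Authors

Intel Labs

GitHub repository

fastRAG is a research framework for building efficient and optimized retrieval-augmented generation (RAG) pipelines that incorporate state-of-the-art LLMs and information retrieval techniques. It is designed to give researchers and developers a comprehensive toolset for advancing retrieval-augmented generation.

Comments, suggestions, issues and pull requests are welcome! ❤️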

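📣 Updates

2024-05: fastRAG V3 is compatible with Haystack 2.0 🔥
2023-12: Gaudi2 and ONNX Runtime support; optimized embedding models; multi-modality and chat demos; REPLUG text generation.
2023-06: ColBERT index modification: adding/removing documents.
2023-05: RAG with LLM and dynamic prompt synthesis example.
2023-04: Qdrant DocumentStore support.

Key Features

Optimized RAG: Build RAG pipelines with SOTA efficient components for greater compute efficiency.
Optimized for Intel Hardware: Leverage Intel Extension for PyTorch (IPEX), 🤗 Optimum Intel and 🤗 Optimum-Habana to run as optimally as possible on Intel® Xeon® processors and Intel® Gaudi® AI accelerators.
Customizable: fastRAG is built using Haystack and HuggingFace. All of fastRAG's components are 100% Haystack compatible.

🚀 Components

For a brief overview of the various unique components in fastRAG, refer to the Components Overview page.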

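*LLM Backends*
Intel Gaudi Accelerators	Running LLMs on Gaudi 2
ONNX Runtime	Running LLMs with optimized ONNX Runtime
OpenVINO	Running quantized LLMs using OpenVINO
Llama-CPP	Running RAG pipelines with LLMs on a Llama CPP backend
*Optimized Components*
Embedders	Optimized int8 bi-encoders
Rankers	Optimized/sparse cross-encoders
*RAG-efficient Components*
ColBERT	Token-based late interaction
Fusion-in-Decoder (FiD)	Generative multi-document encoder-decoder
REPLUG	Improved multi-document decoder
PLAID	Highly efficient indexing engine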
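
The "token-based late interaction" listed for ColBERT means that a query and a document are compared token by token at query time, instead of each text being compressed into a single vector beforehand. The following is a minimal sketch of the MaxSim scoring rule described in the ColBERT paper, using toy 2-dimensional vectors in plain Python. The function names are hypothetical, the vectors are made up for illustration, and this is not how fastRAG implements ColBERT.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """For each query token, take its best match among the document tokens, then sum."""
    if not doc_tokens:
        return 0.0
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank(query_tokens: Sequence[Vector], docs: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return document ids ordered from highest to lowest MaxSim score."""
    scores = {doc_id: maxsim_score(query_tokens, toks) for doc_id, toks in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy token embeddings (illustrative only; real ones come from a trained encoder).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens
    "doc_b": [[1.0, 0.0], [1.0, 0.0]],  # matches only the first query token
}

print(rank(query, docs))
```

Because every query token looks for its own best match, a document that covers all query tokens scores higher than one that repeats a single matching token.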

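📍 Installation

Prerequisites

Python 3.8 or higher.
PyTorch 2.0 or higher.

To set up the software, clone the project and run the following command, preferably in a newly created virtual environment:

pip install fastrag

There are additional dependencies that you can install depending on your specific usage of fastRAG; they are listed here.

The example below requires extra packages, which can be installed with: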

pip install "fastrag[intel,openvino]"
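
Before running the example, it can help to confirm that the environment meets the requirements above. The following is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import names `torch`, `haystack`, `fastrag` and `openvino` are assumed to correspond to the installed packages.

```python
import sys
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Python 3.8 or higher is listed as a requirement.
python_ok = sys.version_info >= (3, 8)

# Assumed import names for the packages used in the example below.
missing = missing_modules(["torch", "haystack", "fastrag", "openvino"])

print("Python version OK:", python_ok)
print("Missing modules:", missing or "none")
```

This only checks that the modules can be located; it does not verify the PyTorch version or that the OpenVINO runtime works on your hardware.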

Usage

You can import components from fastRAG and use them in a Haystack pipeline:

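from haystack import Document, Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

from fastrag.generators.openvino import OpenVINOGenerator

# Assumption: a small in-memory document store with one illustrative document;
# replace this with your own documents.
store = InMemoryDocumentStore()
store.write_documents([
    Document(content="Sauron is the main antagonist of The Lord of the Rings."),
])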

prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
    {{ doc.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

openvino_compressed_model_path = "path/to/quantized/model"

generator = OpenVINOGenerator(
    model="microsoft/phi-2",
    compressed_model_dir=openvino_compressed_model_path,
    device_openvino="CPU",
    task="text-generation",
    generation_kwargs={
        "max_new_tokens": 100,
    }
)

pipe = Pipeline()

pipe.add_component("retriever", InMemoryBM25Retriever(document_store=store))
pipe.add_component("ranker", TransformersSimilarityRanker())
pipe.add_component("prompt_builder", PromptBuilder(template=prompt_template))
pipe.add_component("llm", generator)

pipe.connect("retriever.documents", "ranker.documents")
pipe.connect("ranker", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

query = "Who is the main villain in Lord of the Rings?"
answer_result = pipe.run({
    "prompt_builder": {
        "query": query
    },
    "retriever": {
        "query": query
    },
    "ranker": {
        "query": query,
        "top_k": 1
    }
})

print(answer_result["llm"]["replies"][0])
#' Sauron\n'
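
To see what text the generator receives, it can be useful to render the prompt template on its own. Haystack's PromptBuilder renders Jinja2 templates, so the sketch below calls `jinja2` directly as an approximation of that step. The `SimpleNamespace` object is a stand-in for a Haystack `Document`, and its content is made up for illustration.

```python
from types import SimpleNamespace

from jinja2 import Template

prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
    {{ doc.content }}
{% endfor %}
Question: {{query}}
Answer:
"""

# Stand-in for the documents that the ranker would pass to the prompt builder.
documents = [
    SimpleNamespace(content="Sauron is the main antagonist of The Lord of the Rings."),
]

prompt = Template(prompt_template).render(
    documents=documents,
    query="Who is the main villain in Lord of the Rings?",
)
print(prompt)
```

The rendered prompt contains each document's content followed by the question, which is why the ranker's `top_k` setting directly controls how much context the LLM sees.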

For more examples, check out the example use cases.

License

The code is licensed under the Apache 2.0 License.

Disclaimer

This is not an official Intel product.