Integration: Titan Takeoff Inference Server

With Titan Takeoff, you can run open-source LLMs locally with Haystack. Titan Takeoff lets you run the latest models from Meta, Mistral, and Alphabet directly on your laptop.

Authors

Fergus Finn

Rod Rivera


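Overview

You can use the Takeoff inference server to deploy local models efficiently with Haystack. Takeoff is a state-of-the-art inference server focused on deploying openly available language models at scale. It runs LLMs on local machines with consumer-grade GPUs as well as on cloud infrastructure.

The TakeoffGenerator component in Haystack is a wrapper around the Takeoff server API. Use it to serve deployed Takeoff models efficiently in Haystack pipelines.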

Installation

pip install takeoff_haystack

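Usage

You can interact with deployed Takeoff models through the TakeoffGenerator component in Haystack. To do so, you must first have a Takeoff model deployed. For details on how to do that, see the Takeoff documentation.

The following example uses Takeoff to deploy the Llama-2-7B-Chat-AWQ model on local port 3000. You can obtain a free license for Takeoff.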

docker run --gpus all -e TAKEOFF_MODEL_NAME=TheBloke/Llama-2-7B-Chat-AWQ \
                      -e TAKEOFF_DEVICE=cuda \
                      -e TAKEOFF_MAX_SEQUENCE_LENGTH=256 \
                      -it \
                      -p 3000:3000 tytn/takeoff-pro:0.11.0-gpu
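
Before wiring the server into a pipeline, it can help to confirm that something is listening on the published port. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical and not part of Takeoff or Haystack, and the host and port are assumptions taken from the `-p 3000:3000` mapping above. An open port only indicates that the container is accepting connections; it does not prove that the model has finished loading.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: the container publishes port 3000 on localhost,
    # matching the docker command above.
    print("Takeoff port reachable:", is_port_open("localhost", 3000))
```

If this prints `False`, check that the container is still running and that the port mapping matches the one you pass to TakeoffGenerator.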

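Examples

Daily News Summary Generation

Below is an example of using a Takeoff model in a Haystack RAG pipeline. It summarizes headlines from popular tech news websites such as TechCrunch, The Verge, and Engadget.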

from typing import Dict, List
from haystack import Document, Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder  
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
import feedparser
from takeoff_haystack import TakeoffGenerator

# Dict of website RSS feeds  
urls = {
  'theverge': 'https://www.theverge.com/rss/frontpage/',
  'techcrunch': 'https://techcrunch.com/feed',
  'mashable': 'https://mashable.com/feeds/rss/all',
  'cnet': 'https://cnet.com/rss/news',
  'engadget': 'https://engadget.com/rss.xml',
  'zdnet': 'https://zdnet.com/news/rss.xml',
  'venturebeat': 'https://feeds.feedburner.com/venturebeat/SZYF',
  'readwrite': 'https://readwrite.com/feed/',    
  'wired': 'https://wired.com/feed/rss',
  'gizmodo': 'https://gizmodo.com/rss',
}

# Configurable parameters
NUM_WEBSITES = 3  
NUM_TITLES = 1

def get_titles(urls: Dict[str, str], num_sites: int, num_titles: int) -> List[str]:
  titles: List[str] = []
  sites = list(urls.keys())[:num_sites]
  
  for site in sites:
    feed = feedparser.parse(urls[site])  
    entries = feed.entries[:num_titles]
    
    for entry in entries:
      titles.append(entry.title)
      
  return titles
  
titles = get_titles(urls, NUM_WEBSITES, NUM_TITLES)

document_store = InMemoryDocumentStore()
document_store.write_documents([Document(content=title) for title in titles])

template = """
HEADLINES:  
{% for document in documents %}
  {{ document.content }}  
{% endfor %}
REQUEST: {{ query }}
"""

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
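# Assumes the Takeoff server started by the docker command above is reachable on localhost:3000
pipe.add_component("llm", TakeoffGenerator(base_url="http://localhost", port="3000"))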
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

query = f"Summarize each of the {NUM_WEBSITES * NUM_TITLES} provided headlines in three words."
response = pipe.run({"prompt_builder": {"query": query}, "retriever": {"query": query}})
print(response["llm"]["replies"])

You should see a response similar to the following:

['\n\n\nANSWER:\n\n1. Poker Roguelike - Exciting gameplay\n2. AI-powered news reader - Personalized feed\n3. Best laptops MWC 2024 - Powerful devices']
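
The reply arrives as a list containing a single string, so a little post-processing is usually needed before the summaries can be used elsewhere. The sketch below assumes the model answers with a numbered list, as in the sample above. That format is not guaranteed, and `split_numbered_reply` is a hypothetical helper, not part of Haystack or Takeoff.

```python
import re
from typing import List


def split_numbered_reply(reply: str) -> List[str]:
    """Extract the text of each 'N. ...' line from a model reply."""
    items: List[str] = []
    for line in reply.splitlines():
        # Match lines such as "1. Some summary", ignoring surrounding whitespace.
        match = re.match(r"\s*\d+\.\s+(.*\S)", line)
        if match:
            items.append(match.group(1))
    return items


# Sample reply copied from the example output above.
replies = ['\n\n\nANSWER:\n\n1. Poker Roguelike - Exciting gameplay\n2. AI-powered news reader - Personalized feed\n3. Best laptops MWC 2024 - Powerful devices']

for summary in split_numbered_reply(replies[0]):
    print(summary)
```

Lines that do not start with a number and a period, such as the `ANSWER:` header, are skipped. If the model replies in free-form prose, the helper returns an empty list.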
