
Hacker News Summaries with Custom Components


By Tuana Celik (Twitter, LinkedIn)

📚 Read the article Customizing RAG Pipelines to Summarize Latest Hacker News Posts with Haystack for a detailed walkthrough of this example.

Install dependencies

!pip install newspaper3k
!pip install haystack-ai

Create a Custom Haystack Component

HackernewsNewestFetcher fetches the last_k newest posts on Hacker News and returns their contents as a list of Haystack Document objects.

from typing import List
from haystack import component, Document
from newspaper import Article
import requests

@component
class HackernewsNewestFetcher():

  @component.output_types(articles=List[Document])
  def run(self, last_k: int):
    newest_list = requests.get(url='https://hacker-news.firebaseio.com/v0/newstories.json?print=pretty')
    articles = []
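    for item_id in newest_list.json()[0:last_k]:
      # Fetch each item once and reuse the parsed JSON.
      item = requests.get(url=f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json?print=pretty").json()
      # Skip text-only posts (no 'url' field) and empty responses.
      if item and 'url' in item:
        articles.append(item['url'])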

    docs = []
    for url in articles:
      try:
        article = Article(url)
        article.download()
        article.parse()
        docs.append(Document(content=article.text, meta={'title': article.title, 'url': url}))
      except Exception:
        print(f"Couldn't download {url}, skipped")
    return {'articles': docs}
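
The run method above keeps only posts that have a url field, since text-only posts have nothing for newspaper3k to download. The following is a minimal offline sketch of that filtering step. The helper name extract_urls and the sample payloads are hypothetical and only mimic the shape of Hacker News item responses; the None entry assumes the API may return an empty (null) response for an unavailable item.

```python
from typing import List, Optional


def extract_urls(items: List[Optional[dict]]) -> List[str]:
    """Return the 'url' of every item that has one, preserving order."""
    urls = []
    for item in items:
        # Text-only posts such as "Ask HN" carry no 'url' field.
        if item and "url" in item:
            urls.append(item["url"])
    return urls


# Hypothetical payloads shaped like Hacker News item responses.
sample_items = [
    {"id": 1, "type": "story", "title": "A link post", "url": "https://example.com/a"},
    {"id": 2, "type": "story", "title": "Ask HN: a text post", "text": "No link here"},
    None,  # assumed: an item the API could not return
    {"id": 3, "type": "story", "title": "Another link post", "url": "https://example.com/b"},
]

urls = extract_urls(sample_items)
print(urls)
```

One consequence of this filtering is that the component may return fewer than last_k documents when some of the newest posts are text-only or fail to download.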

Create a Haystack 2.0 RAG Pipeline

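This pipeline uses the components available in the Haystack 2.0 preview at the time of writing (22 September 2023), together with the custom component we created above.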

The end result is a RAG pipeline that produces a list of summaries for the last_k newest posts on Hacker News, each followed by its source URL.

from getpass import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass("OpenAI Key: ")

from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import OpenAIGenerator

prompt_template = """
You will be provided a few of the latest posts in HackerNews, followed by their URL.
For each post, provide a brief summary followed by the URL the full post can be found in.

Posts:
{% for article in articles %}
  {{article.content}}
  URL: {{article.meta['url']}}
{% endfor %}
"""

prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(model="gpt-4")
fetcher = HackernewsNewestFetcher()

pipe = Pipeline()
pipe.add_component("hackernews_fetcher", fetcher)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("hackernews_fetcher.articles", "prompt_builder.articles")
pipe.connect("prompt_builder.prompt", "llm.prompt")
result = pipe.run(data={"hackernews_fetcher": {"last_k": 3}})
print(result['llm']['replies'][0])
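
Before sending anything to the LLM, it can help to preview the prompt it will receive. The sketch below renders the same template directly with Jinja2, the templating syntax the prompt is written in. FakeDocument and the sample articles are hypothetical stand-ins for the Haystack Document objects the fetcher returns; they expose only the content and meta attributes the template reads.

```python
from dataclasses import dataclass, field

from jinja2 import Template

prompt_template = """
You will be provided a few of the latest posts in HackerNews, followed by their URL.
For each post, provide a brief summary followed by the URL the full post can be found in.

Posts:
{% for article in articles %}
  {{article.content}}
  URL: {{article.meta['url']}}
{% endfor %}
"""


@dataclass
class FakeDocument:
    """Hypothetical stand-in exposing only the attributes the template reads."""
    content: str
    meta: dict = field(default_factory=dict)


# Made-up sample data; real runs would use the fetcher's output instead.
articles = [
    FakeDocument("First post text.", {"url": "https://example.com/a"}),
    FakeDocument("Second post text.", {"url": "https://example.com/b"}),
]

# Render the loop over articles exactly as the template defines it.
rendered = Template(prompt_template).render(articles=articles)
print(rendered)
```

Because each article's full text is inserted into the prompt, a larger last_k or very long articles will produce a correspondingly longer prompt, which is worth checking against the model's context limit.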