# 变更日志

URL: https://developers.cloudflare.com/workers-ai/changelog/

import { ProductReleaseNotes } from "~/components";

{/* */}

---

# 代理

URL: https://developers.cloudflare.com/workers-ai/agents/

import { LinkButton } from "~/components";

使用 Cloudflare Workers AI 和代理构建能够代表您的用户执行复杂任务的 AI 助手。

转到代理文档
---

# Cloudflare Workers AI

URL: https://developers.cloudflare.com/workers-ai/

import { CardGrid, Description, Feature, LinkTitleCard, Plan, RelatedProduct, Render, LinkButton, Flex, } from "~/components";

在 Cloudflare 的全球网络上,借助无服务器 GPU 运行机器学习模型。

Workers AI 允许您以无服务器的方式运行 AI 模型,无需担心扩展、维护基础设施,也无需为闲置的基础设施付费。您可以从自己的代码中(无论是 [Workers](/workers/)、[Pages](/pages/),还是通过 [Cloudflare API](/api/resources/ai/methods/run/))调用运行在 Cloudflare 网络 GPU 上的模型。

Workers AI 让您可以访问:

- **50 多种[开源模型](/workers-ai/models/)**,作为我们模型目录的一部分提供
- 无服务器、**按使用付费**的[定价模型](/workers-ai/platform/pricing/)
- 所有这些都作为**功能齐全的开发者平台**的一部分,包括 [AI 网关](/ai-gateway/)、[Vectorize](/vectorize/)、[Workers](/workers/) 等等
开始使用

观看 Workers AI 演示
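下面是一个最小示意,展示在 Worker 代码中通过 AI 绑定调用模型的大致形态。示例假设您已按照下文"Workers 绑定"一节在 Wrangler 配置中将绑定命名为 `AI`;所用模型与提示词仅作演示,并非唯一选择。

```ts
// 最小示意:在 Worker 中通过 AI 绑定运行一个 Workers AI 模型。
// 前提(假设):Wrangler 配置中已添加 [ai] binding = "AI"。
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // 可替换为模型目录中的任意模型,这里以 Llama 3.1 8B Instruct 为例
    const answer = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      prompt: "用一句话介绍 Workers AI。",
    });
    return Response.json(answer);
  },
} satisfies ExportedHandler<Env>;
```

创建项目、配置绑定与部署的完整步骤,请参阅下文的"开始使用"与"Workers 绑定"章节。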
--- ## 功能 Workers AI 配备了一系列精选的流行开源模型,使您能够执行图像分类、文本生成、对象检测等任务。 --- ## 相关产品 通过缓存、速率限制、请求重试、模型回退等功能,观察和控制您的 AI 应用程序。 使用 Cloudflare 的矢量数据库 Vectorize 构建全栈 AI 应用程序。添加 Vectorize 使您能够执行语义搜索、推荐、异常检测等任务,或用于为 LLM 提供上下文和记忆。 构建无服务器应用程序并立即在全球范围内部署,以获得卓越的性能、可靠性和规模。 创建立即部署到 Cloudflare 全球网络的全栈应用程序。 存储大量非结构化数据,而无需支付与典型云存储服务相关的昂贵出口带宽费用。 创建新的无服务器 SQL 数据库,以便从您的 Workers 和 Pages 项目中查询。 具有强一致性存储的全球分布式协调 API。 创建全球性、低延迟的键值数据存储。 --- ## 更多资源 构建和部署您的第一个 Workers AI 应用程序。 了解免费和付费计划。 了解 Workers AI 的限制。 了解如何构建和部署雄心勃勃的 AI 应用程序到 Cloudflare 的全球网络。 了解哪种存储选项最适合您的项目。 在 Discord 上与 Workers 社区联系,提出问题,分享您正在构建的内容,并与其他开发者讨论平台。 在 Twitter 上关注 @CloudflareDev,了解产品公告和 Cloudflare Workers 的新功能。 --- # Vercel AI SDK URL: https://developers.cloudflare.com/workers-ai/configuration/ai-sdk/ import { PackageManagers } from "~/components"; Workers AI 可用于 JavaScript 和 TypeScript 代码库的 [Vercel AI SDK](https://sdk.vercel.ai/)。 ## 设置 安装 [`workers-ai-provider` 提供程序](https://sdk.vercel.ai/providers/community-providers/cloudflare-workers-ai): 然后,在您的 Workers 项目 Wrangler 文件中添加一个 AI 绑定: ```toml [ai] binding = "AI" ``` ## 模型 AI SDK 可以配置为与[任何 AI 模型](/workers-ai/models/)一起使用。 ```js import { createWorkersAI } from "workers-ai-provider"; const workersai = createWorkersAI({ binding: env.AI }); // 选择任何模型:https://developers.cloudflare.com/workers-ai/models/ const model = workersai("@cf/meta/llama-3.1-8b-instruct", {}); ``` ## 生成文本 选择模型后,您可以从给定的提示生成文本。 ```js import { createWorkersAI } from 'workers-ai-provider'; import { generateText } from 'ai'; type Env = { AI: Ai; }; export default { async fetch(_: Request, env: Env) { const workersai = createWorkersAI({ binding: env.AI }); const result = await generateText({ model: workersai('@cf/meta/llama-2-7b-chat-int8'), prompt: '写一篇关于 hello world 的 50 字短文。', }); return new Response(result.text); }, }; ``` ## 流式文本 对于较长的响应,请考虑在生成完成时流式传输响应。 ```js import { createWorkersAI } from 'workers-ai-provider'; import { streamText } from 'ai'; type Env = { AI: Ai; }; export default { async fetch(_: Request, env: Env) { const workersai = createWorkersAI({ binding: env.AI }); const result = streamText({ model: workersai('@cf/meta/llama-2-7b-chat-int8'), prompt: '写一篇关于 hello world 的 50 字短文。', }); return result.toTextStreamResponse({ headers: { // 添加这些标头以确保 // 响应是分块和流式的 'Content-Type': 'text/x-unknown', 'content-encoding': 'identity', 'transfer-encoding': 'chunked', }, }); }, }; ``` ## 生成结构化对象 您可以提供一个 Zod 模式来生成结构化的 JSON 响应。 ```js import { createWorkersAI } from 'workers-ai-provider'; import { generateObject } from 'ai'; import { z } from 'zod'; type Env = { AI: Ai; }; export default { async fetch(_: Request, env: Env) { const workersai = createWorkersAI({ binding: env.AI }); const result = await generateObject({ model: workersai('@cf/meta/llama-3.1-8b-instruct'), prompt: '生成一份千层面食谱', schema: z.object({ recipe: z.object({ ingredients: z.array(z.string()), description: z.string(), }), }), }); return Response.json(result.object); }, }; ``` --- # Workers 绑定 URL: https://developers.cloudflare.com/workers-ai/configuration/bindings/ import { Type, MetaInfo, WranglerConfig } from "~/components"; ## Workers [Workers](/workers/) 提供了一个无服务器执行环境,允许您创建新应用程序或增强现有应用程序。 要将 Workers AI 与 Workers 一起使用,您必须创建一个 Workers AI [绑定](/workers/runtime-apis/bindings/)。绑定允许您的 Worker 与 Cloudflare 开发者平台上的资源(如 Workers AI)进行交互。您可以在 Cloudflare 仪表板上或通过更新您的 [Wrangler 文件](/workers/wrangler/configuration/)来创建绑定。 要将 Workers AI 绑定到您的 Worker,请将以下内容添加到您的 Wrangler 文件的末尾: ```toml [ai] binding = "AI" # 即在您的 Worker 中通过 env.AI 可用 ``` ## Pages 函数 [Pages 函数](/pages/functions/)允许您通过在 Cloudflare 
网络上执行代码来构建具有 Cloudflare Pages 的全栈应用程序。函数本质上是 Workers。 要在您的 Pages 函数中配置 Workers AI 绑定,您必须使用 Cloudflare 仪表板。有关说明,请参阅 [Workers AI 绑定](/pages/functions/bindings/#workers-ai)。 ## 方法 ### async env.AI.run() `async env.AI.run()` 运行一个模型。第一个参数是模型,第二个参数是一个对象。 ```javascript const answer = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", { prompt: "What is the origin of the phrase 'Hello, World'", }); ``` **参数** - `model` - 要运行的模型。 **支持的选项** - `stream` - 在结果可用时返回结果流。 ```javascript const answer = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", { prompt: "What is the origin of the phrase 'Hello, World'", stream: true, }); return new Response(answer, { headers: { "content-type": "text/event-stream" }, }); ``` --- # Hugging Face 聊天界面 URL: https://developers.cloudflare.com/workers-ai/configuration/hugging-face-chat-ui/ 将 Workers AI 与 Hugging Face 提供的开源聊天界面 [Chat UI](https://github.com/huggingface/chat-ui?tab=readme-ov-file#text-embedding-models) 一起使用。 ## 先决条件 您将需要以下内容: - 一个 [Cloudflare 帐户](https://dash.cloudflare.com) - 您的[帐户 ID](/fundamentals/account/find-account-and-zone-ids/) - 一个用于 Workers AI 的 [API 令牌](/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id) ## 设置 首先,决定如何引用您的帐户 ID 和 API 令牌(直接在您的 `.env.local` 中使用 `CLOUDFLARE_ACCOUNT_ID` 和 `CLOUDFLARE_API_TOKEN` 变量,或在端点配置中)。 然后,按照 [Chat UI GitHub 仓库](https://github.com/huggingface/chat-ui?tab=readme-ov-file#text-embedding-models)中的其余设置说明进行操作。 在设置模型时,请指定 `cloudflare` 端点。 ```json { "name": "nousresearch/hermes-2-pro-mistral-7b", "tokenizer": "nousresearch/hermes-2-pro-mistral-7b", "parameters": { "stop": ["<|im_end|>"] }, "endpoints": [ { "type": "cloudflare", // 如果未包含在 .env.local 中,则可选择指定这些 "accountId": "your-account-id", "apiToken": "your-api-token" // } ] } ``` ## 支持的模型 此模板适用于任何以 `@hf` 参数开头的[文本生成模型](/workers-ai/models/)。 --- # 配置 URL: https://developers.cloudflare.com/workers-ai/configuration/ import { DirectoryListing } from "~/components"; --- # OpenAI 兼容 API 端点 URL: https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/ import { Render } from "~/components";
## 用法 ### Workers AI 通常,Workers AI 要求您在 cURL 端点或 `env.AI.run` 函数中指定模型名称。 使用 OpenAI 兼容端点,您可以利用 [openai-node sdk](https://github.com/openai/openai-node) 来调用 Workers AI。这允许您通过简单地更改基本 URL 和模型名称来使用 Workers AI。 ```js title="OpenAI SDK 示例" import OpenAI from "openai"; const openai = new OpenAI({ apiKey: env.CLOUDFLARE_API_KEY, baseURL: `https://api.cloudflare.com/client/v4/accounts/${env.CLOUDFLARE_ACCOUNT_ID}/ai/v1`, }); const chatCompletion = await openai.chat.completions.create({ messages: [{ role: "user", content: "发出一些机器人噪音" }], model: "@cf/meta/llama-3.1-8b-instruct", }); const embeddings = await openai.embeddings.create({ model: "@cf/baai/bge-large-en-v1.5", input: "我喜欢抹茶", }); ``` ```bash title="cURL 示例" curl --request POST \ --url https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1/chat/completions \ --header "Authorization: Bearer {api_token}" \ --header "Content-Type: application/json" \ --data ' { "model": "@cf/meta/llama-3.1-8b-instruct", "messages": [ { "role": "user", "content": "如何用三个简短的步骤制作一个木勺?请给出尽可能简短的回答" } ] } ' ``` ### AI 网关 这些端点也与 [AI 网关](/ai-gateway/providers/workersai/#openai-compatible-endpoints)兼容。 --- # 仪表板 URL: https://developers.cloudflare.com/workers-ai/get-started/dashboard/ import { Render } from "~/components"; 请按照本指南使用 Cloudflare 仪表板创建 Workers AI 应用程序。 ## 先决条件 如果您还没有 [Cloudflare 帐户](https://dash.cloudflare.com/sign-up/workers-and-pages),请注册一个。 ## 设置 要创建 Workers AI 应用程序: 1. 登录 [Cloudflare 仪表板](https://dash.cloudflare.com)并选择您的帐户。 2. 转到 **计算 (Workers)** 和 **Workers & Pages**。 3. 选择**创建**。 4. 在 **从模板开始**下,选择 **LLM 应用**。选择模板后,将在仪表板中为您创建一个[AI 绑定](/workers-ai/configuration/bindings/)。 5. 查看提供的代码并选择**部署**。 6. 在其提供的 [`workers.dev`](/workers/configuration/routing/workers-dev/) 子域上预览您的 Worker。 ## 开发 --- # 开始使用 URL: https://developers.cloudflare.com/workers-ai/get-started/ import { DirectoryListing } from "~/components"; 在 Cloudflare 上构建您的 Workers AI 项目有多种选择。要开始,请选择您喜欢的方法: :::note 这些示例旨在创建新的 Workers AI 项目。有关将 Workers AI 添加到现有 Worker 的帮助,请参阅 [Workers 绑定](/workers-ai/configuration/bindings/)。 ::: --- # REST API URL: https://developers.cloudflare.com/workers-ai/get-started/rest-api/ 本指南将指导您设置和部署您的第一个 Workers AI 项目。您将使用 Workers AI REST API 来体验大型语言模型 (LLM)。 ## 先决条件 如果您还没有 [Cloudflare 帐户](https://dash.cloudflare.com/sign-up/workers-and-pages),请注册一个。 ## 1. 获取 API 令牌和账户 ID 您需要您的 API 令牌和账户 ID 才能使用 REST API。 要获取这些值: 1. 登录 [Cloudflare 仪表板](https://dash.cloudflare.com)并选择您的帐户。 2. 转到 **AI** > **Workers AI**。 3. 选择**使用 REST API**。 4. 获取您的 API 令牌: 1. 选择**创建 Workers AI API 令牌**。 2. 查看预填信息。 3. 选择**创建 API 令牌**。 4. 选择**复制 API 令牌**。 5. 保存该值以备将来使用。 5. 对于**获取账户 ID**,复制**账户 ID** 的值。保存该值以备将来使用。 :::note 如果您选择[创建 API 令牌](/fundamentals/api/get-started/create-token/)而不是使用模板,该令牌将需要 `Workers AI - 读取` 和 `Workers AI - 编辑` 的权限。 ::: ## 2. 通过 API 运行模型 创建 API 令牌后,在请求中使用您的 API 令牌进行身份验证并向 API 发出请求。 您将使用[执行 AI 模型](/api/resources/ai/methods/run/)端点来运行 [`@cf/meta/llama-3.1-8b-instruct`](/workers-ai/models/llama-3.1-8b-instruct/) 模型: ```bash curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/meta/llama-3.1-8b-instruct \ -H 'Authorization: Bearer {API_TOKEN}' \ -d '{ "prompt": "Where did the phrase Hello World come from" }' ``` 替换 `{ACCOUNT_ID}` 和 `{API_token}` 的值。 API 响应将如下所示: ```json { "result": { "response": "Hello, World first appeared in 1974 at Bell Labs when Brian Kernighan included it in the C programming language example. It became widely used as a basic test program due to simplicity and clarity. It represents an inviting greeting from a program to the world." 
}, "success": true, "errors": [], "messages": [] } ``` 此示例执行使用 `@cf/meta/llama-3.1-8b-instruct` 模型,但您可以使用 [Workers AI 模型目录](/workers-ai/models/)中的任何模型。如果使用其他模型,您需要将 `{model}` 替换为您想要的模型名称。 完成本指南后,您已创建了一个 Cloudflare 帐户(如果您还没有),并创建了一个授予您帐户 Workers AI 读取权限的 API 令牌。您使用终端的 cURL 命令执行了 [`@cf/meta/llama-3.1-8b-instruct`](/workers-ai/models/llama-3.1-8b-instruct/) 模型,并在 JSON 响应中收到了对您提示的回答。 ## 相关资源 - [模型](/workers-ai/models/) - 浏览 Workers AI 模型目录。 - [AI SDK](/workers-ai/configuration/ai-sdk) - 了解如何与 AI 模型集成。 --- # Workers 绑定 URL: https://developers.cloudflare.com/workers-ai/get-started/workers-wrangler/ import { Render, PackageManagers, WranglerConfig, TypeScriptExample, } from "~/components"; 本指南将指导您设置和部署您的第一个 Workers AI 项目。您将使用 [Workers](/workers/)、一个 Workers AI 绑定和一个大型语言模型 (LLM) 来在 Cloudflare 全球网络上部署您的第一个由 AI 驱动的应用程序。 ## 1. 创建一个 Worker 项目 您将使用 `create-cloudflare` CLI (C3) 创建一个新的 Worker 项目。[C3](https://github.com/cloudflare/workers-sdk/tree/main/packages/create-cloudflare) 是一个命令行工具,旨在帮助您设置和部署新的应用程序到 Cloudflare。 通过运行以下命令创建一个名为 `hello-ai` 的新项目: 运行 `npm create cloudflare@latest` 将提示您安装 [`create-cloudflare` 包](https://www.npmjs.com/package/create-cloudflare),并引导您完成设置。C3 还将安装 [Wrangler](/workers/wrangler/),即 Cloudflare 开发者平台 CLI。 这将创建一个新的 `hello-ai` 目录。您的新 `hello-ai` 目录将包括: - 一个位于 `src/index.ts` 的 `"Hello World"` [Worker](/workers/get-started/guide/#3-write-code)。 - 一个 [`wrangler.jsonc`](/workers/wrangler/configuration/) 配置文件。 进入您的应用程序目录: ```sh cd hello-ai ``` ## 2. 将您的 Worker 连接到 Workers AI 您必须为您的 Worker 创建一个 AI 绑定以连接到 Workers AI。[绑定](/workers/runtime-apis/bindings/)允许您的 Worker 与 Cloudflare 开发者平台上的资源(如 Workers AI)进行交互。 要将 Workers AI 绑定到您的 Worker,请将以下内容添加到您的 Wrangler 文件的末尾: ```toml [ai] binding = "AI" ``` 您的绑定在您的 Worker 代码中通过 [`env.AI`](/workers/runtime-apis/handlers/fetch/) 可用。 {/* */} 您还可以将 Workers AI 绑定到 Pages 函数。有关更多信息,请参阅[函数绑定](/pages/functions/bindings/#workers-ai)。 ## 3. 在您的 Worker 中运行推理任务 您现在已准备好在您的 Worker 中运行推理任务。在这种情况下,您将使用一个 LLM,[`llama-3.1-8b-instruct`](/workers-ai/models/llama-3.1-8b-instruct/),来回答一个问题。 使用以下代码更新您的 `hello-ai` 应用程序目录中的 `index.ts` 文件: ```ts export interface Env { // 如果您在 Wrangler 配置文件中为 'binding' 设置了另一个名称, // 请将 "AI" 替换为您定义的变量名。 AI: Ai; } export default { async fetch(request, env): Promise { const response = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", { prompt: "What is the origin of the phrase Hello, World", }); return new Response(JSON.stringify(response)); }, } satisfies ExportedHandler; ``` 至此,您已经为您的 Worker 创建了一个 AI 绑定,并配置了您的 Worker 以能够执行 Llama 3.1 模型。现在,您可以在全球部署之前在本地测试您的项目。 ## 4. 使用 Wrangler 进行本地开发 在您的项目目录中,通过运行 [`wrangler dev`](/workers/wrangler/commands/#dev) 在本地测试 Workers AI: ```sh npx wrangler dev ``` 运行 `wrangler dev` 后,系统会提示您登录。当您运行 `npx wrangler dev` 时,Wrangler 会给您一个 URL(很可能是 `localhost:8787`)来审查您的 Worker。在您访问 Wrangler 提供的 URL 后,将呈现一条类似以下示例的消息: ```json { "response": "Ah, a most excellent question, my dear human friend! *adjusts glasses*\n\nThe origin of the phrase \"Hello, World\" is a fascinating tale that spans several decades and multiple disciplines. It all began in the early days of computer programming, when a young man named Brian Kernighan was tasked with writing a simple program to demonstrate the basics of a new programming language called C.\nKernighan, a renowned computer scientist and author, was working at Bell Labs in the late 1970s when he created the program. 
He wanted to showcase the language's simplicity and versatility, so he wrote a basic \"Hello, World!\" program that printed the familiar greeting to the console.\nThe program was included in Kernighan and Ritchie's influential book \"The C Programming Language,\" published in 1978. The book became a standard reference for C programmers, and the \"Hello, World!\" program became a sort of \"Hello, World!\" for the programming community.\nOver time, the phrase \"Hello, World!\" became a shorthand for any simple program that demonstrated the basics" } ``` ## 5. 部署您的 AI Worker 在将您的 AI Worker 全球部署之前,请通过运行以下命令使用您的 Cloudflare 帐户登录: ```sh npx wrangler login ``` 您将被引导到一个网页,要求您登录 Cloudflare 仪表板。登录后,系统会询问您是否允许 Wrangler 对您的 Cloudflare 帐户进行更改。向下滚动并选择 **允许** 以继续。 最后,部署您的 Worker,使您的项目可以在互联网上访问。要部署您的 Worker,请运行: ```sh npx wrangler deploy ``` ```sh output https://hello-ai..workers.dev ``` 您的 Worker 将被部署到您的自定义 [`workers.dev`](/workers/configuration/routing/workers-dev/) 子域。您现在可以访问该 URL 来运行您的 AI Worker。 完成本教程后,您创建了一个 Worker,通过 AI 绑定将其连接到 Workers AI,并从 Llama 3 模型运行了一个推理任务。 ## 相关资源 - [Discord 上的 Cloudflare 开发者社区](https://discord.cloudflare.com) - 通过加入 Cloudflare Discord 服务器,直接向 Cloudflare 团队提交功能请求、报告错误并分享您的反馈。 - [模型](/workers-ai/models/) - 浏览 Workers AI 模型目录。 - [AI SDK](/workers-ai/configuration/ai-sdk) - 了解如何与 AI 模型集成。 --- # 演示和架构 URL: https://developers.cloudflare.com/workers-ai/guides/demos-architectures/ import { ExternalResources, GlossaryTooltip, ResourcesBySelector, } from "~/components"; Workers AI 可用于构建动态和高性能的服务。以下演示应用程序和参考架构展示了如何在您的架构中最佳地使用 Workers AI。 ## 演示 探索以下 Workers AI 的演示应用程序 ## 参考架构 探索以下使用 Workers AI 的参考架构 --- # 指南 URL: https://developers.cloudflare.com/workers-ai/guides/ import { DirectoryListing } from "~/components"; --- # 模型 URL: https://developers.cloudflare.com/workers-ai/models/ import ModelCatalog from "~/pages/workers-ai/models/index.astro"; --- # 功能 URL: https://developers.cloudflare.com/workers-ai/features/ import { DirectoryListing } from "~/components"; --- # JSON 模式 URL: https://developers.cloudflare.com/workers-ai/features/json-mode/ import { Code } from "~/components"; export const jsonModeSchema = `{ response_format: { title: "JSON 模式", type: "object", properties: { type: { type: "string", enum: ["json_object", "json_schema"], }, json_schema: {}, } } }`; export const jsonModeRequestExample = `{ "messages": [ { "role": "system", "content": "提取有关国家的数据。" }, { "role": "user", "content": "告诉我关于印度的信息。" } ], "response_format": { "type": "json_schema", "json_schema": { "type": "object", "properties": { "name": { "type": "string" }, "capital": { "type": "string" }, "languages": { "type": "array", "items": { "type": "string" } } }, "required": [ "name", "capital", "languages" ] } } }`; export const jsonModeResponseExample = `{ "response": { "name": "印度", "capital": "新德里", "languages": [ "印地语", "英语", "孟加拉语", "泰卢固语", "马拉地语", "泰米尔语", "古吉拉特语", "乌尔都语", "卡纳达语", "奥里亚语", "马拉雅拉姆语", "旁遮普语", "梵语" ] } }`; 当我们希望文本生成 AI 模型以编程方式与数据库、服务和外部系统交互时,通常在使用工具调用或构建 AI 代理时,我们必须使用结构化的响应格式而不是自然语言。 Workers AI 支持 JSON 模式,使应用程序能够在与 AI 模型交互时请求结构化的输出响应。 ## 架构 JSON 模式与 OpenAI 的实现兼容;要启用,请使用以下约定将 `response_format` 属性添加到请求对象中: 其中 `json_schema` 必须是有效的 [JSON 模式](https://json-schema.org/) 声明。 ## JSON 模式示例 使用 JSON 格式时,请将架构作为请求的一部分传递给 LLM,如下例所示。 LLM 将遵循该架构,并返回如下所示的响应: 如您所见,模型正在遵守请求中的 JSON 架构定义,并以经过验证的 JSON 对象进行响应。 ## 支持的模型 以下是现在支持 JSON 模式的模型列表: - [@cf/meta/llama-3.1-8b-instruct-fast](/workers-ai/models/llama-3.1-8b-instruct-fast/) - [@cf/meta/llama-3.1-70b-instruct](/workers-ai/models/llama-3.1-70b-instruct/) - 
[@cf/meta/llama-3.3-70b-instruct-fp8-fast](/workers-ai/models/llama-3.3-70b-instruct-fp8-fast/) - [@cf/meta/llama-3-8b-instruct](/workers-ai/models/llama-3-8b-instruct/) - [@cf/meta/llama-3.1-8b-instruct](/workers-ai/models/llama-3.1-8b-instruct/) - [@cf/meta/llama-3.2-11b-vision-instruct](/workers-ai/models/llama-3.2-11b-vision-instruct/) - [@hf/nousresearch/hermes-2-pro-mistral-7b](/workers-ai/models/hermes-2-pro-mistral-7b/) - [@hf/thebloke/deepseek-coder-6.7b-instruct-awq](/workers-ai/models/deepseek-coder-6.7b-instruct-awq/) - [@cf/deepseek-ai/deepseek-r1-distill-qwen-32b](/workers-ai/models/deepseek-r1-distill-qwen-32b/) 我们将继续扩展此列表,以跟上新的和被请求的模型。 请注意,Workers AI 不能保证模型会根据请求的 JSON 模式进行响应。根据任务的复杂性和 JSON 模式的充分性,模型在极端情况下可能无法满足请求。如果出现这种情况,则会返回错误 `JSON 模式无法满足`,并且必须进行处理。 JSON 模式目前不支持流式传输。 --- # 提示 URL: https://developers.cloudflare.com/workers-ai/features/prompting/ import { Code } from "~/components"; export const scopedExampleOne = `{ messages: [ { role: "system", content: "你是一个非常有趣的喜剧演员,你喜欢表情符号" }, { role: "user", content: "给我讲个关于 Cloudflare 的笑话" }, ], };`; export const scopedExampleTwo = `{ messages: [ { role: "system", content: "你是一个专业的计算机科学助理" }, { role: "user", content: "WASM 是什么?" }, { role: "assistant", content: "WASM (WebAssembly) 是一种二进制指令格式,旨在成为一个平台无关的格式" }, { role: "user", content: "Python 能编译成 WASM 吗?" }, { role: "assistant", content: "不,Python 不能直接编译成 WebAssembly" }, { role: "user", content: "Rust 呢?" }, ], };`; export const unscopedExampleOne = `{ prompt: "给我讲个关于 Cloudflare 的笑话"; }`; export const unscopedExampleTwo = `{ prompt: "[INST]喜剧演员[/INST]\n[INST]给我讲个关于 Cloudflare 的笑话[/INST]", raw: true };`; 从文本生成模型获得良好结果的一部分是正确地提出问题。LLM 通常使用特定的预定义模板进行训练,然后在进行推理任务时,应将这些模板与模型的标记器一起使用,以获得更好的结果。 使用 Workers AI 提示文本生成模型有两种方法: :::note[重要] 我们建议对 LoRA 的推理使用无范围提示。 ::: ### 有范围的提示 这是**推荐**的方法。通过有范围的提示,Workers AI 承担了了解和使用不同模型不同聊天模板的负担,并在构建提示和创建文本生成任务时为开发人员提供统一的界面。 有范围的提示是一系列消息。每条消息定义了两个键:角色和内容。 通常,角色可以是以下三个选项之一: - system - 系统消息定义了 AI 的个性。您可以使用它们来设置规则以及您期望 AI 的行为方式。 - user - 用户消息是您通过提供问题或对话来实际查询 AI 的地方。 - assistant - 助手消息向 AI 暗示所需的输出格式。并非所有模型都支持此角色。 OpenAI 对他们如何在其 GPT 模型中使用这些角色有[很好的解释](https://platform.openai.com/docs/guides/text-generation#messages-and-roles)。尽管聊天模板是灵活的,但其他文本生成模型倾向于遵循相同的约定。 以下是使用系统和用户角色的有范围提示的输入示例: 以下是在用户和助手之间进行多次迭代的聊天会话的更好示例。 请注意,不同的 LLM 使用不同的模板针对不同的用例进行训练。虽然 Workers AI 尽力通过统一的 API 向开发人员抽象每个 LLM 模板的细节,但您应始终参考模型文档以获取详细信息(我们在上表中提供了链接)。例如,像 Codellama 这样的指令模型经过微调以响应用户提供的指令,而聊天模型则期望以对话片段作为输入。 ### 无范围的提示 您可以使用无范围的提示向模型发送单个问题,而无需担心提供任何上下文。Workers AI 会自动将您的 `prompt` 输入转换为合理的默认有范围提示,以便您获得最佳的预测结果。 您还可以使用无范围的提示来手动构建模型聊天模板。在这种情况下,您可以使用 raw 参数。以下是 [Mistral](https://docs.mistral.ai/models/#chat-template) 聊天模板提示的输入示例: --- # Markdown 转换 URL: https://developers.cloudflare.com/workers-ai/features/markdown-conversion/ import { Code, Type, MetaInfo, Details, Render } from "~/components"; [Markdown](https://en.wikipedia.org/wiki/Markdown) 对于训练和推理中的文本生成和大型语言模型 (LLM)至关重要,因为它可以提供结构化、语义化、人类和机器可读的输入。同样,Markdown 有助于对输入数据进行分块和结构化,以便在 RAG 的上下文中更好地检索和综合,其简单性和易于解析和呈现的特点使其成为 AI 代理的理想选择。 由于这些原因,文档转换在设计和开发 AI 应用程序时扮演着重要角色。Workers AI 提供了 `toMarkdown` 实用方法,开发人员可以从 [`env.AI`](/workers-ai/configuration/bindings/) 绑定或 REST API 中使用该方法,以便快速、轻松、方便地将多种格式的文档转换为 Markdown 语言并进行摘要。 ## 方法和定义 ### async env.AI.toMarkdown() 获取不同格式的文档列表并将其转换为 Markdown。 #### 参数 - documents: - `toMarkdownDocument` 的数组。 #### 返回值 - results: - `toMarkdownDocumentResult` 的数组。 ### `toMarkdownDocument` 定义 - `name` - 要转换的文档的名称。 - `blob` - 一个包含文档内容的新 [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob/Blob) 对象。 ### `toMarkdownDocumentResult` 定义 - 
`name` - 转换后文档的名称。与输入名称匹配。 - `mimetype` - 文档检测到的 [mime 类型](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/MIME_types/Common_types)。 - `tokens` - 转换后文档的估计令牌数。 - `data` - 转换后文档的内容,格式为 Markdown。 ## 支持的格式 这是支持的格式列表。我们会不断添加新格式并更新此表。 ## 示例 在此示例中,我们从 R2 获取一个 PDF 文档和一张图片,并将它们都提供给 `env.AI.toMarkdown`。结果是一个转换后的文档列表。Workers AI 模型会自动用于检测和总结图像。 ```typescript import { Env } from "./env"; export default { async fetch(request: Request, env: Env, ctx: ExecutionContext) { // https://pub-979cb28270cc461d94bc8a169d8f389d.r2.dev/somatosensory.pdf const pdf = await env.R2.get("somatosensory.pdf"); // https://pub-979cb28270cc461d94bc8a169d8f389d.r2.dev/cat.jpeg const cat = await env.R2.get("cat.jpeg"); return Response.json( await env.AI.toMarkdown([ { name: "somatosensory.pdf", blob: new Blob([await pdf.arrayBuffer()], { type: "application/octet-stream", }), }, { name: "cat.jpeg", blob: new Blob([await cat.arrayBuffer()], { type: "application/octet-stream", }), }, ]), ); }, }; ``` 这是结果: ```json [ { "name": "somatosensory.pdf", "mimeType": "application/pdf", "format": "markdown", "tokens": 0, "data": "# somatosensory.pdf\n## Metadata\n- PDFFormatVersion=1.4\n- IsLinearized=false\n- IsAcroFormPresent=false\n- IsXFAPresent=false\n- IsCollectionPresent=false\n- IsSignaturesPresent=false\n- Producer=Prince 20150210 (www.princexml.com)\n- Title=Anatomy of the Somatosensory System\n\n## Contents\n### Page 1\nThis is a sample document to showcase..." }, { "name": "cat.jpeg", "mimeType": "image/jpeg", "format": "markdown", "tokens": 0, "data": "这张图片是"不爽猫"的特写照片,这只猫以其独特的"不爽"表情和锐利的蓝眼睛而闻名。这只猫的脸是棕色的,鼻子上有一条白色的条纹,耳朵竖立着。它的皮毛是浅棕色的,脸部周围的颜色较深,鼻子和嘴巴是粉红色的。猫的眼睛是蓝色的,向下倾斜,使它看起来永远都是一副"不爽"的样子。背景是模糊的,但看起来是深棕色的。总的来说,这张图片是流行的网络迷因角色"不爽猫"的一个幽默而标志性的代表。猫的面部表情和姿势传达出一种不悦或烦恼的感觉,这使得它对许多人来说是一个既 relatable 又有趣的图片。" } ] ``` ## REST API 除了 Workers AI [绑定](/workers-ai/configuration/bindings/),您还可以使用 [REST API](/workers-ai/get-started/rest-api/): ```bash curl https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/tomarkdown \ -H 'Authorization: Bearer {API_TOKEN}' \ -F "files=@cat.jpeg" \ -F "files=@somatosensory.pdf" ``` ## 定价 `toMarkdown` 对于大多数格式转换是免费的。在某些情况下,例如图像转换,它可以使用 Workers AI 模型进行对象检测和摘要,如果超出 Workers AI 的免费配额限制,可能会产生额外费用。有关更多详细信息,请参阅[定价页面](/workers-ai/platform/pricing/)。 --- # 数据使用 URL: https://developers.cloudflare.com/workers-ai/platform/data-usage/ Cloudflare 为了提供 Workers AI 服务会处理某些客户数据,这受我们的[隐私政策](https://www.cloudflare.com/privacypolicy/)和[自助服务订阅协议](https://www.cloudflare.com/terms/)或[企业订阅协议](https://www.cloudflare.com/enterpriseterms/)(如适用)的约束。 Cloudflare 既不创建也不训练在 Workers AI 上可用的 AI 模型。这些模型构成第三方服务,并可能受您与模型提供商之间的开源或其他许可条款的约束。请务必查看适用于每个模型的许可条款(如有)。 您的输入(例如,文本提示、图像提交、音频文件等)、输出(例如,生成的文本/图像、翻译等)、嵌入和训练数据构成客户内容。 对于 Workers AI: - 您拥有并对您的所有客户内容负责。 - Cloudflare 不会将您的客户内容提供给任何其他 Cloudflare 客户。 - Cloudflare 不会将您的客户内容用于 (1) 训练在 Workers AI 上可用的任何 AI 模型,或 (2) 改进任何 Cloudflare 或第三方服务,并且除非我们收到您的明确同意,否则不会这样做。 - 如果您特别将存储服务(例如,R2、KV、DO、Vectorize 等)与 Workers AI 结合使用,您的 Workers AI 客户内容可能会被 Cloudflare 存储。 --- # 错误 URL: https://developers.cloudflare.com/workers-ai/platform/errors/ 以下是 Workers AI 错误的列表。 | **名称** | **内部代码** | **HTTP 代码** | **描述** | | ---------------------- | ------------ | ------------- | --------------------------------------------------------------------------------------------------- | | 无此模型 | `5007` | `400` | 无此模型 `${model}` 或任务 | | 无效数据 | `5004` | `400` | base64 输入的无效数据类型:`${type}` | | Finetune 缺少必需文件 | `3039` | `400` | Finetune 缺少必需文件 `(model.safetensors and config.json) ` | | 不完整的请求 | `3003` | `400` | 请求缺少标头或正文:`{what}` | | 
账户不允许使用私有模型 | `5018` | `403` | 该账户不允许访问此模型 | | 模型协议 | `5016` | `403` | 用户未同意 Llama3.2 模型条款 | | 账户被阻止 | `3023` | `403` | 服务对账户不可用 | | 账户不允许使用私有模型 | `3041` | `403` | 该账户不允许访问此模型 | | 已弃用的 SDK 版本 | `5019` | `405` | 请求尝试使用已弃用的 SDK 版本 | | 不支持 LoRa | `5005` | `405` | 模型 `${this.model}` 不支持 LoRa 推理 | | 无效的模型 ID | `3042` | `404` | 模型名称无效 | | 请求过大 | `3006` | `413` | 请求过大 | | 超时 | `3007` | `408` | 请求超时 | | 已中止 | `3008` | `408` | 请求已中止 | | 账户受限 | `3036` | `429` | 您已用完每日 10,000 个神经元的免费配额。如果您想继续使用,请升级到 Cloudflare 的 Workers 付费计划。 | | 容量不足 | `3040` | `429` | 没有更多的数据中心可以转发请求 | --- # 术语表 URL: https://developers.cloudflare.com/workers-ai/platform/glossary/ import { Glossary } from "~/components"; 查看 Cloudflare Workers AI 文档中使用的术语的定义。 --- # 平台 URL: https://developers.cloudflare.com/workers-ai/platform/ import { DirectoryListing } from "~/components"; --- # 限制 URL: https://developers.cloudflare.com/workers-ai/platform/limits/ import { Render } from "~/components"; Workers AI 现已正式发布。我们更新了速率限制以反映这一点。 请注意,使用 Wrangler 在本地模式下进行的模型推理也将计入这些限制。在我们致力于性能和规模的同时,Beta 模型的速率限制可能会较低。 速率限制默认为每个任务类型,一些模型的限制定义如下: ## 按任务类型划分的速率限制 ### [自动语音识别](/workers-ai/models/) - 每分钟 720 个请求 ### [图像分类](/workers-ai/models/) - 每分钟 3000 个请求 ### [图像到文本](/workers-ai/models/) - 每分钟 720 个请求 ### [对象检测](/workers-ai/models/) - 每分钟 3000 个请求 ### [摘要](/workers-ai/models/) - 每分钟 1500 个请求 ### [文本分类](/workers-ai/models/) - 每分钟 2000 个请求 ### [文本嵌入](/workers-ai/models/) - 每分钟 3000 个请求 - [@cf/baai/bge-large-en-v1.5](/workers-ai/models/bge-large-en-v1.5/) 为每分钟 1500 个请求 ### [文本生成](/workers-ai/models/) - 每分钟 300 个请求 - [@hf/thebloke/mistral-7b-instruct-v0.1-awq](/workers-ai/models/mistral-7b-instruct-v0.1-awq/) 为每分钟 400 个请求 - [@cf/microsoft/phi-2](/workers-ai/models/phi-2/) 为每分钟 720 个请求 - [@cf/qwen/qwen1.5-0.5b-chat](/workers-ai/models/qwen1.5-0.5b-chat/) 为每分钟 1500 个请求 - [@cf/qwen/qwen1.5-1.8b-chat](/workers-ai/models/qwen1.5-1.8b-chat/) 为每分钟 720 个请求 - [@cf/qwen/qwen1.5-14b-chat-awq](/workers-ai/models/qwen1.5-14b-chat-awq/) 为每分钟 150 个请求 - [@cf/tinyllama/tinyllama-1.1b-chat-v1.0](/workers-ai/models/tinyllama-1.1b-chat-v1.0/) 为每分钟 720 个请求 ### [文本到图像](/workers-ai/models/) - 每分钟 720 个请求 - [@cf/runwayml/stable-diffusion-v1-5-img2img](/workers-ai/models/stable-diffusion-v1-5-img2img/) 为每分钟 1500 个请求 ### [翻译](/workers-ai/models/) - 每分钟 720 个请求 --- # 定价 URL: https://developers.cloudflare.com/workers-ai/platform/pricing/ :::note Workers AI 更新了定价,使其更加细化,提供了基于每个模型单元的定价,但后端仍以神经元计费。 ::: Workers AI 包含在[免费和付费 Workers 计划](/workers/platform/pricing/)中,定价为**每 1,000 个神经元 $0.011**。 我们的免费配额允许任何人**每天免费使用总计 10,000 个神经元**。要每天使用超过 10,000 个神经元,您需要注册 [Workers 付费计划](/workers/platform/pricing/#workers)。在 Workers 付费计划中,任何超过每日 10,000 个神经元免费配额的使用量将按每 1,000 个神经元 $0.011 收费。 您可以在 [Cloudflare Workers AI 仪表板](https://dash.cloudflare.com/?to=/:account/ai/workers-ai)中监控您的神经元使用情况。 所有限制在每天 00:00 UTC 重置。如果您超过上述任何限制,进一步的操作将失败并显示错误。 | | 免费
配额 | 定价 | | ------------ | -------------------- | ------------------------------ | | Workers 免费 | 每天 10,000 个神经元 | 不适用 - 升级到 Workers 付费版 | | Workers 付费 | 每天 10,000 个神经元 | $0.011 / 1,000 个神经元 | ## 什么是神经元? 神经元是我们衡量不同模型 AI 输出的方式,代表执行您请求所需的 GPU 计算能力。我们的无服务器模型让您只需为使用的部分付费,而无需担心租用、管理或扩展 GPU。 :::note “以令牌计价”列等同于“以神经元计价”列 - 显示不同的单位是为了让您轻松比较和理解定价。 ::: ## LLM 模型定价 | 模型 | 以令牌计价 | 以神经元计价 | | -------------------------------------------- | ------------------------------------------------- | ------------------------------------------------------------------ | | @cf/meta/llama-3.2-1b-instruct | 每百万输入令牌 $0.027
每百万输出令牌 $0.201 | 每百万输入令牌 2457 个神经元
每百万输出令牌 18252 个神经元 | | @cf/meta/llama-3.2-3b-instruct | 每百万输入令牌 $0.051
每百万输出令牌 $0.335 | 每百万输入令牌 4625 个神经元
每百万输出令牌 30475 个神经元 | | @cf/meta/llama-3.1-8b-instruct-fp8-fast | 每百万输入令牌 $0.045
每百万输出令牌 $0.384 | 每百万输入令牌 4119 个神经元
每百万输出令牌 34868 个神经元 | | @cf/meta/llama-3.2-11b-vision-instruct | 每百万输入令牌 $0.049
每百万输出令牌 $0.676 | 每百万输入令牌 4410 个神经元
每百万输出令牌 61493 个神经元 | | @cf/meta/llama-3.1-70b-instruct-fp8-fast | 每百万输入令牌 $0.293
每百万输出令牌 $2.253 | 每百万输入令牌 26668 个神经元
每百万输出令牌 204805 个神经元 | | @cf/meta/llama-3.3-70b-instruct-fp8-fast | 每百万输入令牌 $0.293
每百万输出令牌 $2.253 | 每百万输入令牌 26668 个神经元
每百万输出令牌 204805 个神经元 | | @cf/deepseek-ai/deepseek-r1-distill-qwen-32b | 每百万输入令牌 $0.497
每百万输出令牌 $4.881 | 每百万输入令牌 45170 个神经元
每百万输出令牌 443756 个神经元 | | @cf/mistral/mistral-7b-instruct-v0.1 | 每百万输入令牌 $0.110
每百万输出令牌 $0.190 | 每百万输入令牌 10000 个神经元
每百万输出令牌 17300 个神经元 | | @cf/mistralai/mistral-small-3.1-24b-instruct | 每百万输入令牌 $0.351
每百万输出令牌 $0.555 | 每百万输入令牌 31876 个神经元
每百万输出令牌 50488 个神经元 | | @cf/meta/llama-3.1-8b-instruct | 每百万输入令牌 $0.282
每百万输出令牌 $0.827 | 每百万输入令牌 25608 个神经元
每百万输出令牌 75147 个神经元 | | @cf/meta/llama-3.1-8b-instruct-fp8 | 每百万输入令牌 $0.152
每百万输出令牌 $0.287 | 每百万输入令牌 13778 个神经元
每百万输出令牌 26128 个神经元 | | @cf/meta/llama-3.1-8b-instruct-awq | 每百万输入令牌 $0.123
每百万输出令牌 $0.266 | 每百万输入令牌 11161 个神经元
每百万输出令牌 24215 个神经元 | | @cf/meta/llama-3-8b-instruct | 每百万输入令牌 $0.282
每百万输出令牌 $0.827 | 每百万输入令牌 25608 个神经元
每百万输出令牌 75147 个神经元 | | @cf/meta/llama-3-8b-instruct-awq | 每百万输入令牌 $0.123
每百万输出令牌 $0.266 | 每百万输入令牌 11161 个神经元
每百万输出令牌 24215 个神经元 | | @cf/meta/llama-2-7b-chat-fp16 | 每百万输入令牌 $0.556
每百万输出令牌 $6.667 | 每百万输入令牌 50505 个神经元
每百万输出令牌 606061 个神经元 | | @cf/meta/llama-guard-3-8b | 每百万输入令牌 $0.484
每百万输出令牌 $0.030 | 每百万输入令牌 44003 个神经元
每百万输出令牌 2730 个神经元 | | @cf/meta/llama-4-scout-17b-16e-instruct | 每百万输入令牌 $0.270
每百万输出令牌 $0.850 | 每百万输入令牌 24545 个神经元
每百万输出令牌 77273 个神经元 | | @cf/google/gemma-3-12b-it | 每百万输入令牌 $0.345
每百万输出令牌 $0.556 | 每百万输入令牌 31371 个神经元
每百万输出令牌 50560 个神经元 | | @cf/qwen/qwq-32b | 每百万输入令牌 $0.660
每百万输出令牌 $1.000 | 每百万输入令牌 60000 个神经元
每百万输出令牌 90909 个神经元 | | @cf/qwen/qwen2.5-coder-32b-instruct | 每百万输入令牌 $0.660
每百万输出令牌 $1.000 | 每百万输入令牌 60000 个神经元
每百万输出令牌 90909 个神经元 | ## 嵌入模型定价 | 模型 | 以令牌计价 | 以神经元计价 | | -------------------------- | --------------------- | ----------------------------- | | @cf/baai/bge-small-en-v1.5 | 每百万输入令牌 $0.020 | 每百万输入令牌 1841 个神经元 | | @cf/baai/bge-base-en-v1.5 | 每百万输入令牌 $0.067 | 每百万输入令牌 6058 个神经元 | | @cf/baai/bge-large-en-v1.5 | 每百万输入令牌 $0.204 | 每百万输入令牌 18582 个神经元 | | @cf/baai/bge-m3 | 每百万输入令牌 $0.012 | 每百万输入令牌 1075 个神经元 | ## 其他模型定价 | 模型 | 以令牌计价 | 以神经元计价 | | ------------------------------------- | -------------------------------------------------- | ----------------------------------------------------------------- | | @cf/black-forest-labs/flux-1-schnell | 每个 512x512 图块 $0.0000528
每步 $0.0001056 | 每个 512x512 图块 4.80 个神经元
每步 9.60 个神经元 | | @cf/huggingface/distilbert-sst-2-int8 | 每百万输入令牌 $0.026 | 每百万输入令牌 2394 个神经元 | | @cf/baai/bge-reranker-base | 每百万输入令牌 $0.003 | 每百万输入令牌 283 个神经元 | | @cf/meta/m2m100-1.2b | 每百万输入令牌 $0.342
每百万输出令牌 $0.342 | 每百万输入令牌 31050 个神经元
每百万输出令牌 31050 个神经元 | | @cf/microsoft/resnet-50 | 每百万张图像 $2.51 | 每百万张图像 228055 个神经元 | | @cf/openai/whisper | 每音频分钟 $0.0005 | 每音频分钟 41.14 个神经元 | | @cf/openai/whisper-large-v3-turbo | 每音频分钟 $0.0005 | 每音频分钟 46.63 个神经元 | | @cf/myshell-ai/melotts | 每音频分钟 $0.0002 | 每音频分钟 18.63 个神经元 | --- # 构建检索增强生成 (RAG) AI URL: https://developers.cloudflare.com/workers-ai/guides/tutorials/build-a-retrieval-augmented-generation-ai/ import { Details, Render, PackageManagers, WranglerConfig } from "~/components"; 本指南将指导您设置和部署您的第一个 Cloudflare AI 应用程序。您将使用 Workers AI、Vectorize、D1 和 Cloudflare Workers 等工具构建一个功能齐全的 AI 驱动的应用程序。 :::note[寻找托管选项?] [AutoRAG](/autorag) 提供了一种完全托管的方式来在 Cloudflare 上构建 RAG 管道,开箱即用地处理摄取、索引和查询。[开始使用](/autorag/get-started/)。 ::: 在本教程结束时,您将构建一个 AI 工具,允许您存储信息并使用大型语言模型进行查询。这种模式被称为检索增强生成(RAG),是您可以结合 Cloudflare AI 工具包的多个方面构建的一个有用的项目。您无需具备使用 AI 工具的经验即可构建此应用程序。 您还需要访问 [Vectorize](/vectorize/platform/pricing/)。在本教程中,我们将展示如何选择性地与 [Anthropic Claude](http://anthropic.com) 集成。您需要一个 [Anthropic API 密钥](https://docs.anthropic.com/en/api/getting-started) 才能这样做。 ## 1. 创建一个新的 Worker 项目 C3 (`create-cloudflare-cli`) 是一个命令行工具,旨在帮助您尽快设置和部署 Workers 到 Cloudflare。 打开一个终端窗口并运行 C3 来创建您的 Worker 项目: 在您的项目目录中,C3 生成了几个文件。

1. `wrangler.jsonc`: 您的 [Wrangler](/workers/wrangler/configuration/#sample-wrangler-configuration) 配置文件。
2. `worker.js`(在 `/src` 中): 一个用 [ES 模块](/workers/reference/migrate-to-module-workers/) 语法编写的最小化 `'Hello World!'` Worker。
3. `package.json`: 一个最小化的 Node 依赖项配置文件。
4. `package-lock.json`: 请参阅 [`npm` 关于 `package-lock.json` 的文档](https://docs.npmjs.com/cli/v9/configuring-npm/package-lock-json)。
5. `node_modules`: 请参阅 [`npm` 关于 `node_modules` 的文档](https://docs.npmjs.com/cli/v7/configuring-npm/folders#node-modules)。
现在,移动到您新创建的目录中: ```sh cd rag-ai-tutorial ``` ## 2. 使用 Wrangler CLI 进行开发 Workers 命令行界面 [Wrangler](/workers/wrangler/install-and-update/) 允许您 [创建](/workers/wrangler/commands/#init)、[测试](/workers/wrangler/commands/#dev) 和 [部署](/workers/wrangler/commands/#deploy) 您的 Workers 项目。C3 将默认在项目中安装 Wrangler。 创建您的第一个 Worker 后,在项目目录中运行 [`wrangler dev`](/workers/wrangler/commands/#dev) 命令以启动本地服务器来开发您的 Worker。这将允许您在开发过程中本地测试您的 Worker。 ```sh npx wrangler dev --remote ``` :::note 如果您以前没有使用过 Wrangler,它会尝试打开您的 Web 浏览器以使用您的 Cloudflare 帐户登录。 如果此步骤出现问题或者您无法访问浏览器界面,请参阅 [`wrangler login`](/workers/wrangler/commands/#login) 文档以获取更多信息。 ::: 您现在可以访问 [http://localhost:8787](http://localhost:8787) 来查看您的 Worker 正在运行。您对代码的任何更改都将触发重新构建,重新加载页面将显示您的 Worker 的最新输出。 ## 3. 添加 AI 绑定 要开始使用 Cloudflare 的 AI 产品,您可以将 `ai` 块添加到 [Wrangler 配置文件](/workers/wrangler/configuration/) 中。这将在您的代码中设置一个到 Cloudflare AI 模型的绑定,您可以使用它与平台上的可用 AI 模型进行交互。 此示例使用了 [`@cf/meta/llama-3-8b-instruct` 模型](/workers-ai/models/llama-3-8b-instruct/),该模型可以生成文本。 ```toml [ai] binding = "AI" ``` 现在,找到 `src/index.js` 文件。在 `fetch` 处理程序中,您可以查询 `AI` 绑定: ```js export default { async fetch(request, env, ctx) { const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", { messages: [{ role: "user", content: `9 的平方根是多少?` }], }); return new Response(JSON.stringify(answer)); }, }; ``` 通过 `AI` 绑定查询 LLM,我们可以直接在代码中与 Cloudflare AI 的大型语言模型进行交互。在此示例中,我们使用的是 [`@cf/meta/llama-3-8b-instruct` 模型](/workers-ai/models/llama-3-8b-instruct/),该模型可以生成文本。 您可以使用 `wrangler` 部署您的 Worker: ```sh npx wrangler deploy ``` 向您的 Worker 发出请求现在将从 LLM 生成文本响应,并将其作为 JSON 对象返回。 ```sh curl https://example.username.workers.dev ``` ```sh output {"response":"答案:9的平方根是3。"} ``` ## 4. 使用 Cloudflare D1 和 Vectorize 添加嵌入 嵌入允许您向 Cloudflare AI 项目中使用的语言模型添加附加功能。这是通过 **Vectorize**(Cloudflare 的向量数据库)完成的。 要开始使用 Vectorize,请使用 `wrangler` 创建一个新的嵌入索引。此索引将存储具有 768 个维度的向量,并将使用余弦相似度来确定哪些向量彼此最相似: ```sh npx wrangler vectorize create vector-index --dimensions=768 --metric=cosine ``` 然后,将新 Vectorize 索引的配置详细信息添加到 [Wrangler 配置文件](/workers/wrangler/configuration/)中: ```toml # ... existing wrangler configuration [[vectorize]] binding = "VECTOR_INDEX" index_name = "vector-index" ``` 向量索引允许您存储维度集合,维度是用于表示数据的浮点数。当您要查询向量数据库时,您也可以将查询转换为维度。**Vectorize** 旨在高效地确定哪些存储的向量与您的查询最相似。 要实现搜索功能,您必须设置一个 Cloudflare 的 D1 数据库。在 D1 中,您可以存储应用程序的数据。然后,您将此数据更改为向量格式。当有人搜索并与向量匹配时,您可以向他们显示匹配的数据。 使用 `wrangler` 创建一个新的 D1 数据库: ```sh npx wrangler d1 create database ``` 然后,将上一个命令输出的配置详细信息粘贴到 [Wrangler 配置文件](/workers/wrangler/configuration/) 中: ```toml # ... existing wrangler configuration [[d1_databases]] binding = "DB" # 在您的 Worker 的 env.DB 中可用 database_name = "database" database_id = "abc-def-geh" # 将此替换为真实的 database_id (UUID) ``` 在此应用程序中,我们将在 D1 中创建一个 `notes` 表,这将允许我们存储笔记并稍后在 Vectorize 中检索它们。要创建此表,请使用 `wrangler d1 execute` 运行一个 SQL 命令: ```sh npx wrangler d1 execute database --remote --command "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, text TEXT NOT NULL)" ``` 现在,我们可以使用 `wrangler d1 execute` 向我们的数据库中添加一个新笔记: ```sh npx wrangler d1 execute database --remote --command "INSERT INTO notes (text) VALUES ('最好的披萨配料是意大利辣香肠')" ``` ## 5. 创建工作流 在我们开始创建笔记之前,我们将引入一个 [Cloudflare 工作流](/workflows)。这将允许我们定义一个持久的工作流,可以安全、稳健地执行 RAG 过程的所有步骤。 首先,将一个新的 `[[workflows]]` 块添加到您的 [Wrangler 配置文件](/workers/wrangler/configuration/) 中: ```toml # ... 
existing wrangler configuration [[workflows]] name = "rag" binding = "RAG_WORKFLOW" class_name = "RAGWorkflow" ``` 在 `src/index.js` 中,添加一个名为 `RAGWorkflow` 的新类,它扩展了 `WorkflowEntrypoint`: ```js import { WorkflowEntrypoint } from "cloudflare:workers"; export class RAGWorkflow extends WorkflowEntrypoint { async run(event, step) { await step.do("example step", async () => { console.log("Hello World!"); }); } } ``` 此类将定义一个工作流步骤,该步骤将在控制台中记录“Hello World!”。您可以根据需要向工作流中添加任意数量的步骤。 就其本身而言,此工作流不会执行任何操作。要执行工作流,我们将调用 `RAG_WORKFLOW` 绑定,并传入工作流正常完成所需的任何参数。以下是我们如何调用工作流的示例: ```js env.RAG_WORKFLOW.create({ params: { text } }); ``` ## 6. 创建笔记并将其添加到 Vectorize 为了扩展您的 Workers 函数以处理多个路由,我们将添加 `hono`,这是一个用于 Workers 的路由库。这将允许我们为向数据库中添加笔记创建一个新路由。使用 `npm` 安装 `hono`: 然后,将 `hono` 导入您的 `src/index.js` 文件中。您还应该更新 `fetch` 处理程序以使用 `hono`: ```js import { Hono } from "hono"; const app = new Hono(); app.get("/", async (c) => { const answer = await c.env.AI.run("@cf/meta/llama-3-8b-instruct", { messages: [{ role: "user", content: `9 的平方根是多少?` }], }); return c.json(answer); }); export default app; ``` 这将在根路径 `/` 处建立一个路由,其功能与先前版本的应用程序相同。 现在,我们可以更新工作流以开始将笔记添加到数据库中,并生成它们的相关嵌入。 此示例使用了 [`@cf/baai/bge-base-en-v1.5` 模型](/workers-ai/models/bge-base-en-v1.5/),该模型可用于创建嵌入。嵌入存储在 [Vectorize](/vectorize/) 中,这是 Cloudflare 的向量数据库。用户查询也会转换为嵌入,以便在 Vectorize 中进行搜索。 ```js import { WorkflowEntrypoint } from "cloudflare:workers"; export class RAGWorkflow extends WorkflowEntrypoint { async run(event, step) { const env = this.env; const { text } = event.payload; const record = await step.do(`create database record`, async () => { const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"; const { results } = await env.DB.prepare(query).bind(text).run(); const record = results[0]; if (!record) throw new Error("Failed to create note"); return record; }); const embedding = await step.do(`generate embedding`, async () => { const embeddings = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: text, }); const values = embeddings.data[0]; if (!values) throw new Error("Failed to generate vector embedding"); return values; }); await step.do(`insert vector`, async () => { return env.VECTOR_INDEX.upsert([ { id: record.id.toString(), values: embedding, }, ]); }); } } ``` 工作流执行以下操作: 1. 接受一个 `text` 参数。 2. 在 D1 的 `notes` 表中插入一个新行,并检索新行的 `id`。 3. 使用 LLM 绑定的 `embeddings` 模型将 `text` 转换为向量。 4. 将 `id` 和 `vectors` 上传到 Vectorize 中的 `vector-index` 索引。 通过这样做,您将创建一个新的向量表示形式的笔记,可以用于稍后检索该笔记。 要完成代码,我们将添加一个路由,允许用户向数据库提交笔记。此路由将解析 JSON 请求正文,获取 `note` 参数,并创建一个新的工作流实例,传递参数: ```js app.post("/notes", async (c) => { const { text } = await c.req.json(); if (!text) return c.text("Missing text", 400); await c.env.RAG_WORKFLOW.create({ params: { text } }); return c.text("Created note", 201); }); ``` ## 7. 查询 Vectorize 以检索笔记 要完成您的代码,您可以更新根路径(`/`)以查询 Vectorize。您将把查询转换为向量,然后使用 `vector-index` 索引来查找最相似的向量。 `topK` 参数限制了函数返回的向量数量。例如,提供 `topK` 为 1 将仅返回基于查询的 _最相似_ 向量。将 `topK` 设置为 5 将返回 5 个最相似的向量。 给定一组相似的向量,您可以检索与存储在这些向量旁边的记录 ID 匹配的笔记。在这种情况下,我们只检索一个笔记 - 但您可以根据需要自定义此设置。 您可以将这些笔记的文本插入 LLM 绑定的提示中。这是检索增强生成(RAG)的基础:在 LLM 的提示中提供来自数据外部的附加上下文,以增强 LLM 生成的文本。 我们将更新提示以包含上下文,并要求 LLM 在回应时使用上下文: ```js import { Hono } from "hono"; const app = new Hono(); // Existing post route... // app.post('/notes', async (c) => { ... 
}) app.get("/", async (c) => { const question = c.req.query("text") || "9 的平方根是多少?"; const embeddings = await c.env.AI.run("@cf/baai/bge-base-en-v1.5", { text: question, }); const vectors = embeddings.data[0]; const vectorQuery = await c.env.VECTOR_INDEX.query(vectors, { topK: 1 }); let vecId; if ( vectorQuery.matches && vectorQuery.matches.length > 0 && vectorQuery.matches[0] ) { vecId = vectorQuery.matches[0].id; } else { console.log("No matching vector found or vectorQuery.matches is empty"); } let notes = []; if (vecId) { const query = `SELECT * FROM notes WHERE id = ?`; const { results } = await c.env.DB.prepare(query).bind(vecId).all(); if (results) notes = results.map((vec) => vec.text); } const contextMessage = notes.length ? `Context:\n${notes.map((note) => `- ${note}`).join("\n")}` : ""; const systemPrompt = `When answering the question or responding, use the context provided, if it is provided and relevant.`; const { response: answer } = await c.env.AI.run( "@cf/meta/llama-3-8b-instruct", { messages: [ ...(notes.length ? [{ role: "system", content: contextMessage }] : []), { role: "system", content: systemPrompt }, { role: "user", content: question }, ], }, ); return c.text(answer); }); app.onError((err, c) => { return c.text(err); }); export default app; ``` ## 8. 添加 Anthropic Claude 模型(可选) 如果您正在处理较大的文档,您有选择使用 Anthropic 的 [Claude 模型](https://claude.ai/),这些模型具有大型上下文窗口,非常适合 RAG 工作流。 要开始,安装 `@anthropic-ai/sdk` 包: 在 `src/index.js` 中,您可以更新 `GET /` 路由以检查 `ANTHROPIC_API_KEY` 环境变量。如果设置了该变量,我们可以使用 Anthropic SDK 生成文本。如果没有设置,我们将回退到现有的 Workers AI 代码: ```js import Anthropic from '@anthropic-ai/sdk'; app.get('/', async (c) => { // ... Existing code const systemPrompt = `When answering the question or responding, use the context provided, if it is provided and relevant.` let modelUsed: string = "" let response = null if (c.env.ANTHROPIC_API_KEY) { const anthropic = new Anthropic({ apiKey: c.env.ANTHROPIC_API_KEY }) const model = "claude-3-5-sonnet-latest" modelUsed = model const message = await anthropic.messages.create({ max_tokens: 1024, model, messages: [ { role: 'user', content: question } ], system: [systemPrompt, notes ? contextMessage : ''].join(" ") }) response = { response: message.content.map(content => content.text).join("\n") } } else { const model = "@cf/meta/llama-3.1-8b-instruct" modelUsed = model response = await c.env.AI.run( model, { messages: [ ...(notes.length ? [{ role: 'system', content: contextMessage }] : []), { role: 'system', content: systemPrompt }, { role: 'user', content: question } ] } ) } if (response) { c.header('x-model-used', modelUsed) return c.text(response.response) } else { return c.text("We were unable to generate output", 500) } }) ``` 最后,您需要在 Workers 应用程序中设置 `ANTHROPIC_API_KEY` 环境变量。您可以使用 `wrangler secret put` 来实现: ```sh $ npx wrangler secret put ANTHROPIC_API_KEY ``` ## 9. 删除笔记和向量 如果您不再需要笔记,可以从数据库中删除它。每次删除笔记时,您还需要从 Vectorize 中删除相应的向量。您可以通过在 `src/index.js` 文件中构建 `DELETE /notes/:id` 路由来实现这一点: ```js app.delete("/notes/:id", async (c) => { const { id } = c.req.param(); const query = `DELETE FROM notes WHERE id = ?`; await c.env.DB.prepare(query).bind(id).run(); await c.env.VECTOR_INDEX.deleteByIds([id]); return c.status(204); }); ``` ## 10. 
文本分割(可选) 对于较大的文本块,建议将文本分割成较小的块。这允许 LLM 更有效地收集相关上下文,而无需检索大块文本。 为了实现这一点,我们将向项目中添加一个新的 NPM 包,`@langchain/textsplitters`: 此包提供的 `RecursiveCharacterTextSplitter` 类将文本分割成较小的块。它可以根据需要进行自定义,但默认配置在大多数情况下都有效: ```js import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters"; const text = "Some long piece of text..."; const splitter = new RecursiveCharacterTextSplitter({ // These can be customized to change the chunking size // chunkSize: 1000, // chunkOverlap: 200, }); const output = await splitter.createDocuments([text]); console.log(output); // [{ pageContent: 'Some long piece of text...' }] ``` 要使用此分割器,我们将更新工作流以将文本分割成较小的块。然后,我们将遍历这些块,并为每个文本块运行工作流的其余部分: ```js export class RAGWorkflow extends WorkflowEntrypoint { async run(event, step) { const env = this.env; const { text } = event.payload; let texts = await step.do("split text", async () => { const splitter = new RecursiveCharacterTextSplitter(); const output = await splitter.createDocuments([text]); return output.map((doc) => doc.pageContent); }); console.log( "RecursiveCharacterTextSplitter generated ${texts.length} chunks", ); for (const index in texts) { const text = texts[index]; const record = await step.do( `create database record: ${index}/${texts.length}`, async () => { const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"; const { results } = await env.DB.prepare(query).bind(text).run(); const record = results[0]; if (!record) throw new Error("Failed to create note"); return record; }, ); const embedding = await step.do( `generate embedding: ${index}/${texts.length}`, async () => { const embeddings = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: text, }); const values = embeddings.data[0]; if (!values) throw new Error("Failed to generate vector embedding"); return values; }, ); await step.do(`insert vector: ${index}/${texts.length}`, async () => { return env.VECTOR_INDEX.upsert([ { id: record.id.toString(), values: embedding, }, ]); }); } } } ``` 现在,当向 `/notes` 端点提交大块文本时,它们将被分割成较小的块,并且每个块将由工作流处理。 ## 11. 部署您的项目 如果您在[第 1 步](/workers/get-started/guide/#1-create-a-new-worker-project)中没有部署您的 Worker,请使用 Wrangler 将您的 Worker 部署到 `*.workers.dev` 子域、[自定义域](/workers/configuration/routing/custom-domains/)(如果您已配置),或者如果您没有配置任何子域或域,Wrangler 将在发布过程中提示您设置一个。 ```sh npx wrangler deploy ``` 在 `..workers.dev` 预览您的 Worker。 :::note[注意] 当首次将您的 Worker 推送到 `*.workers.dev` 子域时,您可能会看到 [`523` 错误](/support/troubleshooting/http-status-codes/cloudflare-5xx-errors/error-523/),因为 DNS 正在传播。这些错误应在一分钟左右解决。 ::: ## 相关资源 完整版本的此代码库可在 GitHub 上找到。它包括一个前端 UI 用于查询、添加和删除笔记,以及一个后端 API 用于与数据库和向量索引进行交互。您可以在这里找到它:[github.com/kristianfreeman/cloudflare-retrieval-augmented-generation-example](https://github.com/kristianfreeman/cloudflare-retrieval-augmented-generation-example/)。 要做更多: - 探索 [检索增强生成(RAG)架构](/reference-architecture/diagrams/ai/ai-rag/) 的参考图表。 - 查看 Cloudflare 的 [AI 文档](/workers-ai)。 - 查看 [教程](/workers/tutorials/) 以在 Workers 上构建项目。 - 探索 [示例](/workers/examples/) 以尝试复制和粘贴 Worker 代码。 - 了解 Workers 的工作原理 [参考](/workers/reference/)。 - 了解 Workers 的功能和功能 [平台](/workers/platform/)。 - 设置 [Wrangler](/workers/wrangler/install-and-update/) 以编程方式创建、测试和部署您的 Worker 项目。 --- # 使用 Workers AI 构建带自动转录功能的语音笔记应用 URL: https://developers.cloudflare.com/workers-ai/guides/tutorials/build-a-voice-notes-app-with-auto-transcription/ import { Render, PackageManagers, Tabs, TabItem } from "~/components"; 在本教程中,您将学习如何创建一个带有语音录音自动转录和可选后处理功能的语音笔记应用。构建该应用将使用以下工具: - Workers AI 用于转录语音录音和可选的后处理 - D1 数据库用于存储笔记 - R2 存储用于存储语音录音 - Nuxt 框架用于构建全栈应用 - Workers 用于部署项目 ## 先决条件 要继续,您需要: ## 1. 
创建一个新的 Worker 项目 使用带有 `nuxt` 框架预设的 `c3` CLI 创建一个新的 Worker 项目。 ### 安装附加依赖项 切换到新创建的项目目录 ```sh cd voice-notes ``` 并安装以下依赖项: 然后将 `@nuxt/ui` 模块添加到 `nuxt.config.ts` 文件中: ```ts title="nuxt.config.ts" export default defineNuxtConfig({ //.. modules: ["nitro-cloudflare-dev", "@nuxt/ui"], //.. }); ``` ### [可选] 迁移到 Nuxt 4 兼容模式 迁移到 Nuxt 4 兼容模式可确保您的应用程序与 Nuxt 的未来更新保持向前兼容。 在项目的根目录中创建一个新的 `app` 文件夹,并将 `app.vue` 文件移动到其中。此外,将以下内容添加到您的 `nuxt.config.ts` 文件中: ```ts title="nuxt.config.ts" export default defineNuxtConfig({ //.. future: { compatibilityVersion: 4, }, //.. }); ``` :::note 本教程的其余部分将使用 `app` 文件夹来存放客户端代码。如果您没有进行此更改,您应该继续使用项目的根目录。 ::: ### 启动本地开发服务器 此时,您可以通过启动本地开发服务器来测试您的应用程序: 如果一切设置正确,您应该在 `http://localhost:3000` 上看到一个 Nuxt 欢迎页面。 ## 2. 创建转录 API 端点 此 API 利用 Workers AI 来转录语音录音。要在项目中使用 Workers AI,您首先需要将其绑定到 Worker。 将 `AI` 绑定添加到 Wrangler 文件中。 ```toml title="wrangler.toml" [ai] binding = "AI" ``` 配置 `AI` 绑定后,运行 `cf-typegen` 命令以生成必要的 Cloudflare 类型定义。这使得类型定义在服务器事件上下文中可用。 通过在 `/server/api` 目录中创建 `transcribe.post.ts` 文件来创建一个转录 `POST` 端点。 ```ts title="server/api/transcribe.post.ts" export default defineEventHandler(async (event) => { const { cloudflare } = event.context; const form = await readFormData(event); const blob = form.get("audio") as Blob; if (!blob) { throw createError({ statusCode: 400, message: "缺少要转录的音频 blob", }); } try { const response = await cloudflare.env.AI.run("@cf/openai/whisper", { audio: [...new Uint8Array(await blob.arrayBuffer())], }); return response.text; } catch (err) { console.error("转录音频时出错:", err); throw createError({ statusCode: 500, message: "转录音频失败。请重试。", }); } }); ``` 上述代码执行以下操作: 1. 从事件中提取音频 blob。 2. 使用 `@cf/openai/whisper` 模型转录 blob 并将转录文本作为响应返回。 ## 3. 为将音频录音上传到 R2 创建 API 端点 在将音频录音上传到 `R2` 之前,您需要先创建一个存储桶。您还需要将 R2 绑定添加到您的 Wrangler 文件并重新生成 Cloudflare 类型定义。 创建一个 `R2` 存储桶。 将存储绑定添加到您的 Wrangler 文件中。 ```toml title="wrangler.toml" [[r2_buckets]] binding = "R2" bucket_name = "" ``` 最后,通过重新运行 `cf-typegen` 脚本生成类型定义。 现在您已准备好创建上传端点。在您的 `server/api` 目录中创建一个新的 `upload.put.ts` 文件,并向其添加以下代码: ```ts title="server/api/upload.put.ts" export default defineEventHandler(async (event) => { const { cloudflare } = event.context; const form = await readFormData(event); const files = form.getAll("files") as File[]; if (!files) { throw createError({ statusCode: 400, message: "缺少文件" }); } const uploadKeys: string[] = []; for (const file of files) { const obj = await cloudflare.env.R2.put(`recordings/${file.name}`, file); if (obj) { uploadKeys.push(obj.key); } } return uploadKeys; }); ``` 上述代码执行以下操作: 1. `files` 变量使用 `form.getAll()` 检索客户端发送的所有文件,这允许在单个请求中进行多次上传。 2. 使用您之前创建的绑定 (`R2`) 将文件上传到 R2 存储桶。 :::note `recordings/` 前缀将上传的文件组织到存储桶中的专用文件夹中。这在向客户端提供这些录音时也会派上用场(稍后介绍)。 ::: ## 4. 
创建 API 端点以保存笔记条目 在创建端点之前,您需要执行与 R2 存储桶类似但有一些额外步骤的步骤,以准备一个笔记表。 创建一个 `D1` 数据库。 将 D1 绑定添加到 Wrangler 文件。您可以从 `d1 create` 命令的输出中获取 `DB_ID`。 ```toml title="wrangler.toml" [[d1_databases]] binding = "DB" database_name = "" database_id = "" ``` 和以前一样,重新运行 `cf-typegen` 命令以生成类型。 接下来,创建一个数据库迁移。 "create notes table"`} /> 这将在项目的根目录中创建一个新的 `migrations` 文件夹,并向其中添加一个空的 `0001_create_notes_table.sql` 文件。用下面的代码替换此文件的内容。 ```sql CREATE TABLE IF NOT EXISTS notes ( id INTEGER PRIMARY KEY AUTOINCREMENT, text TEXT NOT NULL, created_at DATETIME DEFAULT CURRENT_TIMESTAMP, updated_at DATETIME DEFAULT CURRENT_TIMESTAMP, audio_urls TEXT ); ``` 然后应用此迁移以创建 `notes` 表。 :::note 上述命令将在本地创建笔记表。要在您的远程生产数据库上应用迁移,请使用 `--remote` 标志。 ::: 现在您可以创建 API 端点。在 `server/api/notes` 目录中创建一个新文件 `index.post.ts`,并将其内容更改为以下内容: ```ts title="server/api/notes/index.post.ts" export default defineEventHandler(async (event) => { const { cloudflare } = event.context; const { text, audioUrls } = await readBody(event); if (!text) { throw createError({ statusCode: 400, message: "Missing note text", }); } try { await cloudflare.env.DB.prepare( "INSERT INTO notes (text, audio_urls) VALUES (?1, ?2)", ) .bind(text, audioUrls ? JSON.stringify(audioUrls) : null) .run(); return setResponseStatus(event, 201); } catch (err) { console.error("Error creating note:", err); throw createError({ statusCode: 500, message: "Failed to create note. Please try again.", }); } }); ``` The above does the following: 1. Extracts the text, and optional audioUrls from the event. 2. Saves it to the database after converting the audioUrls to a `JSON` string. ## 5. Handle note creation on the client-side Now you're ready to work on the client side. Let's start by tackling the note creation part first. ### Recording user audio Create a composable to handle audio recording using the MediaRecorder API. This will be used to record notes through the user's microphone. 
Create a new file `useMediaRecorder.ts` in the `app/composables` folder, and add the following code to it: ```ts title="app/composables/useMediaRecorder.ts" interface MediaRecorderState { isRecording: boolean; recordingDuration: number; audioData: Uint8Array | null; updateTrigger: number; } export function useMediaRecorder() { const state = ref({ isRecording: false, recordingDuration: 0, audioData: null, updateTrigger: 0, }); let mediaRecorder: MediaRecorder | null = null; let audioContext: AudioContext | null = null; let analyser: AnalyserNode | null = null; let animationFrame: number | null = null; let audioChunks: Blob[] | undefined = undefined; const updateAudioData = () => { if (!analyser || !state.value.isRecording || !state.value.audioData) { if (animationFrame) { cancelAnimationFrame(animationFrame); animationFrame = null; } return; } analyser.getByteTimeDomainData(state.value.audioData); state.value.updateTrigger += 1; animationFrame = requestAnimationFrame(updateAudioData); }; const startRecording = async () => { try { const stream = await navigator.mediaDevices.getUserMedia({ audio: true }); audioContext = new AudioContext(); analyser = audioContext.createAnalyser(); const source = audioContext.createMediaStreamSource(stream); source.connect(analyser); mediaRecorder = new MediaRecorder(stream); audioChunks = []; mediaRecorder.ondataavailable = (e: BlobEvent) => { audioChunks?.push(e.data); state.value.recordingDuration += 1; }; state.value.audioData = new Uint8Array(analyser.frequencyBinCount); state.value.isRecording = true; state.value.recordingDuration = 0; state.value.updateTrigger = 0; mediaRecorder.start(1000); updateAudioData(); } catch (err) { console.error("Error accessing microphone:", err); throw err; } }; const stopRecording = async () => { return await new Promise((resolve) => { if (mediaRecorder && state.value.isRecording) { mediaRecorder.onstop = () => { const blob = new Blob(audioChunks, { type: "audio/webm" }); audioChunks = undefined; state.value.recordingDuration = 0; state.value.updateTrigger = 0; state.value.audioData = null; resolve(blob); }; state.value.isRecording = false; mediaRecorder.stop(); mediaRecorder.stream.getTracks().forEach((track) => track.stop()); if (animationFrame) { cancelAnimationFrame(animationFrame); animationFrame = null; } audioContext?.close(); audioContext = null; } }); }; onUnmounted(() => { stopRecording(); }); return { state: readonly(state), startRecording, stopRecording, }; } ``` The above code does the following: 1. Exposes functions to start and stop audio recordings in a Vue application. 2. Captures audio input from the user's microphone using MediaRecorder API. 3. Processes real-time audio data for visualization using AudioContext and AnalyserNode. 4. Stores recording state including duration and recording status. 5. Maintains chunks of audio data and combines them into a final audio blob when recording stops. 6. Updates audio visualization data continuously using animation frames while recording. 7. Automatically cleans up all audio resources when recording stops or component unmounts. 8. Returns audio recordings in webm format for further processing. ### Create a component for note creation This component allows users to create notes by either typing or recording audio. It also handles audio transcription and uploading the recordings to the server. Create a new file named `CreateNote.vue` inside the `app/components` folder. 
Add the following template code to the newly created file: ```vue title="app/components/CreateNote.vue" ``` The above template results in the following: 1. A panel with a `textarea` inside to type the note manually. 2. Another panel to manage start/stop of an audio recording, and show the recordings done already. 3. A bottom panel to reset or save the note (along with the recordings). Now, add the following code below the template code in the same file: ```vue title="app/components/CreateNote.vue" ``` The above code does the following: 1. When a recording is stopped by calling `handleRecordingStop` function, the audio blob is sent for transcribing to the transcribe API endpoint. 2. The transcription response text is appended to the existing textarea content. 3. When the note is saved by calling the `saveNote` function, the audio recordings are uploaded first to R2 by using the upload endpoint we created earlier. Then, the actual note content along with the audioUrls (the R2 object keys) are saved by calling the notes post endpoint. ### Create a new page route for showing the component You can use this component in a Nuxt page to show it to the user. But before that you need to modify your `app.vue` file. Update the content of your `app.vue` to the following: ```vue title="/app/app.vue" ``` The above code allows for a nuxt page to be shown to the user, apart from showing an app header and a navigation sidebar. Next, add a new file named `new.vue` inside the `app/pages` folder, add the following code to it: ```vue title="app/pages/new.vue" ``` The above code shows the `CreateNote` component inside a modal, and navigates back to the home page on successful note creation. ## 6. Showing the notes on the client side To show the notes from the database on the client side, create an API endpoint first that will interact with the database. ### Create an API endpoint to fetch notes from the database Create a new file named `index.get.ts` inside the `server/api/notes` directory, and add the following code to it: ```ts title="server/api/index.get.ts" import type { Note } from "~~/types"; export default defineEventHandler(async (event) => { const { cloudflare } = event.context; const res = await cloudflare.env.DB.prepare( `SELECT id, text, audio_urls AS audioUrls, created_at AS createdAt, updated_at AS updatedAt FROM notes ORDER BY created_at DESC LIMIT 50;`, ).all & { audioUrls: string | null }>(); return res.results.map((note) => ({ ...note, audioUrls: note.audioUrls ? JSON.parse(note.audioUrls) : undefined, })); }); ``` The above code fetches the last 50 notes from the database, ordered by their creation date in descending order. The `audio_urls` field is stored as a string in the database, but it's converted to an array using `JSON.parse` to handle multiple audio files seamlessly on the client side. Next, create a page named `index.vue` inside the `app/pages` directory. This will be the home page of the application. Add the following code to it: ```vue title="app/pages/index.vue" ``` The above code fetches the notes from the database by calling the `/api/notes` endpoint you created just now, and renders them as note cards. ### Serving the saved recordings from R2 To be able to play the audio recordings of these notes, you need to serve the saved recordings from the R2 storage. Create a new file named `[...pathname].get.ts` inside the `server/routes/recordings` directory, and add the following code to it: :::note The `...` prefix in the file name makes it a catch all route. 
This allows it to receive all events that are meant for paths starting with `/recordings` prefix. This is where the `recordings` prefix that was added previously while saving the recordings becomes helpful. ::: ```ts title="server/routes/recordings/[...pathname].get.ts" export default defineEventHandler(async (event) => { const { cloudflare, params } = event.context; const { pathname } = params || {}; return cloudflare.env.R2.get(`recordings/${pathname}`); }); ``` The above code extracts the path name from the event params, and serves the saved recording matching that object key from the R2 bucket. ## 7. [Optional] Post Processing the transcriptions Even though the speech-to-text transcriptions models perform satisfactorily, sometimes you want to post process the transcriptions for various reasons. It could be to remove any discrepancy, or to change the tone/style of the final text. ### Create a settings page Create a new file named `settings.vue` in the `app/pages` folder, and add the following code to it: ```vue title="app/pages/settings.vue" ``` The above code renders a toggle button that enables/disables the post processing of transcriptions. If enabled, users can change the prompt that will used while post processing the transcription with an AI model. The transcription settings are saved using useStorageAsync, which utilizes the browser's local storage. This ensures that users' preferences are retained even after refreshing the page. ### Send the post processing prompt with recorded audio Modify the `CreateNote` component to send the post processing prompt along with the audio blob, while calling the `transcribe` API endpoint. ```vue title="app/components/CreateNote.vue" ins={2, 6-9, 17-22} ``` The code blocks added above checks for the saved post processing setting. If enabled, and there is a defined prompt, it sends the prompt to the `transcribe` API endpoint. ### Handle post processing in the transcribe API endpoint Modify the transcribe API endpoint, and update it to the following: ```ts title="server/api/transcribe.post.ts" ins={9-20, 22} export default defineEventHandler(async (event) => { const { cloudflare } = event.context; const form = await readFormData(event); const blob = form.get("audio") as Blob; if (!blob) { throw createError({ statusCode: 400, message: "缺少要转录的音频 blob", }); } try { const response = await cloudflare.env.AI.run("@cf/openai/whisper", { audio: [...new Uint8Array(await blob.arrayBuffer())], }); const postProcessingPrompt = form.get("prompt") as string; if (postProcessingPrompt && response.text) { const postProcessResult = await cloudflare.env.AI.run( "@cf/meta/llama-3.1-8b-instruct", { temperature: 0.3, prompt: `${postProcessingPrompt}.\n\nText:\n\n${response.text}\n\nResponse:`, }, ); return (postProcessResult as { response?: string }).response; } else { return response.text; } } catch (err) { console.error("转录音频时出错:", err); throw createError({ statusCode: 500, message: "转录音频失败。请重试。", }); } }); ``` The above code does the following: 1. Extracts the post processing prompt from the event FormData. 2. If present, it calls the Workers AI API to process the transcription text using the `@cf/meta/llama-3.1-8b-instruct` model. 3. Finally, it returns the response from Workers AI to the client. ## 8. Deploy the application Now you are ready to deploy the project to a `.workers.dev` sub-domain by running the deploy command. You can preview your application at `..workers.dev`. 
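The exact command depends on how the project was scaffolded. As a rough sketch, assuming the template provides the usual Nitro Cloudflare build output and a Wrangler configuration (your project may wrap both steps in a single `npm run deploy` script), the deploy step looks something like this:

```sh
# Build the Nuxt app, then publish the output with Wrangler
npm run build
npx wrangler deploy
```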
:::note
If you used `pnpm` as your package manager, you may face build errors like `"stdin" is not exported by "node_modules/.pnpm/unenv@1.10.0/node_modules/unenv/runtime/node/process/index.mjs"`. To resolve it, you can try hoisting your node modules with the [`shamefully-hoist=true`](https://pnpm.io/npmrc) option.
:::

## Conclusion

In this tutorial, you have gone through the steps of building a voice notes application using Nuxt 3, Cloudflare Workers, D1, and R2 storage. You learned to:

- Set up the backend to store and manage notes
- Create API endpoints to fetch and display notes
- Handle audio recordings
- Implement optional post-processing for transcriptions
- Deploy the application using the Cloudflare module syntax

The complete source code of the project is available on GitHub. You can go through it to see the code for the various frontend components not covered in this article. You can find it here: [github.com/ra-jeev/vnotes](https://github.com/ra-jeev/vnotes).

---

# 使用 Cloudflare Workers AI 的 Whisper-large-v3-turbo

URL: https://developers.cloudflare.com/workers-ai/guides/tutorials/build-a-workers-ai-whisper-with-chunking/

在本教程中,您将学习如何:

- **转录大型音频文件:** 使用 Cloudflare Workers AI 的 [Whisper-large-v3-turbo](/workers-ai/models/whisper-large-v3-turbo/) 模型执行自动语音识别(ASR)或翻译。
- **处理大型文件:** 将大型音频文件分割成更小的块进行处理,这有助于克服内存和执行时间的限制。
- **使用 Cloudflare Workers 进行部署:** 在无服务器环境中创建可扩展、低延迟的转录管道。

## 1. 创建一个新的 Cloudflare Worker 项目

import { Render, PackageManagers, WranglerConfig } from "~/components";

您将使用 `create-cloudflare` CLI (C3) 创建一个新的 Worker 项目。[C3](https://github.com/cloudflare/workers-sdk/tree/main/packages/create-cloudflare) 是一个命令行工具,旨在帮助您设置和部署新的应用程序到 Cloudflare。

通过运行以下命令创建一个名为 `whisper-tutorial` 的新项目:

运行 `npm create cloudflare@latest` 将提示您安装 [`create-cloudflare` 包](https://www.npmjs.com/package/create-cloudflare),并引导您完成设置。C3 还将安装 [Wrangler](/workers/wrangler/),即 Cloudflare 开发者平台 CLI。

这将创建一个新的 `whisper-tutorial` 目录。您的新 `whisper-tutorial` 目录将包括:

- `src/index.ts` 中的一个 `"Hello World"` [Worker](/workers/get-started/guide/#3-write-code)。
- 一个 [`wrangler.jsonc`](/workers/wrangler/configuration/) 配置文件。

转到您的应用程序目录:

```sh
cd whisper-tutorial
```

## 2. 将您的 Worker 连接到 Workers AI

您必须为您的 Worker 创建一个 AI 绑定以连接到 Workers AI。[绑定](/workers/runtime-apis/bindings/)允许您的 Workers 与 Cloudflare 开发者平台上的资源(如 Workers AI)进行交互。

要将 Workers AI 绑定到您的 Worker,请将以下内容添加到 `wrangler.toml` 文件的末尾:

```toml
[ai]
binding = "AI"
```

您的绑定在您的 Worker 代码中的 [`env.AI`](/workers/runtime-apis/handlers/fetch/) 上[可用](/workers/reference/migrate-to-module-workers/#bindings-in-es-modules-format)。

## 3. 配置 Wrangler

在您的 wrangler 文件中,添加或更新以下设置以启用 Node.js API 和 polyfill(兼容性日期为 2024-09-23 或更晚):

```toml title="wrangler.toml"
compatibility_flags = [ "nodejs_compat" ]
compatibility_date = "2024-09-23"
```

## 4. 使用分块处理大型音频文件
将 `src/index.ts` 文件的内容替换为以下集成代码。此示例演示了如何:

(1) 从查询参数中提取音频文件 URL。
(2) 在明确遵循重定向的情况下获取音频文件。
(3) 将音频文件分割成更小的块(例如 1 MB 的块)。
(4) 通过 Cloudflare AI 绑定使用 Whisper-large-v3-turbo 模型转录每个块。
(5) 以纯文本形式返回聚合的转录。

```ts
import { Buffer } from "node:buffer";
import type { Ai } from "workers-ai";

export interface Env {
  AI: Ai;
  // 如果需要,添加您的 KV 命名空间以存储转录。
  // MY_KV_NAMESPACE: KVNamespace;
}

/**
 * 从提供的 URL 获取音频文件并将其分割成块。
 * 此函数明确遵循重定向。
 *
 * @param audioUrl - 音频文件的 URL。
 * @returns 一个 ArrayBuffer 数组,每个代表一个音频块。
 */
async function getAudioChunks(audioUrl: string): Promise<ArrayBuffer[]> {
  const response = await fetch(audioUrl, { redirect: "follow" });
  if (!response.ok) {
    throw new Error(`获取音频失败:${response.status}`);
  }
  const arrayBuffer = await response.arrayBuffer();

  // 示例:将音频分割成 1MB 的块。
  const chunkSize = 1024 * 1024; // 1MB
  const chunks: ArrayBuffer[] = [];
  for (let i = 0; i < arrayBuffer.byteLength; i += chunkSize) {
    const chunk = arrayBuffer.slice(i, i + chunkSize);
    chunks.push(chunk);
  }
  return chunks;
}

/**
 * 使用 Whisper-large-v3-turbo 模型转录单个音频块。
 * 该函数将音频块转换为 Base64 编码的字符串,并
 * 通过 AI 绑定将其发送到模型。
 *
 * @param chunkBuffer - 作为 ArrayBuffer 的音频块。
 * @param env - Cloudflare Worker 环境,包括 AI 绑定。
 * @returns 来自模型的转录文本。
 */
async function transcribeChunk(
  chunkBuffer: ArrayBuffer,
  env: Env,
): Promise<string> {
  const base64 = Buffer.from(chunkBuffer, "binary").toString("base64");
  const res = await env.AI.run("@cf/openai/whisper-large-v3-turbo", {
    audio: base64,
    // 可选参数(如果需要,取消注释并设置):
    // task: "transcribe",   // 或 "translate"
    // language: "en",
    // vad_filter: "false",
    // initial_prompt: "如果需要,提供上下文。",
    // prefix: "转录:",
  });

  return res.text; // 假设转录结果包括一个 "text" 属性。
}

/**
 * 主 fetch 处理程序。它提取 'url' 查询参数,获取音频,
 * 以块为单位处理它,并返回完整的转录。
 */
export default {
  async fetch(
    request: Request,
    env: Env,
    ctx: ExecutionContext,
  ): Promise<Response> {
    // 从查询参数中提取音频 URL。
    const { searchParams } = new URL(request.url);
    const audioUrl = searchParams.get("url");

    if (!audioUrl) {
      return new Response("缺少 'url' 查询参数", { status: 400 });
    }

    // 获取音频块。
    const audioChunks: ArrayBuffer[] = await getAudioChunks(audioUrl);
    let fullTranscript = "";

    // 处理每个块并构建完整的转录。
    for (const chunk of audioChunks) {
      try {
        const transcript = await transcribeChunk(chunk, env);
        fullTranscript += transcript + "\n";
      } catch (error) {
        fullTranscript += "[转录块时出错]\n";
      }
    }

    return new Response(fullTranscript, {
      headers: { "Content-Type": "text/plain" },
    });
  },
} satisfies ExportedHandler<Env>;
```

## 5. 部署您的 Worker

1. **在本地运行 Worker:** 使用 wrangler 的开发模式在本地测试您的 Worker:

```sh
npx wrangler dev
```

打开您的浏览器并转到 [http://localhost:8787](http://localhost:8787),或使用 curl:

```sh
curl "http://localhost:8787?url=https://raw.githubusercontent.com/your-username/your-repo/main/your-audio-file.mp3"
```

将 URL 查询参数替换为您的音频文件的直接链接。(对于 GitHub 托管的文件,请确保使用原始文件 URL。)

2. **部署 Worker:** 测试完成后,使用以下命令部署您的 Worker:

```sh
npx wrangler deploy
```

3. **测试已部署的 Worker:** 部署后,通过将音频 URL 作为查询参数传递来测试您的 Worker:

```sh
curl "https://.workers.dev?url=https://raw.githubusercontent.com/your-username/your-repo/main/your-audio-file.mp3"
```

确保将 ``、`your-username`、`your-repo` 和 `your-audio-file.mp3` 替换为您的实际详细信息。

如果成功,Worker 将返回音频文件的转录:

```sh
这是音频的转录...
``` --- # 使用 Workers AI 构建面试练习工具 URL: https://developers.cloudflare.com/workers-ai/guides/tutorials/build-ai-interview-practice-tool/ import { Render, PackageManagers } from "~/components"; 求职面试可能会让人感到压力,而练习是建立信心的关键。虽然与朋友或导师进行的传统模拟面试很有价值,但并非总能在需要时获得。在本教程中,您将学习如何构建一个由 AI 驱动的面试练习工具,该工具可提供实时反馈以帮助提高面试技巧。 在本教程结束时,您将构建一个完整的面试练习工具,具有以下核心功能: - 使用 WebSocket 连接的实时面试模拟工具 - 将音频转换为文本的 AI 驱动的语音处理管道 - 提供类似面试官互动的智能响应系统 - 使用 Durable Objects 管理面试会话和历史记录的持久存储系统 ### 先决条件 本教程演示了如何使用多个 Cloudflare 产品,虽然许多功能在免费套餐中可用,但 Workers AI 的某些组件可能会产生基于使用量的费用。在继续之前,请查看 Workers AI 的定价文档。 ## 1. 创建一个新的 Worker 项目 使用 Create Cloudflare CLI (C3) 工具和 Hono 框架创建一个 Cloudflare Workers 项目。 :::note [Hono](https://hono.dev) 是一个轻量级的 Web 框架,有助于构建 API 端点和处理 HTTP 请求。本教程使用 Hono 来创建和管理应用程序的路由和中间件组件。 ::: 通过运行以下命令创建一个新的 Worker 项目,使用 `ai-interview-tool` 作为 Worker 名称: 要在本地开发和测试您的 Cloudflare Workers 应用程序: 1. 在您的终端中导航到您的 Workers 项目目录: ```sh cd ai-interview-tool ``` 2. 通过运行以下命令启动开发服务器: ```sh npx wrangler dev ``` 当您运行 `wrangler dev` 时,该命令会启动一个本地开发服务器,并提供一个 `localhost` URL,您可以在其中预览您的应用程序。 您现在可以对代码进行更改,并在提供的 localhost 地址上实时查看它们。 ## 2. 为面试系统定义 TypeScript 类型 项目设置好后,创建将构成面试系统基础的 TypeScript 类型。这些类型将帮助您维护类型安全,并为应用程序的不同组件提供清晰的接口。 创建一个新的 `types.ts` 文件,其中将包含以下内容的基本类型和枚举: - 可以评估的面试技能(JavaScript、React 等) - 不同的面试职位(初级开发人员、高级开发人员等) - 面试状态跟踪 - 用户与 AI 之间的消息处理 - 核心面试数据结构 ```typescript title="src/types.ts" import { Context } from "hono"; // API 端点的上下文类型,包括环境绑定和用户信息 export interface ApiContext { Bindings: CloudflareBindings; Variables: { username: string; }; } export type HonoCtx = Context; // 您可以在模拟面试期间评估的技术技能列表。 // 此应用程序侧重于在真实面试中通常测试的流行 Web 技术和编程语言。 export enum InterviewSkill { JavaScript = "JavaScript", TypeScript = "TypeScript", React = "React", NodeJS = "NodeJS", Python = "Python", } // 基于不同工程职位的可用面试类型。 // 这有助于根据候选人的目标职位定制面试体验和问题。 export enum InterviewTitle { JuniorDeveloper = "初级开发人员面试", SeniorDeveloper = "高级开发人员面试", FullStackDeveloper = "全栈开发人员面试", FrontendDeveloper = "前端开发人员面试", BackendDeveloper = "后端开发人员面试", SystemArchitect = "系统架构师面试", TechnicalLead = "技术主管面试", } // 跟踪面试会话的当前状态。 // 这将帮助您管理面试流程,并在流程的每个阶段显示适当的 UI/操作。 export enum InterviewStatus { Created = "created", // 面试已创建但未开始 Pending = "pending", // 等待面试官/系统 InProgress = "in_progress", // 进行中的面试会话 Completed = "completed", // 面试成功完成 Cancelled = "cancelled", // 面试提前终止 } // 定义在面试聊天中发送消息的人 export type MessageRole = "user" | "assistant" | "system"; // 面试期间交换的单个消息的结构 export interface Message { messageId: string; // 消息的唯一标识符 interviewId: string; // 将消息链接到特定面试 role: MessageRole; // 谁发送了消息 content: string; // 实际消息内容 timestamp: number; // 消息发送时间 } // 保存有关面试会话的所有信息的主要数据结构。 // 这包括元数据、交换的消息和当前状态。 export interface InterviewData { interviewId: string; title: InterviewTitle; skills: InterviewSkill[]; messages: Message[]; status: InterviewStatus; createdAt: number; updatedAt: number; } // 创建新面试会话的输入格式。 // 简化接口,接受开始面试所需的基本参数。 export interface InterviewInput { title: string; skills: string[]; } ``` ## 3. 
为不同服务配置错误类型 接下来,设置自定义错误类型以处理应用程序中可能发生的不同类型的错误。这包括: - 数据库错误(例如,连接问题、查询失败) - 与面试相关的错误(例如,无效输入、转录失败) - 身份验证错误(例如,无效会话) 创建以下 `errors.ts` 文件: ```typescript title="src/errors.ts" export const ErrorCodes = { INVALID_MESSAGE: "INVALID_MESSAGE", TRANSCRIPTION_FAILED: "TRANSCRIPTION_FAILED", LLM_FAILED: "LLM_FAILED", DATABASE_ERROR: "DATABASE_ERROR", } as const; export class AppError extends Error { constructor( message: string, public statusCode: number, ) { super(message); this.name = this.constructor.name; } } export class UnauthorizedError extends AppError { constructor(message: string) { super(message, 401); } } export class BadRequestError extends AppError { constructor(message: string) { super(message, 400); } } export class NotFoundError extends AppError { constructor(message: string) { super(message, 404); } } export class InterviewError extends Error { constructor( message: string, public code: string, public statusCode: number = 500, ) { super(message); this.name = "InterviewError"; } } ``` ## 4. 配置身份验证中间件和用户路由 在此步骤中,您将实现一个基本的身份验证系统,以跟踪和识别与您的 AI 面试练习工具交互的用户。该系统使用仅 HTTP 的 cookie 来存储用户名,使您能够识别请求发送者及其相应的 Durable Object。这种直接的身份验证方法要求用户提供一个用户名,然后将其安全地存储在 cookie 中。这种方法使您能够: - 跨请求识别用户 - 将面试会话与特定用户关联 - 保护对与面试相关的端点的访问 ### 创建身份验证中间件 创建一个中间件函数,用于检查是否存在有效的身份验证 cookie。此中间件将用于保护需要身份验证的路由。 创建一个新的中间件文件 `middleware/auth.ts`: ```typescript title="src/middleware/auth.ts" import { Context } from "hono"; import { getCookie } from "hono/cookie"; import { UnauthorizedError } from "../errors"; export const requireAuth = async (ctx: Context, next: () => Promise) => { // Get username from cookie const username = getCookie(ctx, "username"); if (!username) { throw new UnauthorizedError("User is not logged in"); } // Make username available to route handlers ctx.set("username", username); await next(); }; ``` This middleware: - Checks for a `username` cookie - Throws an `Error` if the cookie is missing - Makes the username available to downstream handlers via the context ### Create Authentication Routes Next, create the authentication routes that will handle user login. Create a new file `routes/auth.ts`: ```typescript title="src/routes/auth.ts" import { Context, Hono } from "hono"; import { setCookie } from "hono/cookie"; import { BadRequestError } from "../errors"; import { ApiContext } from "../types"; export const authenticateUser = async (ctx: Context) => { // Extract username from request body const { username } = await ctx.req.json(); // Make sure username was provided if (!username) { throw new BadRequestError("Username is required"); } // Create a secure cookie to track the user's session // This cookie will: // - Be HTTP-only for security (no JS access) // - Work across all routes via path="/" // - Last for 24 hours // - Only be sent in same-site requests to prevent CSRF setCookie(ctx, "username", username, { httpOnly: true, path: "/", maxAge: 60 * 60 * 24, sameSite: "Strict", }); // Let the client know login was successful return ctx.json({ success: true }); }; // Set up authentication-related routes export const configureAuthRoutes = () => { const router = new Hono(); // POST /login - Authenticate user and create session router.post("/login", authenticateUser); return router; }; ``` Finally, update main application file to include the authentication routes. 
Modify `src/index.ts`: ```typescript title="src/index.ts" import { configureAuthRoutes } from "./routes/auth"; import { Hono } from "hono"; import { logger } from "hono/logger"; import type { ApiContext } from "./types"; import { requireAuth } from "./middleware/auth"; // Create our main Hono app instance with proper typing const app = new Hono(); // Create a separate router for API endpoints to keep things organized const api = new Hono(); // Set up global middleware that runs on every request // - Logger gives us visibility into what is happening app.use("*", logger()); // Wire up all our authentication routes (login, etc) // These will be mounted under /api/v1/auth/ api.route("/auth", configureAuthRoutes()); // Mount all API routes under the version prefix (for example, /api/v1) // This allows us to make breaking changes in v2 without affecting v1 users app.route("/api/v1", api); export default app; ``` Now we have a basic authentication system that: 1. Provides a login endpoint at `/api/v1/auth/login` 2. Securely stores the username in a cookie 3. Includes middleware to protect authenticated routes ## 5. Create a Durable Object to manage interviews Now that you have your authentication system in place, create a Durable Object to manage interview sessions. Durable Objects are perfect for this interview practice tool because they provide the following functionalities: - Maintains states between connections, so users can reconnect without losing progress. - Provides a SQLite database to store all interview Q&A, feedback and metrics. - Enables smooth real-time interactions between the interviewer AI and candidate. - Handles multiple interview sessions efficiently without performance issues. - Creates a dedicated instance for each user, giving them their own isolated environment. First, you will need to configure the Durable Object in Wrangler file. Add the following configuration: ```toml title="wrangler.toml" [[durable_objects.bindings]] name = "INTERVIEW" class_name = "Interview" [[migrations]] tag = "v1" new_sqlite_classes = ["Interview"] ``` Next, create a new file `interview.ts` to define our Interview Durable Object: ```typescript title="src/interview.ts" import { DurableObject } from "cloudflare:workers"; export class Interview extends DurableObject { // We will use it to keep track of all active WebSocket connections for real-time communication private sessions: Map; constructor(state: DurableObjectState, env: CloudflareBindings) { super(state, env); // Initialize empty sessions map - we will add WebSocket connections as users join this.sessions = new Map(); } // Entry point for all HTTP requests to this Durable Object // This will handle both initial setup and WebSocket upgrades async fetch(request: Request) { // For now, just confirm the object is working // We'll add WebSocket upgrade logic and request routing later return new Response("Interview object initialized"); } // Broadcasts a message to all connected WebSocket clients. private broadcast(message: string) { this.ctx.getWebSockets().forEach((ws) => { try { if (ws.readyState === WebSocket.OPEN) { ws.send(message); } } catch (error) { console.error( "Error broadcasting message to a WebSocket client:", error, ); } }); } } ``` Now we need to export the Durable Object in our main `src/index.ts` file: ```typescript title="src/index.ts" import { Interview } from "./interview"; // ... previous code ... 
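// Exporting the Durable Object class from the Worker entry module is required:
// the runtime looks it up by the class_name declared in the Wrangler file, so the
// INTERVIEW binding and the new_sqlite_classes migration can attach to it.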
export { Interview }; export default app; ``` Since the Worker code is written in TypeScript, you should run the following command to add the necessary type definitions: ```sh npm run cf-typegen ``` ### Set up SQLite database schema to store interview data Now you will use SQLite at the Durable Object level for data persistence. This gives each user their own isolated database instance. You will need two main tables: - `interviews`: Stores interview session data - `messages`: Stores all messages exchanged during interviews Before you create these tables, create a service class to handle your database operations. This encapsulates database logic and helps you: - Manage database schema changes - Handle errors consistently - Keep database queries organized Create a new file called `services/InterviewDatabaseService.ts`: ```typescript title="src/services/InterviewDatabaseService.ts" import { InterviewData, Message, InterviewStatus, InterviewTitle, InterviewSkill, } from "../types"; import { InterviewError, ErrorCodes } from "../errors"; const CONFIG = { database: { tables: { interviews: "interviews", messages: "messages", }, indexes: { messagesByInterview: "idx_messages_interviewId", }, }, } as const; export class InterviewDatabaseService { constructor(private sql: SqlStorage) {} /** * Sets up the database schema by creating tables and indexes if they do not exist. * This is called when initializing a new Durable Object instance to ensure * we have the required database structure. * * The schema consists of: * - interviews table: Stores interview metadata like title, skills, and status * - messages table: Stores the conversation history between user and AI * - messages index: Helps optimize queries when fetching messages for a specific interview */ createTables() { try { // Get list of existing tables to avoid recreating them const cursor = this.sql.exec(`PRAGMA table_list`); const existingTables = new Set([...cursor].map((table) => table.name)); // The interviews table is our main table storing interview sessions. // We only create it if it does not exist yet. if (!existingTables.has(CONFIG.database.tables.interviews)) { this.sql.exec(InterviewDatabaseService.QUERIES.CREATE_INTERVIEWS_TABLE); } // The messages table stores the actual conversation history. // It references interviews table via foreign key for data integrity. if (!existingTables.has(CONFIG.database.tables.messages)) { this.sql.exec(InterviewDatabaseService.QUERIES.CREATE_MESSAGES_TABLE); } // Add an index on interviewId to speed up message retrieval. // This is important since we will frequently query messages by interview. this.sql.exec(InterviewDatabaseService.QUERIES.CREATE_MESSAGE_INDEX); } catch (error: unknown) { const message = error instanceof Error ? 
error.message : String(error); throw new InterviewError( `Failed to initialize database: ${message}`, ErrorCodes.DATABASE_ERROR, ); } } private static readonly QUERIES = { CREATE_INTERVIEWS_TABLE: ` CREATE TABLE IF NOT EXISTS interviews ( interviewId TEXT PRIMARY KEY, title TEXT NOT NULL, skills TEXT NOT NULL, createdAt INTEGER NOT NULL DEFAULT (strftime('%s','now') * 1000), updatedAt INTEGER NOT NULL DEFAULT (strftime('%s','now') * 1000), status TEXT NOT NULL DEFAULT 'pending' ) `, CREATE_MESSAGES_TABLE: ` CREATE TABLE IF NOT EXISTS messages ( messageId TEXT PRIMARY KEY, interviewId TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL, timestamp INTEGER NOT NULL, FOREIGN KEY (interviewId) REFERENCES interviews(interviewId) ) `, CREATE_MESSAGE_INDEX: ` CREATE INDEX IF NOT EXISTS idx_messages_interview ON messages(interviewId) `, }; } ``` Update the `Interview` Durable Object to use the database service by modifying `src/interview.ts`: ```typescript title="src/interview.ts" import { InterviewDatabaseService } from "./services/InterviewDatabaseService"; export class Interview extends DurableObject { // Database service for persistent storage of interview data and messages private readonly db: InterviewDatabaseService; private sessions: Map; constructor(state: DurableObjectState, env: CloudflareBindings) { // ... previous code ... // Set up our database connection using the DO's built-in SQLite instance this.db = new InterviewDatabaseService(state.storage.sql); // First-time setup: ensure our database tables exist // This is idempotent so safe to call on every instantiation this.db.createTables(); } } ``` Add methods to create and retrieve interviews in `services/InterviewDatabaseService.ts`: ```typescript title="src/services/InterviewDatabaseService.ts" export class InterviewDatabaseService { /** * Creates a new interview session in the database. * * This is the main entry point for starting a new interview. It handles all the * initial setup like: * - Generating a unique ID using crypto.randomUUID() for reliable uniqueness * - Recording the interview title and required skills * - Setting up timestamps for tracking interview lifecycle * - Setting the initial status to "Created" * */ createInterview(title: InterviewTitle, skills: InterviewSkill[]): string { try { const interviewId = crypto.randomUUID(); const currentTime = Date.now(); this.sql.exec( InterviewDatabaseService.QUERIES.INSERT_INTERVIEW, interviewId, title, JSON.stringify(skills), // Store skills as JSON for flexibility InterviewStatus.Created, currentTime, currentTime, ); return interviewId; } catch (error: unknown) { const message = error instanceof Error ? error.message : String(error); throw new InterviewError( `Failed to create interview: ${message}`, ErrorCodes.DATABASE_ERROR, ); } } /** * Fetches all interviews from the database, ordered by creation date. * * This is useful for displaying interview history and letting users * resume previous sessions. We order by descending creation date since * users typically want to see their most recent interviews first. * * Returns an array of InterviewData objects with full interview details * including metadata and message history. */ getAllInterviews(): InterviewData[] { try { const cursor = this.sql.exec( InterviewDatabaseService.QUERIES.GET_ALL_INTERVIEWS, ); return [...cursor].map(this.parseInterviewRecord); } catch (error) { const message = error instanceof Error ? 
error.message : String(error); throw new InterviewError( `Failed to retrieve interviews: ${message}`, ErrorCodes.DATABASE_ERROR, ); } } // Retrieves an interview and its messages by ID getInterview(interviewId: string): InterviewData | null { try { const cursor = this.sql.exec( InterviewDatabaseService.QUERIES.GET_INTERVIEW, interviewId, ); const record = [...cursor][0]; if (!record) return null; return this.parseInterviewRecord(record); } catch (error: unknown) { const message = error instanceof Error ? error.message : String(error); throw new InterviewError( `Failed to retrieve interview: ${message}`, ErrorCodes.DATABASE_ERROR, ); } } addMessage( interviewId: string, role: Message["role"], content: string, messageId: string, ): Message { try { const timestamp = Date.now(); this.sql.exec( InterviewDatabaseService.QUERIES.INSERT_MESSAGE, messageId, interviewId, role, content, timestamp, ); return { messageId, interviewId, role, content, timestamp, }; } catch (error: unknown) { const message = error instanceof Error ? error.message : String(error); throw new InterviewError( `Failed to add message: ${message}`, ErrorCodes.DATABASE_ERROR, ); } } /** * Transforms raw database records into structured InterviewData objects. * * This helper does the heavy lifting of: * - Type checking critical fields to catch database corruption early * - Converting stored JSON strings back into proper objects * - Filtering out any null messages that might have snuck in * - Ensuring timestamps are proper numbers * * If any required data is missing or malformed, it throws an error * rather than returning partially valid data that could cause issues * downstream. */ private parseInterviewRecord(record: any): InterviewData { const interviewId = record.interviewId as string; const createdAt = Number(record.createdAt); const updatedAt = Number(record.updatedAt); if (!interviewId || !createdAt || !updatedAt) { throw new InterviewError( "Invalid interview data in database", ErrorCodes.DATABASE_ERROR, ); } return { interviewId, title: record.title as InterviewTitle, skills: JSON.parse(record.skills as string) as InterviewSkill[], messages: record.messages ? JSON.parse(record.messages) .filter((m: any) => m !== null) .map((m: any) => ({ messageId: m.messageId, role: m.role, content: m.content, timestamp: m.timestamp, })) : [], status: record.status as InterviewStatus, createdAt, updatedAt, }; } // Add these SQL queries to the QUERIES object private static readonly QUERIES = { // ... previous queries ... INSERT_INTERVIEW: ` INSERT INTO ${CONFIG.database.tables.interviews} (interviewId, title, skills, status, createdAt, updatedAt) VALUES (?, ?, ?, ?, ?, ?) `, GET_ALL_INTERVIEWS: ` SELECT interviewId, title, skills, createdAt, updatedAt, status FROM ${CONFIG.database.tables.interviews} ORDER BY createdAt DESC `, INSERT_MESSAGE: ` INSERT INTO ${CONFIG.database.tables.messages} (messageId, interviewId, role, content, timestamp) VALUES (?, ?, ?, ?, ?) `, GET_INTERVIEW: ` SELECT i.interviewId, i.title, i.skills, i.status, i.createdAt, i.updatedAt, COALESCE( json_group_array( CASE WHEN m.messageId IS NOT NULL THEN json_object( 'messageId', m.messageId, 'role', m.role, 'content', m.content, 'timestamp', m.timestamp ) END ), '[]' ) as messages FROM ${CONFIG.database.tables.interviews} i LEFT JOIN ${CONFIG.database.tables.messages} m ON i.interviewId = m.interviewId WHERE i.interviewId = ? GROUP BY i.interviewId `, }; } ``` Add RPC methods to the `Interview` Durable Object to expose database operations through API. 
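These methods are exposed over the Durable Object's RPC interface, which is what the route handlers in the next section rely on: the stub returned by `env.INTERVIEW.get()` lets the Worker call the class's public methods directly. A minimal sketch (the `username` variable is an assumption standing in for the authenticated user):

```ts
// Illustrative only: with RPC-enabled Durable Objects, the stub exposes the
// class's public methods, so no custom fetch protocol is needed for these calls.
const id = env.INTERVIEW.idFromName(username);
const interviewDO = env.INTERVIEW.get(id);
const interviews = await interviewDO.getAllInterviews();
```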
Add this code to `src/interview.ts`: ```typescript title="src/interview.ts" import { InterviewData, InterviewTitle, InterviewSkill, Message, } from "./types"; export class Interview extends DurableObject { // Creates a new interview session createInterview(title: InterviewTitle, skills: InterviewSkill[]): string { return this.db.createInterview(title, skills); } // Retrieves all interview sessions getAllInterviews(): InterviewData[] { return this.db.getAllInterviews(); } // Adds a new message to the 'messages' table and broadcasts it to all connected WebSocket clients. addMessage( interviewId: string, role: "user" | "assistant", content: string, messageId: string, ): Message { const newMessage = this.db.addMessage( interviewId, role, content, messageId, ); this.broadcast( JSON.stringify({ ...newMessage, type: "message", }), ); return newMessage; } } ``` ## 6. Create REST API endpoints With your Durable Object and database service ready, create REST API endpoints to manage interviews. You will need endpoints to: - Create new interviews - Retrieve all interviews for a user Create a new file for your interview routes at `routes/interview.ts`: ```typescript title="src/routes/interview.ts" import { Hono } from "hono"; import { BadRequestError } from "../errors"; import { InterviewInput, ApiContext, HonoCtx, InterviewTitle, InterviewSkill, } from "../types"; import { requireAuth } from "../middleware/auth"; /** * Gets the Interview Durable Object instance for a given user. * We use the username as a stable identifier to ensure each user * gets their own dedicated DO instance that persists across requests. */ const getInterviewDO = (ctx: HonoCtx) => { const username = ctx.get("username"); const id = ctx.env.INTERVIEW.idFromName(username); return ctx.env.INTERVIEW.get(id); }; /** * Validates the interview creation payload. * Makes sure we have all required fields in the correct format: * - title must be present * - skills must be a non-empty array * Throws an error if validation fails. */ const validateInterviewInput = (input: InterviewInput) => { if ( !input.title || !input.skills || !Array.isArray(input.skills) || input.skills.length === 0 ) { throw new BadRequestError("Invalid input"); } }; /** * GET /interviews * Retrieves all interviews for the authenticated user. * The interviews are stored and managed by the user's DO instance. */ const getAllInterviews = async (ctx: HonoCtx) => { const interviewDO = getInterviewDO(ctx); const interviews = await interviewDO.getAllInterviews(); return ctx.json(interviews); }; /** * POST /interviews * Creates a new interview session with the specified title and skills. * Each interview gets a unique ID that can be used to reference it later. * Returns the newly created interview ID on success. */ const createInterview = async (ctx: HonoCtx) => { const body = await ctx.req.json(); validateInterviewInput(body); const interviewDO = getInterviewDO(ctx); const interviewId = await interviewDO.createInterview( body.title as InterviewTitle, body.skills as InterviewSkill[], ); return ctx.json({ success: true, interviewId }); }; /** * Sets up all interview-related routes. 
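 * All of these routes are registered behind the requireAuth middleware below,
 * so requests must include the username cookie set by the login endpoint.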
* Currently supports: * - GET / : List all interviews * - POST / : Create a new interview */ export const configureInterviewRoutes = () => { const router = new Hono(); router.use("*", requireAuth); router.get("/", getAllInterviews); router.post("/", createInterview); return router; }; ``` The `getInterviewDO` helper function uses the username from our authentication cookie to create a unique Durable Object ID. This ensures each user has their own isolated interview state. Update your main application file to include the routes and protect them with authentication middleware. Update `src/index.ts`: ```typescript title="src/index.ts" import { configureAuthRoutes } from "./routes/auth"; import { configureInterviewRoutes } from "./routes/interview"; import { Hono } from "hono"; import { Interview } from "./interview"; import { logger } from "hono/logger"; import type { ApiContext } from "./types"; const app = new Hono(); const api = new Hono(); app.use("*", logger()); api.route("/auth", configureAuthRoutes()); api.route("/interviews", configureInterviewRoutes()); app.route("/api/v1", api); export { Interview }; export default app; ``` Now you have two new API endpoints: - `POST /api/v1/interviews`: Creates a new interview session - `GET /api/v1/interviews`: Retrieves all interviews for the authenticated user You can test these endpoints running the following command: 1. Create a new interview: ```sh curl -X POST http://localhost:8787/api/v1/interviews \ -H "Content-Type: application/json" \ -H "Cookie: username=testuser; HttpOnly" \ -d '{"title":"Frontend Developer Interview","skills":["JavaScript","React","CSS"]}' ``` 2. Get all interviews: ```sh curl http://localhost:8787/api/v1/interviews \ -H "Cookie: username=testuser; HttpOnly" ``` ## 7. Set up WebSockets to handle real-time communication With the basic interview management system in place, you will now implement Durable Objects to handle real-time message processing and maintain WebSocket connections. Update the `Interview` Durable Object to handle WebSocket connections by adding the following code to `src/interview.ts`: ```typescript title="src/interview.ts" export class Interview extends DurableObject { // Services for database operations and managing WebSocket sessions private readonly db: InterviewDatabaseService; private sessions: Map; constructor(state: DurableObjectState, env: CloudflareBindings) { // ... previous code ... 
// Keep WebSocket connections alive by automatically responding to pings // This prevents timeouts and connection drops this.ctx.setWebSocketAutoResponse( new WebSocketRequestResponsePair("ping", "pong"), ); } async fetch(request: Request): Promise { // Check if this is a WebSocket upgrade request const upgradeHeader = request.headers.get("Upgrade"); if (upgradeHeader?.toLowerCase().includes("websocket")) { return this.handleWebSocketUpgrade(request); } // If it is not a WebSocket request, we don't handle it return new Response("Not found", { status: 404 }); } private async handleWebSocketUpgrade(request: Request): Promise { // Extract the interview ID from the URL - it should be the last segment const url = new URL(request.url); const interviewId = url.pathname.split("/").pop(); if (!interviewId) { return new Response("Missing interviewId parameter", { status: 400 }); } // Create a new WebSocket connection pair - one for the client, one for the server const pair = new WebSocketPair(); const [client, server] = Object.values(pair); // Keep track of which interview this WebSocket is connected to // This is important for routing messages to the right interview session this.sessions.set(server, { interviewId }); // Tell the Durable Object to start handling this WebSocket this.ctx.acceptWebSocket(server); // Send the current interview state to the client right away // This helps initialize their UI with the latest data const interviewData = await this.db.getInterview(interviewId); if (interviewData) { server.send( JSON.stringify({ type: "interview_details", data: interviewData, }), ); } // Return the client WebSocket as part of the upgrade response return new Response(null, { status: 101, webSocket: client, }); } async webSocketClose( ws: WebSocket, code: number, reason: string, wasClean: boolean, ) { // Clean up when a connection closes to prevent memory leaks // This is especially important in long-running Durable Objects console.log( `WebSocket closed: Code ${code}, Reason: ${reason}, Clean: ${wasClean}`, ); } } ``` Next, update the interview routes to include a WebSocket endpoint. Add the following to `routes/interview.ts`: ```typescript title="src/routes/interview.ts" // ... previous code ... const streamInterviewProcess = async (ctx: HonoCtx) => { const interviewDO = getInterviewDO(ctx); return await interviewDO.fetch(ctx.req.raw); }; export const configureInterviewRoutes = () => { const router = new Hono(); router.get("/", getAllInterviews); router.post("/", createInterview); // Add WebSocket route router.get("/:interviewId", streamInterviewProcess); return router; }; ``` The WebSocket system provides real-time communication features for interview practice tool: - Each interview session gets its own dedicated WebSocket connection, allowing seamless communication between the candidate and AI interviewer - The Durable Object maintains the connection state, ensuring no messages are lost even if the client temporarily disconnects - To keep connections stable, it automatically responds to ping messages with pongs, preventing timeouts - Candidates and interviewers receive instant updates as the interview progresses, creating a natural conversational flow ## 8. Add audio processing capabilities with Workers AI Now that WebSocket connection set up, the next step is to add speech-to-text capabilities using Workers AI. Let's use Cloudflare's Whisper model to transcribe audio in real-time during the interview. The audio processing pipeline will work like this: 1. 
Client sends audio through the WebSocket connection 2. Our Durable Object receives the binary audio data 3. We pass the audio to Whisper for transcription 4. The transcribed text is saved as a new message 5. We immediately send the transcription back to the client 6. The client receives a notification that the AI interviewer is generating a response ### Create audio processing pipeline In this step you will update the Interview Durable Object to handle the following: 1. Detect binary audio data sent through WebSocket 2. Create a unique message ID for tracking the processing status 3. Notify clients that audio processing has begun 4. Include error handling for failed audio processing 5. Broadcast status updates to all connected clients First, update Interview Durable Object to handle binary WebSocket messages. Add the following methods to your `src/interview.ts` file: ```typescript title="src/interview.ts" // ... previous code ... /** * Handles incoming WebSocket messages, both binary audio data and text messages. * This is the main entry point for all WebSocket communication. */ async webSocketMessage(ws: WebSocket, eventData: ArrayBuffer | string): Promise { try { // Handle binary audio data from the client's microphone if (eventData instanceof ArrayBuffer) { await this.handleBinaryAudio(ws, eventData); return; } // Text messages will be handled by other methods } catch (error) { this.handleWebSocketError(ws, error); } } /** * Processes binary audio data received from the client. * Converts audio to text using Whisper and broadcasts processing status. */ private async handleBinaryAudio(ws: WebSocket, audioData: ArrayBuffer): Promise { try { const uint8Array = new Uint8Array(audioData); // Retrieve the associated interview session const session = this.sessions.get(ws); if (!session?.interviewId) { throw new Error("No interview session found"); } // Generate unique ID to track this message through the system const messageId = crypto.randomUUID(); // Let the client know we're processing their audio this.broadcast( JSON.stringify({ type: "message", status: "processing", role: "user", messageId, interviewId: session.interviewId, }), ); // TODO: Implement Whisper transcription in next section // For now, just log the received audio data size console.log(`Received audio data of length: ${uint8Array.length}`); } catch (error) { console.error("Audio processing failed:", error); this.handleWebSocketError(ws, error); } } /** * Handles WebSocket errors by logging them and notifying the client. * Ensures errors are properly communicated back to the user. */ private handleWebSocketError(ws: WebSocket, error: unknown): void { const errorMessage = error instanceof Error ? error.message : "An unknown error occurred."; console.error("WebSocket error:", errorMessage); if (ws.readyState === WebSocket.OPEN) { ws.send( JSON.stringify({ type: "error", message: errorMessage, }), ); } } ``` Your `handleBinaryAudio` method currently logs when it receives audio data. Next, you'll enhance it to transcribe speech using Workers AI's Whisper model. ### Configure speech-to-text Now that audio processing pipeline is set up, you will now integrate Workers AI's Whisper model for speech-to-text transcription. Configure the Worker AI binding in your Wrangler file by adding: ```toml # ... previous configuration ... [ai] binding = "AI" ``` Next, generate TypeScript types for our AI binding. Run the following command: ```sh npm run cf-typegen ``` You will need a new service class for AI operations. 
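Before adding that service, it may help to see what the browser side of this exchange could look like. The sketch below is an assumption for illustration (the tutorial does not cover the frontend code): it opens the WebSocket route registered earlier at `/api/v1/interviews/:interviewId` and streams microphone chunks as binary frames; the three-second chunk interval is arbitrary.

```ts
// Hypothetical browser-side sketch (not part of the Worker code): stream
// microphone audio to the interview WebSocket as binary messages, which the
// Durable Object's webSocketMessage handler above receives as ArrayBuffers.
async function streamAudio(interviewId: string) {
  const ws = new WebSocket(
    `wss://${location.host}/api/v1/interviews/${interviewId}`,
  );
  ws.binaryType = "arraybuffer";

  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);

  recorder.ondataavailable = async (event) => {
    if (event.data.size > 0 && ws.readyState === WebSocket.OPEN) {
      // Binary frames are routed to handleBinaryAudio on the server
      ws.send(await event.data.arrayBuffer());
    }
  };

  // Status updates (processing, transcriptions, AI replies) arrive as JSON text frames
  ws.onmessage = (event) => {
    console.log("Server update:", JSON.parse(event.data as string));
  };

  recorder.start(3000); // emit an audio chunk roughly every 3 seconds
}
```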
Create a new file called `services/AIService.ts`: ```typescript title="src/services/AIService.ts" import { InterviewError, ErrorCodes } from "../errors"; export class AIService { constructor(private readonly AI: Ai) {} async transcribeAudio(audioData: Uint8Array): Promise { try { // Call the Whisper model to transcribe the audio const response = await this.AI.run("@cf/openai/whisper-tiny-en", { audio: Array.from(audioData), }); if (!response?.text) { throw new Error("Failed to transcribe audio content."); } return response.text; } catch (error) { throw new InterviewError( "Failed to transcribe audio content", ErrorCodes.TRANSCRIPTION_FAILED, ); } } } ``` You will need to update the `Interview` Durable Object to use this new AI service. To do this, update the handleBinaryAudio method in `src/interview.ts`: ```typescript title="src/interview.ts" import { AIService } from "./services/AIService"; export class Interview extends DurableObject { private readonly aiService: AIService; constructor(state: DurableObjectState, env: Env) { // ... previous code ... // Initialize the AI service with the Workers AI binding this.aiService = new AIService(this.env.AI); } private async handleBinaryAudio(ws: WebSocket, audioData: ArrayBuffer): Promise { try { const uint8Array = new Uint8Array(audioData); const session = this.sessions.get(ws); if (!session?.interviewId) { throw new Error("No interview session found"); } // Create a message ID for tracking const messageId = crypto.randomUUID(); // Send processing state to client this.broadcast( JSON.stringify({ type: "message", status: "processing", role: "user", messageId, interviewId: session.interviewId, }), ); // NEW: Use AI service to transcribe the audio const transcribedText = await this.aiService.transcribeAudio(uint8Array); // Store the transcribed message await this.addMessage(session.interviewId, "user", transcribedText, messageId); } catch (error) { console.error("Audio processing failed:", error); this.handleWebSocketError(ws, error); } } ``` :::note The Whisper model `@cf/openai/whisper-tiny-en` is optimized for English speech recognition. If you need support for other languages, you can use different Whisper model variants available through Workers AI. ::: When users speak during the interview, their audio will be automatically transcribed and stored as messages in the interview session. The transcribed text will be immediately available to both the user and the AI interviewer for generating appropriate responses. ## 9. Integrate AI response generation Now that you have audio transcription working, let's implement AI interviewer response generation using Workers AI's LLM capabilities. You'll create an interview system that: - Maintains context of the conversation - Provides relevant follow-up questions - Gives constructive feedback - Stays in character as a professional interviewer ### Set up Workers AI LLM integration First, update the `AIService` class to handle LLM interactions. 
You will need to add methods for: - Processing interview context - Generating appropriate responses - Handling conversation flow Update the `services/AIService.ts` class to include LLM functionality: ```typescript title="src/services/AIService.ts" import { InterviewData, Message } from "../types"; export class AIService { async processLLMResponse(interview: InterviewData): Promise { const messages = this.prepareLLMMessages(interview); try { const { response } = await this.AI.run("@cf/meta/llama-2-7b-chat-int8", { messages, }); if (!response) { throw new Error("Failed to generate a response from the LLM model."); } return response; } catch (error) { throw new InterviewError("Failed to generate a response from the LLM model.", ErrorCodes.LLM_FAILED); } } private prepareLLMMessages(interview: InterviewData) { const messageHistory = interview.messages.map((msg: Message) => ({ role: msg.role, content: msg.content, })); return [ { role: "system", content: this.createSystemPrompt(interview), }, ...messageHistory, ]; } ``` :::note The @cf/meta/llama-2-7b-chat-int8 model is optimized for chat-like interactions and provides good performance while maintaining reasonable resource usage. ::: ### Create the conversation prompt Prompt engineering is crucial for getting high-quality responses from the LLM. Next, you will create a system prompt that: - Sets the context for the interview - Defines the interviewer's role and behavior - Specifies the technical focus areas - Guides the conversation flow Add the following method to your `services/AIService.ts` class: ```typescript title="src/services/AIService.ts" private createSystemPrompt(interview: InterviewData): string { const basePrompt = "You are conducting a technical interview."; const rolePrompt = `The position is for ${interview.title}.`; const skillsPrompt = `Focus on topics related to: ${interview.skills.join(", ")}.`; const instructionsPrompt = "Ask relevant technical questions and provide constructive feedback."; return `${basePrompt} ${rolePrompt} ${skillsPrompt} ${instructionsPrompt}`; } ``` ### Implement response generation logic Finally, integrate the LLM response generation into the interview flow. 
Update the `handleBinaryAudio` method in the `src/interview.ts` Durable Object to: - Process transcribed user responses - Generate appropriate AI interviewer responses - Maintain conversation context Update the `handleBinaryAudio` method in `src/interview.ts`: ```typescript title="src/interview.ts" private async handleBinaryAudio(ws: WebSocket, audioData: ArrayBuffer): Promise { try { // Convert raw audio buffer to uint8 array for processing const uint8Array = new Uint8Array(audioData); const session = this.sessions.get(ws); if (!session?.interviewId) { throw new Error("No interview session found"); } // Generate a unique ID to track this message through the system const messageId = crypto.randomUUID(); // Let the client know we're processing their audio // This helps provide immediate feedback while transcription runs this.broadcast( JSON.stringify({ type: "message", status: "processing", role: "user", messageId, interviewId: session.interviewId, }), ); // Convert the audio to text using our AI transcription service // This typically takes 1-2 seconds for normal speech const transcribedText = await this.aiService.transcribeAudio(uint8Array); // Save the user's message to our database so we maintain chat history await this.addMessage(session.interviewId, "user", transcribedText, messageId); // Look up the full interview context - we need this to generate a good response const interview = await this.db.getInterview(session.interviewId); if (!interview) { throw new Error(`Interview not found: ${session.interviewId}`); } // Now it's the AI's turn to respond // First generate an ID for the assistant's message const assistantMessageId = crypto.randomUUID(); // Let the client know we're working on the AI response this.broadcast( JSON.stringify({ type: "message", status: "processing", role: "assistant", messageId: assistantMessageId, interviewId: session.interviewId, }), ); // Generate the AI interviewer's response based on the conversation history const llmResponse = await this.aiService.processLLMResponse(interview); await this.addMessage(session.interviewId, "assistant", llmResponse, assistantMessageId); } catch (error) { // Something went wrong processing the audio or generating a response // Log it and let the client know there was an error console.error("Audio processing failed:", error); this.handleWebSocketError(ws, error); } } ``` ## Conclusion You have successfully built an AI-powered interview practice tool using Cloudflare's Workers AI. 
In summary, you have: - Created a real-time WebSocket communication system using Durable Objects - Implemented speech-to-text processing with Workers AI Whisper model - Built an intelligent interview system using Workers AI LLM capabilities - Designed a persistent storage system with SQLite in Durable Objects The complete source code for this tutorial is available on GitHub: [ai-interview-practice-tool](https://github.com/berezovyy/ai-interview-practice-tool) --- # 使用 DeepSeek Coder 模型探索代码生成 URL: https://developers.cloudflare.com/workers-ai/guides/tutorials/explore-code-generation-using-deepseek-coder-models/ import { Stream } from "~/components"; 探索 [Workers AI](/workers-ai) 上所有可用模型的一个便捷方法是使用 [Jupyter Notebook](https://jupyter.org/)。 您可以[下载 DeepSeek Coder 笔记本](/workers-ai/static/documentation/notebooks/deepseek-coder-exploration.ipynb)或查看下面嵌入的笔记本。 [comment]: <> "下面的 markdown 是从 https://github.com/craigsdennis/notebooks-cloudflare-workers-ai 自动生成的" --- ## 使用 DeepSeek Coder 探索代码生成 能够生成代码的 AI 模型开启了各种用例。现在 [Workers AI](/workers-ai) 上提供了 [DeepSeek Coder](https://github.com/deepseek-ai/DeepSeek-Coder) 模型 `@hf/thebloke/deepseek-coder-6.7b-base-awq` 和 `@hf/thebloke/deepseek-coder-6.7b-instruct-awq`。 让我们使用 API 来探索它们! ```python import sys !{sys.executable} -m pip install requests python-dotenv ``` ``` Requirement already satisfied: requests in ./venv/lib/python3.12/site-packages (2.31.0) Requirement already satisfied: python-dotenv in ./venv/lib/python3.12/site-packages (1.0.1) Requirement already satisfied: charset-normalizer<4,>=2 in ./venv/lib/python3.12/site-packages (from requests) (3.3.2) Requirement already satisfied: idna<4,>=2.5 in ./venv/lib/python3.12/site-packages (from requests) (3.6) Requirement already satisfied: urllib3<3,>=1.21.1 in ./venv/lib/python3.12/site-packages (from requests) (2.1.0) Requirement already satisfied: certifi>=2017.4.17 in ./venv/lib/python3.12/site-packages (from requests) (2023.11.17) ``` ```python import os from getpass import getpass from IPython.display import display, Image, Markdown, Audio import requests ``` ```python %load_ext dotenv %dotenv ``` ### 配置您的环境 要使用 API,您需要您的 [Cloudflare 帐户 ID](https://dash.cloudflare.com)(前往 Workers & Pages > 概述 > 帐户详细信息 > 帐户 ID)和一个[已启用 Workers AI 的 API 令牌](https://dash.cloudflare.com/profile/api-tokens)。 如果您想将这些文件添加到您的环境中,可以创建一个名为 `.env` 的新文件 ```bash CLOUDFLARE_API_TOKEN="您的令牌" CLOUDFLARE_ACCOUNT_ID="您的帐户 ID" ``` ```python if "CLOUDFLARE_API_TOKEN" in os.environ: api_token = os.environ["CLOUDFLARE_API_TOKEN"] else: api_token = getpass("输入您的 Cloudflare API 令牌") ``` ```python if "CLOUDFLARE_ACCOUNT_ID" in os.environ: account_id = os.environ["CLOUDFLARE_ACCOUNT_ID"] else: account_id = getpass("输入您的帐户 ID") ``` ### 从注释生成代码 一个常见的用例是在用户提供描述性注释后为其完成代码。 ````python model = "@hf/thebloke/deepseek-coder-6.7b-base-awq" prompt = "# 一个检查给定单词是否为回文的函数" response = requests.post( f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model}", headers={"Authorization": f"Bearer {api_token}"}, json={"messages": [ {"role": "user", "content": prompt} ]} ) inference = response.json() code = inference["result"]["response"] display(Markdown(f""" ```python {prompt} {code.strip()} ``` """)) ```` ```python # 一个检查给定单词是否为回文的函数 def is_palindrome(word): # 将单词转换为小写 word = word.lower() # 反转单词 reversed_word = word[::-1] # 检查反转后的单词是否与原始单词相同 if word == reversed_word: return True else: return False # 测试函数 print(is_palindrome("racecar")) # 输出:True print(is_palindrome("hello")) # 输出:False ``` ### 协助调试 我们都遇到过这种情况,bug 
总会发生。有时那些堆栈跟踪可能非常吓人,而使用代码生成的一个很好的用例是帮助解释问题。 ```python model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq" system_message = "用户会给您一些无法工作的代码。请向用户解释可能出了什么问题" code = """# 欢迎我们的用户 def hello_world(first_name="World"): print(f"Hello, {name}!") """ response = requests.post( f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model}", headers={"Authorization": f"Bearer {api_token}"}, json={"messages": [ {"role": "system", "content": system_message}, {"role": "user", "content": code}, ]} ) inference = response.json() response = inference["result"]["response"] display(Markdown(response)) ``` 您的代码中的错误是您正在尝试使用一个在函数中任何地方都没有定义的变量 `name`。应该使用的正确变量是 `first_name`。所以,您应该将 `f"Hello, {name}!"` 更改为 `f"Hello, {first_name}!"`。 这是更正后的代码: ```python # 欢迎我们的用户 def hello_world(first_name="World"): print(f"Hello, {first_name}") ``` 现在,当您调用 `hello_world()` 时,它将默认打印“Hello, World”。如果您调用 `hello_world("John")`,它将打印“Hello, John”。 ### 编写测试! 编写单元测试是一种常见的最佳实践。在有足够上下文的情况下,编写单元测试是可能的。 ```python model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq" system_message = "用户会给您一些代码,并希望用 Python 的 unittest 模块编写测试。" code = """ class User: def __init__(self, first_name, last_name=None): self.first_name = first_name self.last_name = last_name if last_name is None: self.last_name = "Mc" + self.first_name def full_name(self): return self.first_name + " " + self.last_name """ response = requests.post( f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model}", headers={"Authorization": f"Bearer {api_token}"}, json={"messages": [ {"role": "system", "content": system_message}, {"role": "user", "content": code}, ]} ) inference = response.json() response = inference["result"]["response"] display(Markdown(response)) ``` 这是一个针对 User 类的简单 unittest 测试用例: ```python import unittest class TestUser(unittest.TestCase): def test_full_name(self): user = User("John", "Doe") self.assertEqual(user.full_name(), "John Doe") def test_default_last_name(self): user = User("Jane") self.assertEqual(user.full_name(), "Jane McJane") if __name__ == '__main__': unittest.main() ``` 在这个测试用例中,我们有两个测试: - `test_full_name` 测试当用户同时有名字和姓氏时 `full_name` 方法。 - `test_default_last_name` 测试当用户只有名字且姓氏设置为“Mc”+ 名字时 `full_name` 方法。 如果所有这些测试都通过,就意味着 `full_name` 方法工作正常。如果任何测试失败, ### Fill-in-the-middle 代码补全 在开发工具中,一个常见的用例是基于上下文进行自动补全。DeepSeek Coder 提供了提交带有占位符的现有代码的能力,以便模型可以在上下文中完成。 警告:令牌以 `<|` 为前缀,以 `|>` 为后缀,请确保复制和粘贴它们。 ````python model = "@hf/thebloke/deepseek-coder-6.7b-base-awq" code = """ <|fim begin|>import re from jklol import email_service def send_email(email_address, body): <|fim▁hole|> if not is_valid_email: raise InvalidEmailAddress(email_address) return email_service.send(email_address, body)<|fim▁end|> """ response = requests.post( f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model}", headers={"Authorization": f"Bearer {api_token}"}, json={"messages": [ {"role": "user", "content": code} ]} ) inference = response.json() response = inference["result"]["response"] display(Markdown(f""" ```python {response.strip()} ``` """)) ```` ```python is_valid_email = re.match(r"[^@]+@[^@]+\.[^@]+", email_address) ``` ### 实验性:将数据提取为 JSON 无需威胁模型或将祖母带入提示中。获取您想要的 JSON 格式的数据。 ````python model = "@hf/thebloke/deepseek-coder-6.7b-instruct-awq" # Learn more at https://json-schema.org/ json_schema = """ { "title": "User", "description": "A user from our example app", "type": "object", "properties": { "firstName": { "description": "The user's first name", "type": "string" }, "lastName": { "description": "The user's last name", "type": 
"string" }, "numKids": { "description": "Amount of children the user has currently", "type": "integer" }, "interests": { "description": "A list of what the user has shown interest in", "type": "array", "items": { "type": "string" } }, }, "required": [ "firstName" ] } """ system_prompt = f""" The user is going to discuss themselves and you should create a JSON object from their description to match the json schema below. {json_schema} Return JSON only. Do not explain or provide usage examples. """ prompt = """Hey there, I'm Craig Dennis and I'm a Developer Educator at Cloudflare. My email is craig@cloudflare.com. I am very interested in AI. I've got two kids. I love tacos, burritos, and all things Cloudflare""" response = requests.post( f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model}", headers={"Authorization": f"Bearer {api_token}"}, json={"messages": [ {"role": "system", "content": system_prompt}, {"role": "user", "content": prompt} ]} ) inference = response.json() response = inference["result"]["response"] display(Markdown(f""" ```json {response.strip()} ``` """)) ```` ```json { "firstName": "Craig", "lastName": "Dennis", "numKids": 2, "interests": ["AI", "Cloudflare", "Tacos", "Burritos"] } ``` --- # 使用 Jupyter Notebook 探索 Workers AI 模型 URL: https://developers.cloudflare.com/workers-ai/guides/tutorials/explore-workers-ai-models-using-a-jupyter-notebook/ import { Stream } from "~/components"; 探索 [Workers AI](/workers-ai) 上所有可用模型的一个便捷方法是使用 [Jupyter Notebook](https://jupyter.org/)。 您可以[下载 Workers AI 笔记本](/workers-ai-notebooks/cloudflare-workers-ai.ipynb)或查看下面嵌入的笔记本。 或者您可以在 [Google Colab](https://colab.research.google.com/github/craigsdennis/notebooks-cloudflare-workers-ai/blob/main/cloudflare-workers-ai.ipynb) 上运行它 [comment]: <> "下面的 markdown 是从 https://github.com/craigsdennis/notebooks-cloudflare-workers-ai 自动生成的,