# Changelog

URL: https://developers.cloudflare.com/ai-gateway/changelog/

import { ProductReleaseNotes } from "~/components";

{/* <!-- The actual content lives in /src/content/release-notes/ai-gateway.yaml. Update that file for new entries to appear here. For more details, refer to https://developers.cloudflare.com/style-guide/documentation-content-strategy/content-types/changelog/#yaml-file --> */}

<ProductReleaseNotes />

---

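# OpenAI compatibility

URL: https://developers.cloudflare.com/ai-gateway/chat-completion/

Cloudflare's AI Gateway offers an OpenAI-compatible `/chat/completions` endpoint, so you can integrate multiple AI providers through a single URL. This simplifies integration and lets you switch between models without significant code changes.

## Endpoint URL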

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions
```

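Replace `{account_id}` and `{gateway_id}` with your Cloudflare account and gateway IDs.

## Parameters

Switch providers by changing the `model` and `apiKey` parameters.

Specify the model using the `{provider}/{model}` format. For example: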

- `openai/gpt-4o-mini`
- `google-ai-studio/gemini-2.0-flash`
- `anthropic/claude-3-haiku`
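
As a minimal sketch of the `{provider}/{model}` format, the helpers below build and split such a string. The names `parseCompatModel` and `formatCompatModel` are hypothetical and not part of any AI Gateway SDK. The sketch also assumes that only the first slash separates the provider from the model, so that model IDs containing slashes (such as the Workers AI ID `@cf/meta/llama-3.1-8b-instruct`) survive intact.

```typescript
// Hypothetical helpers for illustration; these are not AI Gateway APIs.
interface CompatModel {
	provider: string;
	model: string;
}

// Split at the FIRST slash only (assumption): model IDs may contain slashes themselves.
function parseCompatModel(value: string): CompatModel {
	const slash = value.indexOf("/");
	if (slash <= 0 || slash === value.length - 1) {
		throw new Error(`Expected "{provider}/{model}", got "${value}"`);
	}
	return { provider: value.slice(0, slash), model: value.slice(slash + 1) };
}

// Join a provider and a model into the string passed as `model`.
function formatCompatModel(provider: string, model: string): string {
	return `${provider}/${model}`;
}
```

Keeping the provider and model separate in application code makes it possible to switch providers by changing a single configuration value.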

## Examples

### OpenAI SDK

```js
import OpenAI from "openai";
const client = new OpenAI({
	apiKey: "YOUR_PROVIDER_API_KEY", // Provider API key
	// Note: the OpenAI client automatically appends /chat/completions to the URL, so do not add it yourself.
	baseURL:
		"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat",
});

const response = await client.chat.completions.create({
	model: "google-ai-studio/gemini-2.0-flash",
	messages: [{ role: "user", content: "What is Cloudflare?" }],
});

console.log(response.choices[0].message.content);
```

### cURL

```bash
curl -X POST https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions \
  --header 'Authorization: Bearer {provider_api_key}' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "google-ai-studio/gemini-2.0-flash",
    "messages": [
      {
        "role": "user",
        "content": "What is Cloudflare?"
      }
    ]
  }'
```

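### Universal provider

You can also combine this pattern with the [Universal Endpoint](/ai-gateway/universal/) to add [fallbacks](/ai-gateway/configuration/fallbacks/) across multiple providers. When used with the Universal Endpoint, every request returns the same standardized format, whether it is served by the primary model or by a fallback model. As a result, you do not need to add extra parsing logic to your app.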

```ts title="index.ts"
export interface Env {
	AI: Ai;
}

export default {
	async fetch(request: Request, env: Env) {
		return env.AI.gateway("default").run({
			provider: "compat",
			endpoint: "chat/completions",
			headers: {
				authorization: "Bearer ",
			},
			query: {
				model: "google-ai-studio/gemini-2.0-flash",
				messages: [
					{
						role: "user",
						content: "What is Cloudflare?",
					},
				],
			},
		});
	},
};
```

## Supported providers

The OpenAI-compatible endpoint supports models from the following providers:

- [Anthropic](/ai-gateway/providers/anthropic/)
- [OpenAI](/ai-gateway/providers/openai/)
- [Groq](/ai-gateway/providers/groq/)
- [Mistral](/ai-gateway/providers/mistral/)
- [Cohere](/ai-gateway/providers/cohere/)
- [Perplexity](/ai-gateway/providers/perplexity/)
- [Workers AI](/ai-gateway/providers/workersai/)
- [Google-AI-Studio](/ai-gateway/providers/google-ai-studio/)
- [Grok](/ai-gateway/providers/grok/)
- [DeepSeek](/ai-gateway/providers/deepseek/)
- [Cerebras](/ai-gateway/providers/cerebras/)

---

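# Architectures

URL: https://developers.cloudflare.com/ai-gateway/demos/

import { GlossaryTooltip, ResourcesBySelector } from "~/components";

Learn how you can use AI Gateway within your existing architecture.

## Reference architectures

Explore the following <GlossaryTooltip term="reference architecture">reference architectures</GlossaryTooltip> that use AI Gateway: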

<ResourcesBySelector
	types={[
		"reference-architecture",
		"design-guide",
		"reference-architecture-diagram",
	]}
	products={["AI Gateway"]}
/>

---

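# Getting started

URL: https://developers.cloudflare.com/ai-gateway/get-started/

import { Details, DirectoryListing, LinkButton, Render } from "~/components";

In this guide, you will learn how to create your first AI Gateway. You can create multiple gateways to control different applications.

## Prerequisites

Before you get started, you need a Cloudflare account.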

<LinkButton variant="primary" href="https://dash.cloudflare.com/sign-up">
	Sign up
</LinkButton>

## Create gateway

Then, create a new AI Gateway.

<Render file="create-gateway" />

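## Choosing gateway authentication

When setting up a new gateway, you can choose between an authenticated and an unauthenticated gateway. An authenticated gateway requires every request to include a valid authorization token, which adds an extra layer of security. We recommend using an authenticated gateway when storing logs, both to prevent unauthorized access and to protect against invalid requests that inflate log storage usage and make it harder to find the data you need. Learn more about setting up an [authenticated gateway](/ai-gateway/configuration/authentication/).

## Connect application

Next, connect your AI provider to your gateway.

AI Gateway offers multiple endpoints for each gateway you create: one endpoint per provider, and one Universal Endpoint. To use AI Gateway, you need to create your own account with each provider and supply your API key. AI Gateway acts as a proxy for these requests, enabling observability, caching, and more.

Additionally, AI Gateway has a [WebSockets API](/ai-gateway/websockets-api/) that provides a single persistent connection for continuous communication. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.

Below is a list of our supported model providers: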

<DirectoryListing folder="ai-gateway/providers" />

If you do not have a provider preference, start with one of our dedicated tutorials:

- [OpenAI](/ai-gateway/tutorials/deploy-aig-worker/)
- [Workers AI](/ai-gateway/tutorials/create-first-aig-workers/)

## View analytics

Now that your provider is connected to AI Gateway, you can view analytics for requests going through your gateway.

<Render file="analytics-overview" /> <br />

<Render file="analytics-dashboard" />

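:::note[Note]

The cost metric is an estimation based on the number of tokens sent and received in requests. While this metric can help you monitor and predict cost trends, refer to your provider's dashboard for the most accurate cost details.

:::

## Next steps

- Learn more about [caching](/ai-gateway/configuration/caching/) for faster requests and cost savings, and [rate limiting](/ai-gateway/configuration/rate-limiting/) to control how your application scales.
- Explore how to specify model or provider [fallbacks](/ai-gateway/configuration/fallbacks/) for resiliency.
- Learn how to use low-cost, open source models on [Workers AI](/ai-gateway/providers/workersai/), our AI inference service.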

---

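# Header glossary

URL: https://developers.cloudflare.com/ai-gateway/glossary/

import { Glossary } from "~/components";

AI Gateway supports a variety of headers to help you configure, customize, and manage your API requests. This page provides a complete list of all supported headers, along with a short description of each.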

<Glossary product="ai-gateway" />

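## Configuration hierarchy

Settings in AI Gateway can be configured at three levels: **Provider**, **Request**, and **Gateway**. Since the same setting can be configured in multiple locations, the following hierarchy determines which value is applied:

1. **Provider-level headers**:
   Relevant only when using the [Universal Endpoint](/ai-gateway/universal/), these headers take precedence over all other configurations.
2. **Request-level headers**:
   Apply if no provider-level headers are set.
3. **Gateway-level settings**:
   Act as the default if no headers are set at the provider or request levels.

This hierarchy ensures consistent behavior by prioritizing the most specific configuration. Use provider-level and request-level headers for fine-grained control, and gateway settings for general defaults.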
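
The precedence rule can be sketched as a small resolver. AI Gateway applies this logic on its side; the `SettingSources` type and `resolveSetting` function below are hypothetical names used only to illustrate the order in which values win.

```typescript
// Hypothetical model of the three configuration levels (illustration only).
interface SettingSources<T> {
	provider?: T; // header set on one provider entry (Universal Endpoint only)
	request?: T; // header set on the request itself
	gateway?: T; // default configured on the gateway
}

// Return the most specific value that is set: provider, then request, then gateway.
// Comparing against `undefined` keeps valid falsy values such as a TTL of 0.
function resolveSetting<T>(sources: SettingSources<T>): T | undefined {
	if (sources.provider !== undefined) return sources.provider;
	if (sources.request !== undefined) return sources.request;
	return sources.gateway;
}
```

For example, a provider-level cache TTL of `0` wins over a request-level TTL of `3600`, which in turn wins over the gateway default.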

---

# Cloudflare AI Gateway

URL: https://developers.cloudflare.com/ai-gateway/

import {
	CardGrid,
	Description,
	Feature,
	LinkTitleCard,
	Plan,
	RelatedProduct,
} from "~/components";

<Description>

Observe and control your AI applications.

</Description>

<Plan type="all" />

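Cloudflare's AI Gateway allows you to gain visibility and control over your AI apps. By connecting your apps to AI Gateway, you can gather insights on how people are using your application with analytics and logging, and then control how your application scales with features such as caching, rate limiting, request retries, and model fallback. Better yet, it only takes one line of code to get started.

Check out the [Get started guide](/ai-gateway/get-started/) to learn how to configure your applications with AI Gateway.

## Features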

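<Feature header="Analytics" href="/ai-gateway/observability/analytics/" cta="View Analytics">

View metrics such as the number of requests, tokens, and the cost it takes to run your application.

</Feature>

<Feature header="Logging" href="/ai-gateway/observability/logging/" cta="View Logging">

Gain insight on requests and errors.

</Feature>

<Feature header="Caching" href="/ai-gateway/configuration/caching/">

Serve requests directly from Cloudflare's cache instead of the original model provider for faster requests and cost savings.

</Feature>

<Feature header="Rate limiting" href="/ai-gateway/configuration/rate-limiting">

Control how your application scales by limiting the number of requests your application receives.

</Feature>

<Feature header="Request retry and fallback" href="/ai-gateway/configuration/fallbacks/">

Improve resilience by defining request retry and model fallbacks in case of an error.

</Feature>

<Feature header="Your favorite providers" href="/ai-gateway/providers/">

Workers AI, OpenAI, Azure OpenAI, HuggingFace, Replicate, and more work with AI Gateway.

</Feature>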

---

## Related products

<RelatedProduct header="Workers AI" href="/workers-ai/" product="workers-ai">

Run machine learning models, powered by serverless GPUs, on Cloudflare's global network.

</RelatedProduct>

<RelatedProduct header="Vectorize" href="/vectorize/" product="vectorize">

Build full-stack AI applications with Vectorize, Cloudflare's vector database. Adding Vectorize enables you to perform tasks such as semantic search, recommendations, and anomaly detection, or to provide context and memory to an LLM.

</RelatedProduct>

## More resources

<CardGrid>

<LinkTitleCard
	title="Developer Discord"
	href="https://discord.cloudflare.com"
	icon="discord"
>
	Connect with the Workers community on Discord to ask questions, show what you
	are building, and discuss the platform with other developers.
</LinkTitleCard>

<LinkTitleCard title="Use cases" href="/use-cases/ai/" icon="document">
	Learn how you can build and deploy ambitious AI applications to Cloudflare's
	global network.
</LinkTitleCard>

<LinkTitleCard
	title="@CloudflareDev"
	href="https://x.com/cloudflaredev"
	icon="x.com"
>
	Follow @CloudflareDev on Twitter to learn about product announcements and
	what is new in Cloudflare Workers.
</LinkTitleCard>

</CardGrid>

---

# Universal Endpoint

URL: https://developers.cloudflare.com/ai-gateway/universal/

import { Render, Badge } from "~/components";

You can use the Universal Endpoint to contact every provider.

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}
```

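AI Gateway offers multiple endpoints for each gateway you create: one endpoint per provider, and one Universal Endpoint. The Universal Endpoint requires some adjustments to your request schema, but supports additional features, such as retrying a request if it fails the first time, or configuring a [fallback model/provider](/ai-gateway/configuration/fallbacks/).

The payload is expected to be an array of messages, where each message is an object with the following parameters:

- `provider`: the name of the provider you would like to direct this message to. Can be `openai`, `workers-ai`, or any of our supported providers.
- `endpoint`: the pathname of the provider API you are trying to reach. For example, on OpenAI it can be `chat/completions`, and for Workers AI this might be [`@cf/meta/llama-3.1-8b-instruct`](/workers-ai/models/llama-3.1-8b-instruct/). See more in the sections that are specific to [each provider](/ai-gateway/providers/).
- `authorization`: the content of the Authorization HTTP header that should be used when contacting this provider. This usually starts with `Token` or `Bearer`.
- `query`: the payload as the provider expects it in their official API.

## cURL example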

<Render file="universal-gateway-example" />

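The above request is sent to the Workers AI Inference API. If it fails, it proceeds to OpenAI. You can add as many fallbacks as you need by adding another JSON object to the array.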
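
As a rough sketch of how such a payload is shaped, the snippet below assembles a primary step and one fallback into the array the Universal Endpoint expects. The `UniversalStep` type and `withFallbacks` helper are hypothetical names, the tokens are placeholders, and the per-step `headers` object follows the cURL and binding examples on this page.

```typescript
// Hypothetical type describing one entry of the Universal Endpoint array.
interface UniversalStep {
	provider: string;
	endpoint: string;
	headers: Record<string, string>;
	query: unknown; // the payload the provider expects in its own API
}

// Order matters: the first step is tried first, later steps act as fallbacks.
function withFallbacks(primary: UniversalStep, ...fallbacks: UniversalStep[]): UniversalStep[] {
	return [primary, ...fallbacks];
}

const universalPayload = withFallbacks(
	{
		provider: "workers-ai",
		endpoint: "@cf/meta/llama-3.1-8b-instruct",
		headers: { Authorization: "Bearer {cloudflare_token}", "Content-Type": "application/json" },
		query: { messages: [{ role: "user", content: "What is Cloudflare?" }] },
	},
	{
		provider: "openai",
		endpoint: "chat/completions",
		headers: { Authorization: "Bearer {open_ai_token}", "Content-Type": "application/json" },
		query: { model: "gpt-4o-mini", messages: [{ role: "user", content: "What is Cloudflare?" }] },
	},
);

// Serialized request body for a POST to the Universal Endpoint URL.
const universalBody = JSON.stringify(universalPayload);
```

Adding another fallback is a matter of passing one more step to `withFallbacks`.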

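## WebSockets API <Badge text="beta" variant="tip" size="small" />

The Universal Endpoint can also be accessed via a [WebSockets API](/ai-gateway/websockets-api/), which provides a single persistent connection for continuous communication. This API supports all AI providers connected to AI Gateway, including those that do not natively support WebSockets.

## WebSockets example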

```javascript
import WebSocket from "ws";
const ws = new WebSocket(
	"wss://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/",
	{
		headers: {
			"cf-aig-authorization": "Bearer AI_GATEWAY_TOKEN",
		},
	},
);

// Send only after the connection is open; ws throws if send() is called while still connecting.
ws.on("open", () => {
	ws.send(
		JSON.stringify({
			type: "universal.create",
			request: {
				eventId: "my-request",
				provider: "workers-ai",
				endpoint: "@cf/meta/llama-3.1-8b-instruct",
				headers: {
					Authorization: "Bearer WORKERS_AI_TOKEN",
					"Content-Type": "application/json",
				},
				query: {
					prompt: "tell me a joke",
				},
			},
		}),
	);
});

ws.on("message", function incoming(message) {
	console.log(message.toString());
});
```

## Workers Binding example

import { WranglerConfig } from "~/components";

<WranglerConfig>

```toml title="wrangler.toml"
[ai]
binding = "AI"
```

</WranglerConfig>

```typescript title="src/index.ts"
type Env = {
	AI: Ai;
};

export default {
	async fetch(request: Request, env: Env) {
		return env.AI.gateway("my-gateway").run({
			provider: "workers-ai",
			endpoint: "@cf/meta/llama-3.1-8b-instruct",
			headers: {
				authorization: "Bearer my-api-token",
			},
			query: {
				prompt: "tell me a joke",
			},
		});
	},
};
```

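## Header configuration hierarchy

The Universal Endpoint allows you to set fallback models or providers and customize headers for each provider or request. You can configure headers at three levels:

1. **Provider level**: Headers specific to a particular provider.
2. **Request level**: Headers included in individual requests.
3. **Gateway settings**: Default headers configured in your gateway dashboard.

Since the same settings can be configured in multiple locations, AI Gateway applies a hierarchy to determine which configuration takes precedence:

- **Provider-level headers** override all other configurations.
- **Request-level headers** are used if no provider-level headers are set.
- **Gateway-level settings** are used only if no headers are configured at the provider or request levels.

This hierarchy ensures consistent behavior by prioritizing the most specific configuration. Use provider-level and request-level headers for fine-grained control, and gateway settings for general defaults.

## Hierarchy example

This example demonstrates how headers set at different levels impact caching behavior:

- **Request-level header**: `cf-aig-cache-ttl` is set to `3600` seconds, applying this caching duration to the request by default.
- **Provider-level header**: For the fallback provider (OpenAI), `cf-aig-cache-ttl` is explicitly set to `0` seconds, which overrides the request-level header and disables caching for responses when OpenAI is used as the provider.

This shows how provider-level headers take precedence over request-level headers, allowing for granular control over caching behavior.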

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id} \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-cache-ttl: 3600' \
  --data '[
    {
      "provider": "workers-ai",
      "endpoint": "@cf/meta/llama-3.1-8b-instruct",
      "headers": {
        "Authorization": "Bearer {cloudflare_token}",
        "Content-Type": "application/json"
      },
      "query": {
        "messages": [
          {
            "role": "system",
            "content": "You are a friendly assistant"
          },
          {
            "role": "user",
            "content": "What is Cloudflare?"
          }
        ]
      }
    },
    {
      "provider": "openai",
      "endpoint": "chat/completions",
      "headers": {
        "Authorization": "Bearer {open_ai_token}",
        "Content-Type": "application/json",
        "cf-aig-cache-ttl": "0"
      },
      "query": {
        "model": "gpt-4o-mini",
        "stream": true,
        "messages": [
          {
            "role": "user",
            "content": "What is Cloudflare?"
          }
        ]
      }
    }
  ]'
```

---

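# Add human feedback using the API

URL: https://developers.cloudflare.com/ai-gateway/evaluations/add-human-feedback-api/

This guide walks you through adding human feedback to an AI Gateway request using the Cloudflare API. You will learn how to retrieve the relevant request logs and submit feedback using the API.

If you prefer to add human feedback via the dashboard, refer to [Add Human Feedback](/ai-gateway/evaluations/add-human-feedback/).

## 1. Create an API token

1. [Create an API token](/fundamentals/api/get-started/create-token/) with the following permissions:

- `AI Gateway - Read`
- `AI Gateway - Edit`

2. Get your [Account ID](/fundamentals/account/find-account-and-zone-ids/).
3. Using that API token and Account ID, send a [`POST` request](/api/resources/ai_gateway/methods/create/) to the Cloudflare API.

## 2. Use the API token

Once you have the token, you can use it in API requests by adding it to the authorization header as a bearer token. Here is an example of how to use it in a request: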

```bash
curl "https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways/{gateway_id}/logs" \
--header "Authorization: Bearer {your_api_token}"
```

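In the request above:

- Replace `{account_id}` and `{gateway_id}` with your specific Cloudflare account and gateway details.
- Replace `{your_api_token}` with the API token you just created.

## 3. Retrieve the `cf-aig-log-id`

The `cf-aig-log-id` is a unique identifier for the specific log entry to which you want to add feedback. Below are three methods to obtain this identifier.

### Method 1: Locate the `cf-aig-log-id` in the request response

This method allows you to find the `cf-aig-log-id` directly in the response headers returned by AI Gateway. It is the most straightforward approach if you have access to the original API response.

The steps below outline how to do this.

1. **Make a request to AI Gateway**: This could be a request your application sends to AI Gateway. Once the request is made, the response will contain various pieces of metadata.
2. **Check the response headers**: The response will include a header named `cf-aig-log-id`. This is the identifier you will need to submit feedback.

In the example below, the `cf-aig-log-id` is `01JADMCQQQBWH3NXZ5GCRN98DP`.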

```json
{
	"status": "success",
	"headers": {
		"cf-aig-log-id": "01JADMCQQQBWH3NXZ5GCRN98DP"
	},
	"data": {
		"response": "Sample response data"
	}
}
```
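
In application code, reading the header might look like the sketch below. The `HeaderReader` interface and `getLogId` function are hypothetical names; the interface is deliberately minimal so that the `headers` object of a standard `fetch` `Response` satisfies it.

```typescript
// Minimal structural type; the Headers object of a fetch Response matches it.
interface HeaderReader {
	get(name: string): string | null;
}

// Return the AI Gateway log ID, or null when the header is absent.
function getLogId(headers: HeaderReader): string | null {
	return headers.get("cf-aig-log-id");
}
```

With a `fetch` response, the call would be `getLogId(response.headers)`. Store the returned ID alongside the conversation so that feedback can be attached later.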

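### Method 2: Retrieve the `cf-aig-log-id` via API (GET request)

If you did not capture the `cf-aig-log-id` from the response headers, or you need to access it after the fact, you can retrieve it by querying the logs using the [Cloudflare API](/api/resources/ai_gateway/subresources/logs/methods/list/).

The steps below outline how to do this.

1. **Send a GET request to retrieve logs**: You can query the AI Gateway logs for a specific time frame or for a specific request. The request returns a list of logs, each containing its own `id`.
   Here is an example request: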

```txt
GET https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways/{gateway_id}/logs
```

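Replace `{account_id}` and `{gateway_id}` with your specific account and gateway details.

2. **Search for the relevant log**: In the response from the GET request, locate the specific log entry for which you would like to submit feedback. Each log entry includes an `id`.

In the example below, the `id` is `01JADMCQQQBWH3NXZ5GCRN98DP`.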

```json
{
	"result": [
		{
			"id": "01JADMCQQQBWH3NXZ5GCRN98DP",
			"cached": true,
			"created_at": "2019-08-24T14:15:22Z",
			"custom_cost": true,
			"duration": 0,
			"metadata": "string",
			"model": "string",
			"model_type": "string",
			"path": "string",
			"provider": "string",
			"request_content_type": "string",
			"request_type": "string",
			"response_content_type": "string",
			"status_code": 0,
			"step": 0,
			"success": true,
			"tokens_in": 0,
			"tokens_out": 0
		}
	],
	"result_info": {
		"count": 0,
		"max_cost": 0,
		"max_duration": 0,
		"max_tokens_in": 0,
		"max_tokens_out": 0,
		"max_total_tokens": 0,
		"min_cost": 0,
		"min_duration": 0,
		"min_tokens_in": 0,
		"min_tokens_out": 0,
		"min_total_tokens": 0,
		"page": 0,
		"per_page": 0,
		"total_count": 0
	},
	"success": true
}
```

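### Method 3: Retrieve the `cf-aig-log-id` via a binding

You can also retrieve the `cf-aig-log-id` using a binding, which streamlines the process. Here is how to retrieve the log ID directly: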

```js
const resp = await env.AI.run(
	"@cf/meta/llama-3-8b-instruct",
	{
		prompt: "tell me a joke",
	},
	{
		gateway: {
			id: "my_gateway_id",
		},
	},
);

const myLogId = env.AI.aiGatewayLogId;
```

:::note[Note]

The `aiGatewayLogId` property only holds the log ID of the last inference call.

:::

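## 4. Submit feedback via PATCH request

Once you have both the API token and the `cf-aig-log-id`, you can send a PATCH request to submit feedback. Use the following URL format, replacing `{account_id}`, `{gateway_id}`, and `{log_id}` with your specific details: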

```txt
PATCH https://api.cloudflare.com/client/v4/accounts/{account_id}/ai-gateway/gateways/{gateway_id}/logs/{log_id}
```

Add the following in the request body to submit positive feedback:

```json
{
	"feedback": 1
}
```

Add the following in the request body to submit negative feedback:

```json
{
	"feedback": -1
}
```
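
Putting the URL, the bearer token, and the body together, a feedback request could be assembled as in the following sketch. `buildFeedbackRequest` is a hypothetical helper, and the `Content-Type: application/json` header is an assumption based on the JSON body; the URL format and `feedback` values come from the steps above.

```typescript
// Hypothetical request descriptor; pass `url` and `init` to fetch() to send it.
interface FeedbackRequest {
	url: string;
	init: { method: "PATCH"; headers: Record<string, string>; body: string };
}

function buildFeedbackRequest(
	accountId: string,
	gatewayId: string,
	logId: string,
	apiToken: string,
	feedback: 1 | -1, // 1 = positive, -1 = negative
): FeedbackRequest {
	const base = "https://api.cloudflare.com/client/v4/accounts";
	const url = `${base}/${encodeURIComponent(accountId)}/ai-gateway/gateways/${encodeURIComponent(gatewayId)}/logs/${encodeURIComponent(logId)}`;
	return {
		url,
		init: {
			method: "PATCH",
			headers: {
				Authorization: `Bearer ${apiToken}`,
				"Content-Type": "application/json", // assumed, since the body is JSON
			},
			body: JSON.stringify({ feedback }),
		},
	};
}
```

A caller would then run `const req = buildFeedbackRequest(...)` followed by `await fetch(req.url, req.init)` and check the response status.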

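## 5. Verify the feedback submission

You can verify the feedback submission in two ways:

- **Through the [Cloudflare dashboard](https://dash.cloudflare.com)**: Check the updated feedback in the AI Gateway interface.
- **Through the API**: Send another GET request to retrieve the updated log entry and confirm that the feedback has been recorded.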

---

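# Add human feedback using Worker Bindings

URL: https://developers.cloudflare.com/ai-gateway/evaluations/add-human-feedback-bindings/

This guide explains how to provide human feedback for AI Gateway evaluations using Worker bindings.

## 1. Run an AI evaluation

Start by sending a prompt to the AI model through your AI Gateway.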

```javascript
const resp = await env.AI.run(
	"@cf/meta/llama-3.1-8b-instruct",
	{
		prompt: "tell me a joke",
	},
	{
		gateway: {
			id: "my-gateway",
		},
	},
);

const myLogId = env.AI.aiGatewayLogId;
```

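Let the user interact with or evaluate the AI response. This interaction informs the feedback you send back to AI Gateway.

## 2. Send human feedback

Use the [`patchLog()`](/ai-gateway/integrations/worker-binding-methods/#31-patchlog-send-feedback) method to provide feedback for the AI evaluation.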

```javascript
await env.AI.gateway("my-gateway").patchLog(myLogId, {
	feedback: 1, // All fields are optional; set values that fit your use case
	score: 100,
	metadata: {
		user: "123", // Optional metadata to provide additional context
	},
});
```

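## Feedback parameters explanation

- `feedback`: `-1` for negative, `1` for positive; `0` is considered not evaluated.
- `score`: A number between 0 and 100.
- `metadata`: An object containing additional contextual information.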
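
Before calling `patchLog`, it can help to check user-supplied values against these ranges. The sketch below is one possible client-side check; `LogPatch` and `validateLogPatch` are hypothetical names, and treating the `score` bounds as inclusive is an assumption.

```typescript
// Hypothetical shape of the object passed to patchLog (all fields optional).
interface LogPatch {
	feedback?: number;
	score?: number;
	metadata?: Record<string, unknown>;
}

// Return a list of problems; an empty list means the patch looks valid.
function validateLogPatch(patch: LogPatch): string[] {
	const problems: string[] = [];
	const f = patch.feedback;
	if (f !== undefined && f !== -1 && f !== 0 && f !== 1) {
		problems.push("feedback must be -1, 0, or 1");
	}
	const s = patch.score;
	// The negated range check also rejects NaN. Inclusive bounds are assumed.
	if (s !== undefined && !(s >= 0 && s <= 100)) {
		problems.push("score must be between 0 and 100");
	}
	return problems;
}
```

Rejecting invalid values in the Worker avoids sending a request that cannot be recorded as intended.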

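### patchLog: Send Feedback

The `patchLog` method allows you to send feedback, score, and metadata for a specific log ID. All object properties are optional, so you can include any combination of the parameters: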

```javascript
gateway.patchLog("my-log-id", {
	feedback: 1,
	score: 100,
	metadata: {
		user: "123",
	},
});
```

Returns: `Promise<void>` (Make sure to `await` the request.)

---

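# Add human feedback using the dashboard

URL: https://developers.cloudflare.com/ai-gateway/evaluations/add-human-feedback/

Human feedback is a valuable metric for assessing the performance of your AI models. By incorporating human feedback, you can gain deeper insight into how the model's responses are perceived and how well it performs from a user-centric perspective. This feedback can then be used in evaluations to calculate performance metrics, driving optimization and ultimately enhancing the reliability, accuracy, and efficiency of your AI application.

Human feedback measures the performance of your dataset based on direct human input. The metric is calculated as the percentage of positive feedback (thumbs up) given on logs, which are annotated in the Logs tab of the Cloudflare dashboard. This feedback helps refine model performance by taking real-world evaluations of its output into account.

This tutorial guides you through adding human feedback to your evaluations in AI Gateway using the [Cloudflare dashboard](https://dash.cloudflare.com/).

In the next guide, you can [learn how to add human feedback via the API](/ai-gateway/evaluations/add-human-feedback-api/).

## 1. Log in to the dashboard

1. Log into the [Cloudflare dashboard](https://dash.cloudflare.com/) and select your account.
2. Go to **AI** > **AI Gateway**.

## 2. Access the Logs tab

1. Go to **Logs**.
2. The Logs tab displays all logs associated with your datasets. These logs show key information, including:
   - Timestamp: When the interaction occurred.
   - Status: Whether the request was successful, cached, or failed.
   - Model: The model used in the request.
   - Tokens: The number of tokens consumed by the response.
   - Cost: The cost based on token usage.
   - Duration: The time taken to complete the response.
   - Feedback: Where you can provide human feedback on each log.

## 3. Provide human feedback

1. Click on the log entry you want to review. This expands the log, allowing you to see more detailed information.
2. In the expanded log, you can view additional details such as:
   - The user prompt.
   - The model response.
   - HTTP response details.
   - Endpoint information.
3. You will see two icons:
   - Thumbs up: Indicates positive feedback.
   - Thumbs down: Indicates negative feedback.
4. Click either the thumbs up or thumbs down icon based on how you rate the model response for that particular log entry.

## 4. Evaluate human feedback

After you provide feedback on your logs, it becomes part of the evaluation process.

When you run an evaluation (as outlined in the [Set Up Evaluations](/ai-gateway/evaluations/set-up-evaluations/) guide), the human feedback metric is calculated as the percentage of logs that received thumbs-up feedback.

:::note[Note]

You need to select human feedback as an evaluator to receive its metrics.

:::

## 5. Review results

After running the evaluation, review the results on the Evaluations tab.
You will be able to see the model's performance based on cost, speed, and now human feedback, represented as the percentage of positive feedback (thumbs up).

The human feedback score is displayed as a percentage, showing the share of positively rated responses in the dataset.

For more information on running evaluations, refer to the documentation [Set Up Evaluations](/ai-gateway/evaluations/set-up-evaluations/).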

---

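# Evaluations

URL: https://developers.cloudflare.com/ai-gateway/evaluations/

Understanding your application's performance is essential for optimization. Developers often have different priorities, and finding the optimal solution involves balancing key factors such as cost, latency, and accuracy. Some prioritize low-latency responses, while others focus on accuracy or cost-efficiency.

AI Gateway's Evaluations provide the data needed to make informed decisions on how to optimize your AI application. Whether you are adjusting the model, provider, or prompt, this feature delivers insights into key metrics around performance, speed, and cost. It helps developers better understand their application's behavior, supporting improved accuracy, reliability, and customer satisfaction.

Evaluations use datasets, which are collections of logs stored for analysis. You can create datasets by applying filters in the Logs tab, which help narrow down specific logs for evaluation.

Our first step toward comprehensive AI evaluations starts with human feedback (currently in open beta). We will continue to build and expand AI Gateway with additional evaluators.

[Learn how to set up an evaluation](/ai-gateway/evaluations/set-up-evaluations/), including creating datasets, selecting evaluators, and running the evaluation process.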

---

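# Set up Evaluations

URL: https://developers.cloudflare.com/ai-gateway/evaluations/set-up-evaluations/

This guide walks you through the process of setting up an evaluation in AI Gateway. These steps are done in the [Cloudflare dashboard](https://dash.cloudflare.com/).

## 1. Select or create a dataset

Datasets are collections of logs stored for analysis that can be used in an evaluation. You can create datasets by applying filters in the Logs tab. Datasets update automatically based on the set filters.

### Set up a dataset from the Logs tab

1. Apply filters to narrow down your logs. Filter options include provider, number of tokens, request status, and more.
2. Select **Create Dataset** to store the filtered logs for future analysis.

You can manage datasets by selecting **Manage datasets** from the Logs tab.

:::note[Note]

Keep in mind that datasets currently use `AND` joins, so there can only be one item per filter (for example, one model or one provider). Future updates will allow more flexibility in dataset creation.

:::

### List of available filters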

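| Filter category | Filter options                                               | Filter by description                     |
| --------------- | ------------------------------------------------------------ | ----------------------------------------- |
| Status          | error, status                                                | Error type or status.                     |
| Cache           | cached, not cached                                           | Based on whether they were cached or not. |
| Provider        | specific providers                                           | The selected AI provider.                 |
| AI Models       | specific models                                              | The selected AI model.                    |
| Cost            | less than, greater than                                      | Cost, specifying a threshold.             |
| Request type    | Universal, Workers AI Binding, WebSockets                    | The type of request.                      |
| Tokens          | Total tokens, Tokens In, Tokens Out                          | Token count (less than or greater than).  |
| Duration        | less than, greater than                                      | Request duration.                         |
| Feedback        | equals, does not equal (thumbs up, thumbs down, no feedback) | Feedback type.                            |
| Metadata Key    | equals, does not equal                                       | Specific metadata keys.                   |
| Metadata Value  | equals, does not equal                                       | Specific metadata values.                 |
| Log ID          | equals, does not equal                                       | A specific Log ID.                        |
| Event ID        | equals, does not equal                                       | A specific Event ID.                      |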

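## 2. Select evaluators

After creating a dataset, choose the evaluation parameters:

- Cost: Calculates the average cost of inference requests within the dataset (only for requests with [cost data](/ai-gateway/observability/costs/)).
- Speed: Calculates the average duration of inference requests within the dataset.
- Performance:
  - Human feedback: Measures performance based on human feedback, calculated as the percentage of thumbs up on the logs, annotated from the Logs tab.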
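:::note[Note]

Additional evaluators will be introduced in future updates to expand performance analysis capabilities.

:::

## 3. Name, review, and run the evaluation

1. Create a unique name for your evaluation to reference it in the dashboard.
2. Review the selected dataset and evaluators.
3. Select **Run** to start the process.

## 4. Review and analyze results

Evaluation results appear in the Evaluations tab. The results show the status of the evaluation (for example, in progress, completed, or error). Metrics for the selected evaluators are displayed, excluding any logs with missing fields. You will also see the number of logs used to calculate each metric.

While datasets automatically update based on filters, evaluations do not. You have to create a new evaluation if you want to evaluate new logs.

Use these insights to optimize based on your application's priorities. Based on the results, you may choose to:

- Change the model or [provider](/ai-gateway/providers/)
- Adjust your prompts
- Explore further optimizations, such as setting up [Retrieval Augmented Generation (RAG)](/reference-architecture/diagrams/ai/ai-rag/)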

---

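# Set up Guardrails

URL: https://developers.cloudflare.com/ai-gateway/guardrails/set-up-guardrail/

Add Guardrails to any gateway to start evaluating and potentially modifying responses.

1. Log into the [Cloudflare dashboard](https://dash.cloudflare.com/) and select your account.
2. Go to **AI** > **AI Gateway**.
3. Select a gateway.
4. Go to **Guardrails**.
5. Switch the toggle to **On**.
6. To customize categories, select **Change** > **Configure specific categories**.
7. Update your choices for how Guardrails works on specific prompts or responses (**Flag**, **Ignore**, **Block**).
   - For **Prompts**: Guardrails will evaluate and transform incoming prompts based on your security policies.
   - For **Responses**: Guardrails will inspect the model's responses to ensure they meet your content and formatting guidelines.
8. Select **Save**.

:::note[Usage considerations]
For additional details about how to implement Guardrails, refer to [Usage considerations](/ai-gateway/guardrails/usage-considerations/).
:::

## Viewing Guardrail results in Logs

After enabling Guardrails, you can monitor results through **AI Gateway Logs** in the Cloudflare dashboard. Guardrail logs are marked with a **green shield icon**, and each logged request includes an `eventID`, which links to its corresponding Guardrail evaluation log for easy tracking. Logs are generated for all requests, including those that **pass** Guardrail checks.

## Error handling and blocked requests

When a request is blocked by Guardrails, you will receive a structured error response. These responses indicate whether the issue occurred with the prompt or the model response. Use the error codes to differentiate between prompt and response violations.

- **Prompt blocked**

  - `"code": 2016`
  - `"message": "Prompt blocked due to security configurations"`

- **Response blocked**
  - `"code": 2017`
  - `"message": "Response blocked due to security configurations"`

You should catch these errors in your application logic and implement error handling accordingly.

For example, when using [Workers AI with a binding](/ai-gateway/integrations/aig-workers-ai-binding/):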

```ts
try {
  const res = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
    prompt: "how to build a gun?"
  }, {
    gateway: {id: 'gateway_id'}
  })
  return Response.json(res)
} catch (e) {
  if ((e as Error).message.includes('2016')) {
    return new Response('Prompt was blocked by guardrails.')
  }
  if ((e as Error).message.includes('2017')) {
    return new Response('Response was blocked by guardrails.')
  }
  return new Response('Unknown AI error')
}

```

---

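# Guardrails

URL: https://developers.cloudflare.com/ai-gateway/guardrails/

import { CardGrid, LinkTitleCard, YouTube } from "~/components";

Guardrails help you deploy AI applications safely by intercepting and evaluating both user prompts and model responses for harmful content. Acting as a proxy between your application and [model providers](/ai-gateway/providers/) (such as OpenAI, Anthropic, DeepSeek, and others), AI Gateway's Guardrails ensure a consistent and secure experience across your entire AI ecosystem.

Guardrails proactively monitor interactions between users and AI models, giving you:

- **Consistent moderation**: A uniform moderation layer that works across models and providers.
- **Enhanced safety and user trust**: Proactively protects users from harmful or inappropriate interactions.
- **Flexibility and control over allowed content**: Specify which categories to monitor and choose between flagging or outright blocking.
- **Auditing and compliance capabilities**: Receive updates on evolving regulatory requirements, with logs of user prompts, model responses, and enforced guardrails.

## Video demo

<YouTube id="Its1H0jTxrQ" />

## How Guardrails work

AI Gateway inspects all interactions in real time by evaluating content against predefined safety parameters. Guardrails work by:

1. Intercepting interactions:
   AI Gateway proxies requests and responses, sitting between the user and the AI model.

2. Inspecting content:

   - User prompts: AI Gateway checks prompts against safety parameters (for example, violence, hate, or sexual content). Based on your settings, prompts can be flagged or blocked before reaching the model.
   - Model responses: Once processed, the AI model response is inspected. If hazardous content is detected, it can be flagged or blocked before being delivered to the user.

3. Applying actions:
   Depending on your configuration, flagged content is logged for review, while blocked content is prevented from proceeding.

## Related resource

- [Cloudflare Blog: Keep AI interactions secure and risk-free with Guardrails in AI Gateway](https://blog.cloudflare.com/guardrails-in-ai-gateway/)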

---

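# Supported model types

URL: https://developers.cloudflare.com/ai-gateway/guardrails/supported-model-types/

AI Gateway's Guardrails detects the type of AI model being used and applies safety checks accordingly:

- **Text generation models**: Both prompts and responses are evaluated.
- **Embedding models**: Only the prompt is evaluated, as the response consists of numerical embeddings, which are not meaningful for moderation.
- **Unknown models**: If the model type cannot be determined, only the prompt is evaluated, while the response bypasses Guardrails.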
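
The rule above reduces to a small lookup. The sketch below only illustrates the behavior described in the list; the `ModelType` labels and the `guardrailScope` function are hypothetical and do not reflect how AI Gateway detects model types internally.

```typescript
// Hypothetical labels for the three cases described above.
type ModelType = "text-generation" | "embedding" | "unknown";

interface GuardrailScope {
	evaluatePrompt: boolean;
	evaluateResponse: boolean;
}

// Prompts are always evaluated; responses only for text generation models.
function guardrailScope(modelType: ModelType): GuardrailScope {
	return {
		evaluatePrompt: true,
		evaluateResponse: modelType === "text-generation",
	};
}
```

For unknown models, responses reach the user unchecked, so prompt-side categories carry the full moderation burden.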

:::note[Note]

Guardrails does not yet support streaming responses. Support for streaming is planned for a future update.

:::

---

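# Usage considerations

URL: https://developers.cloudflare.com/ai-gateway/guardrails/usage-considerations/

Guardrails currently uses [Llama Guard 3 8B](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/) on [Workers AI](/workers-ai/) to perform content evaluations. The underlying model may be updated in the future, and we will reflect those changes within Guardrails.

Since Guardrails runs on Workers AI, enabling it incurs usage on Workers AI. You can monitor usage through the Workers AI dashboard.

## Additional considerations

- **Model availability**: If at least one hazard category is set to `block`, but AI Gateway is unable to receive a response from Workers AI, the request will be blocked. Conversely, if a hazard category is set to `flag` and AI Gateway cannot obtain a response from Workers AI, the request will proceed without evaluation. In the `flag` case, this prioritizes availability by allowing requests to continue even when content evaluation is not possible.
- **Latency impact**: Enabling Guardrails adds latency to requests. Typically, evaluations using Llama Guard 3 8B on Workers AI add approximately 500 milliseconds per request. Larger requests may see more added latency, though the increase is not linear. Consider this when balancing safety and performance.
- **Handling long content**: When evaluating long prompts or responses, Guardrails automatically segments the content into smaller chunks and processes each chunk through a separate Guardrail request. This approach ensures comprehensive moderation but may increase latency for longer inputs.
- **Supported languages**: Llama Guard 3 8B supports content safety classification in the following languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai.
- **Streaming support**: Streaming is not supported when using Guardrails.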
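
The model availability rule can be read as a fail-closed or fail-open decision. The sketch below models that decision under the stated rule only; the `CategoryAction` type and `onEvaluationUnavailable` function are hypothetical names, not part of AI Gateway.

```typescript
// Hypothetical per-category setting, mirroring the Flag / Ignore / Block choices.
type CategoryAction = "flag" | "ignore" | "block";

// Decide what happens to a request when no evaluation result is available.
// Any category set to "block" makes the request fail closed; otherwise it proceeds.
function onEvaluationUnavailable(actions: CategoryAction[]): "block-request" | "allow-request" {
	return actions.some((action) => action === "block") ? "block-request" : "allow-request";
}
```

Applications that set any category to `block` should therefore expect blocked requests when no evaluation result is available, and handle them like other Guardrails errors.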

:::note

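Llama Guard is provided as-is without any representations, warranties, or guarantees. Any rules or examples contained in blogs, developer docs, or other reference materials are provided for informational purposes only. You acknowledge and understand that you are responsible for the results and outcomes of your use of AI Gateway.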

:::

---

# Workers AI

URL: https://developers.cloudflare.com/ai-gateway/integrations/aig-workers-ai-binding/

import { Render, PackageManagers, WranglerConfig } from "~/components";

This guide walks you through setting up and deploying a Workers AI project. You will use [Workers](/workers/), an AI Gateway binding, and a large language model (LLM) to deploy your first AI-powered application on the Cloudflare global network.

## Prerequisites

<Render file="prereqs" product="workers" />

## 1. Create a Worker Project

You will create a new Worker project using the create-cloudflare CLI (C3). C3 is a command-line tool designed to help you set up and deploy new applications to Cloudflare.

Create a new project named `hello-ai` by running:

<PackageManagers type="create" pkg="cloudflare@latest" args={"hello-ai"} />

Running `npm create cloudflare@latest` will prompt you to install the create-cloudflare package and lead you through setup. C3 will also install [Wrangler](/workers/wrangler/), the Cloudflare Developer Platform CLI.

<Render
	file="c3-post-run-steps"
	product="workers"
	params={{
		category: "hello-world",
		type: "Worker only",
		lang: "TypeScript",
	}}
/>

This creates a new `hello-ai` directory. Your new `hello-ai` directory includes:

- A "Hello World" Worker at `src/index.ts`.
- A [Wrangler configuration file](/workers/wrangler/configuration/)

Go to your application directory:

```bash
cd hello-ai
```

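## 2. Connect your Worker to Workers AI

You must create an AI binding for your Worker to connect to Workers AI. Bindings allow your Workers to interact with resources, like Workers AI, on the Cloudflare Developer Platform.

To bind Workers AI to your Worker, add the following to the end of your [Wrangler configuration file](/workers/wrangler/configuration/):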

<WranglerConfig>

```toml title="wrangler.toml"
[ai]
binding = "AI"
```

</WranglerConfig>

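Your binding is available in your Worker code on [`env.AI`](/workers/runtime-apis/handlers/fetch/).

You will need your `gateway id` for the next step. You can learn [how to set up an AI Gateway in this tutorial](/ai-gateway/get-started/).

## 3. Run an inference task containing AI Gateway in your Worker

You are now ready to run an inference task in your Worker. In this case, you will use an LLM, [`llama-3.1-8b-instruct-fast`](/workers-ai/models/llama-3.1-8b-instruct-fast/), to answer a question. Your gateway ID is found on the dashboard.

Update the `index.ts` file in your `hello-ai` application directory with the following code: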

```typescript title="src/index.ts" {78-81}
export interface Env {
	// If you set another name in the [Wrangler configuration file](/workers/wrangler/configuration/) as the value for 'binding',
	// replace "AI" with the variable name you defined.
	AI: Ai;
}

export default {
	async fetch(request, env): Promise<Response> {
		// Specify the gateway ID and other options here
		const response = await env.AI.run(
			"@cf/meta/llama-3.1-8b-instruct-fast",
			{
				prompt: "What is the origin of the phrase Hello, World",
			},
			{
				gateway: {
					id: "GATEWAYID", // Use your gateway ID here
					skipCache: true, // Optional: skip the cache if needed
				},
			},
		);

		// Return the AI response as a JSON object
		return new Response(JSON.stringify(response), {
			headers: { "Content-Type": "application/json" },
		});
	},
} satisfies ExportedHandler<Env>;
```

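Up to this point, you have created an AI binding for your Worker and configured your Worker to be able to execute the Llama 3.1 model. You can now test your project locally before you deploy globally.

## 4. Develop locally with Wrangler

While in your project directory, test Workers AI locally by running [`wrangler dev`](/workers/wrangler/commands/#dev):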

```bash
npx wrangler dev
```

<Render file="ai-local-usage-charges" product="workers" />

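You will be prompted to log in after you run `wrangler dev`. When you run `npx wrangler dev`, Wrangler gives you a URL (most likely `localhost:8787`) to review your Worker. After you go to the URL Wrangler provides, you will see a message that resembles the following example: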

````json
{
  "response": "A fascinating question!\n\nThe phrase \"Hello, World!\" originates from a simple computer program written in the early days of programming. It is often attributed to Brian Kernighan, a Canadian computer scientist and a pioneer in the field of computer programming.\n\nIn the early 1970s, Kernighan, along with his colleague Dennis Ritchie, were working on the C programming language. They wanted to create a simple program that would output a message to the screen to demonstrate the basic structure of a program. They chose the phrase \"Hello, World!\" because it was a simple and recognizable message that would illustrate how a program could print text to the screen.\n\nThe exact code was written in the 5th edition of Kernighan and Ritchie's book \"The C Programming Language,\" published in 1988. The code, literally known as \"Hello, World!\" is as follows:\n\n```
main()
{
  printf(\"Hello, World!\");
}
```\n\nThis code is still often used as a starting point for learning programming languages, as it demonstrates how to output a simple message to the console.\n\nThe phrase \"Hello, World!\" has since become a catch-all phrase to indicate the start of a new program or a small test program, and is widely used in computer science and programming education.\n\nSincerely, I'm glad I could help clarify the origin of this iconic phrase for you!"
}
````

## 5. Deploy your AI Worker

Before deploying your AI Worker globally, log in with your Cloudflare account by running:

```bash
npx wrangler login
```

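You will be directed to a web page asking you to log in to the Cloudflare dashboard. After you have logged in, you will be asked if Wrangler can make changes to your Cloudflare account. Scroll down and select **Allow** to continue.

Finally, deploy your Worker to make your project accessible on the Internet. To deploy your Worker, run: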

```bash
npx wrangler deploy
```

Once deployed, your Worker will be available at a URL like:

```bash
https://hello-ai.<YOUR_SUBDOMAIN>.workers.dev
```

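Your Worker will be deployed to your custom [`workers.dev`](/workers/configuration/routing/workers-dev/) subdomain. You can now visit the URL to run your AI Worker.

By completing this tutorial, you have created a Worker, connected it to Workers AI through an AI Gateway binding, and successfully run an inference task using the Llama 3.1 model.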

---

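# Vercel AI SDK

URL: https://developers.cloudflare.com/ai-gateway/integrations/vercel-ai-sdk/

The [Vercel AI SDK](https://sdk.vercel.ai/) is a TypeScript library for building AI applications. The SDK supports many different AI providers, tools for streaming completions, and more.

To use Cloudflare AI Gateway inside of the AI SDK, you can configure a custom "Gateway URL" for most supported providers. Below are a few working examples.

## Examples

### OpenAI

If you are using the `openai` provider in the AI SDK, you can create a customized setup with `createOpenAI`, passing your OpenAI-compatible AI Gateway URL: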

```typescript
import { createOpenAI } from "@ai-sdk/openai";

const openai = createOpenAI({
	baseURL: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai`,
});
```

### Anthropic

If you are using the `anthropic` provider in the AI SDK, you can create a customized setup with `createAnthropic`, passing your Anthropic-compatible AI Gateway URL:

```typescript
import { createAnthropic } from "@ai-sdk/anthropic";

const anthropic = createAnthropic({
	baseURL: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/anthropic`,
});
```

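### Google AI Studio

If you are using the Google AI Studio provider in the AI SDK, you need to append `/v1beta` to your Google AI Studio-compatible AI Gateway URL to avoid errors. The `/v1beta` path is required because Google AI Studio's API includes it in its endpoint structure, and the AI SDK sets the model name separately. This ensures compatibility with Google's API versioning.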

```typescript
import { createGoogleGenerativeAI } from "@ai-sdk/google";

const google = createGoogleGenerativeAI({
	baseURL: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-ai-studio/v1beta`,
});
```

### Retrieve `log id` from the AI SDK

When calling the SDK, you can access the AI Gateway `log id` from the response headers.

```typescript
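import { generateText } from "ai";

// `anthropic` is the provider instance created with createAnthropic above.
const result = await generateText({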
	model: anthropic("claude-3-sonnet-20240229"),
	messages: [],
});
console.log(result.response.headers["cf-aig-log-id"]);
```

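### Other providers

For other providers not listed above, you can follow a similar pattern by creating a custom instance for any AI provider and passing your AI Gateway URL. For help finding your provider-specific AI Gateway URL, refer to the [Supported providers page](/ai-gateway/providers).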

---

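# AI Gateway Binding Methods

URL: https://developers.cloudflare.com/ai-gateway/integrations/worker-binding-methods/

import { Render, PackageManagers } from "~/components";

This guide provides an overview of how to use the latest Cloudflare Workers AI Gateway binding methods. You will learn how to set up an AI Gateway binding, access new methods, and integrate them into your Workers.

## 1. Add an AI Binding to your Worker

To connect your Worker to Workers AI, add the following to your [Wrangler configuration file](/workers/wrangler/configuration/):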

import { WranglerConfig } from "~/components";

<WranglerConfig>

```toml title="wrangler.toml"
[ai]
binding = "AI"
```

</WranglerConfig>

This configuration sets up the AI binding, accessible in your Worker code as `env.AI`.

<Render file="wrangler-typegen" product="workers" />

## 2. Basic usage with Workers AI + Gateway

To perform an inference task using Workers AI and an AI Gateway, you can use the following code:

```typescript title="src/index.ts"
const resp = await env.AI.run(
	"@cf/meta/llama-3.1-8b-instruct",
	{
		prompt: "tell me a joke",
	},
	{
		gateway: {
			id: "my-gateway",
		},
	},
);
```

Additionally, you can access the latest request log ID with:

```typescript
const myLogId = env.AI.aiGatewayLogId;
```

## 3. Access the Gateway Binding

You can access your AI Gateway binding using the following code:

```typescript
const gateway = env.AI.gateway("my-gateway");
```

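Once you have the gateway instance, you can use the following methods:

### 3.1. `patchLog`: Send Feedback

The `patchLog` method allows you to send feedback, score, and metadata for a specific log ID. All object properties are optional, so you can include any combination of the parameters: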

```typescript
gateway.patchLog("my-log-id", {
	feedback: 1,
	score: 100,
	metadata: {
		user: "123",
	},
});
```

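- **Returns**: `Promise<void>` (Make sure to `await` the request.)
- **Example use case**: Update a log entry with user feedback or additional metadata.

### 3.2. `getLog`: Read Log Details

The `getLog` method retrieves details of a specific log ID. It returns an object of type `Promise<AiGatewayLog>`. If this type is missing, ensure you have run [`wrangler types`](/workers/languages/typescript/#generate-types).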

```typescript
const log = await gateway.getLog("my-log-id");
```

- **返回**：`Promise<AiGatewayLog>`
- **用例示例**：检索日志信息以进行调试或分析。

### 3.3. `getUrl`：获取网关 URL

`getUrl` 方法允许您检索 AI 网关的基本 URL，可选择指定一个提供商以获取特定于提供商的端点。

```typescript
// 获取基本网关 URL
const baseUrl = await gateway.getUrl();
// 输出: https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/

// 获取特定于提供商的 URL
const openaiUrl = await gateway.getUrl("openai");
// 输出: https://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/openai
```

- **参数**：可选的 `provider`（字符串或 `AIGatewayProviders` 枚举）
- **返回**：`Promise<string>`
- **用例示例**：动态构建用于直接 API 调用或调试配置的 URL。

#### SDK 集成示例

`getUrl` 方法对于与流行的 AI SDK 集成特别有用：

**OpenAI SDK：**

```typescript
import OpenAI from "openai";

const openai = new OpenAI({
	apiKey: "my api key", // 默认为 process.env["OPENAI_API_KEY"]
	baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
});
```

**Vercel AI SDK 与 OpenAI：**

```typescript
import { createOpenAI } from "@ai-sdk/openai";

const openai = createOpenAI({
	baseURL: await env.AI.gateway("my-gateway").getUrl("openai"),
});
```

**Vercel AI SDK 与 Anthropic：**

```typescript
import { createAnthropic } from "@ai-sdk/anthropic";

const anthropic = createAnthropic({
	baseURL: await env.AI.gateway("my-gateway").getUrl("anthropic"),
});
```

### 3.4. `run`：通用请求

`run` 方法允许您执行通用请求。用户可以传递单个通用请求对象或其数组。此方法支持所有 AI 网关提供商。

有关可用输入的详细信息，请参阅[通用端点文档](/ai-gateway/universal/)。

```typescript
const resp = await gateway.run({
	provider: "workers-ai",
	endpoint: "@cf/meta/llama-3.1-8b-instruct",
	headers: {
		authorization: "Bearer my-api-token",
	},
	query: {
		prompt: "tell me a joke",
	},
});
```

- **返回**：`Promise<Response>`
- **用例示例**：向任何支持的提供商执行[通用请求](/ai-gateway/universal/)。

## 结论

通过这些 AI 网关绑定方法，您现在可以：

- 使用 `patchLog` 发送反馈和更新元数据。
- 使用 `getLog` 检索详细的日志信息。
- 使用 `getUrl` 获取用于直接 API 访问的网关 URL，从而轻松与流行的 AI SDK 集成。
- 使用 `run` 执行到任何 AI 网关提供商的通用请求。

这些方法为您的 AI 集成提供了更大的灵活性和控制力，使您能够在 Cloudflare Workers 平台上构建更复杂的应用程序。

---

# 身份验证

URL: https://developers.cloudflare.com/ai-gateway/configuration/authentication/

在 AI 网关中使用已验证网关通过要求每个请求都包含有效的授权令牌来增加安全性。此功能在存储日志时特别有用，因为它可以防止未经授权的访问，并防范可能增加日志存储使用量并使您难以找到所需数据的无效请求。启用已验证网关后，只有具有正确令牌的请求才会被处理。

:::note
我们建议在选择使用 AI 网关存储日志时启用已验证网关。

如果启用了已验证网关但请求不包含所需的 `cf-aig-authorization` 标头，请求将失败。此设置确保只有经过验证的请求通过网关。要绕过 `cf-aig-authorization` 标头的需要，请确保禁用已验证网关。
:::

## 使用仪表板设置已验证网关

1. 转到您要启用身份验证的特定网关的设置。
2. 选择 **创建身份验证令牌** 以生成具有所需 `Run` 权限的自定义令牌。务必安全地保存此令牌，因为它不会再次显示。
3. 在对此网关的每个请求中包含带有您的 API 令牌的 `cf-aig-authorization` 标头。
4. 返回设置页面并开启已验证网关。

## 使用 OpenAI 的示例请求

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header 'cf-aig-authorization: Bearer {CF_AIG_TOKEN}' \
  --header 'Authorization: Bearer {openai_token}' \
  --header 'Content-Type: application/json' \
  --data '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "What is Cloudflare?"}]}'
```

使用 OpenAI SDK：

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
	apiKey: process.env.OPENAI_API_KEY,
	baseURL: "https://gateway.ai.cloudflare.com/v1/account-id/gateway/openai",
	defaultHeaders: {
		"cf-aig-authorization": `Bearer {token}`,
	},
});
```

## 使用 Vercel AI SDK 的示例请求

```javascript
import { createOpenAI } from "@ai-sdk/openai";

const openai = createOpenAI({
	baseURL: "https://gateway.ai.cloudflare.com/v1/account-id/gateway/openai",
	headers: {
		"cf-aig-authorization": `Bearer {token}`,
	},
});
```

## 预期行为

下表概述了基于身份验证设置和标头状态的网关行为：

| 身份验证设置 | 标头信息 | 网关状态   | 响应                   |
| ------------ | -------- | ---------- | ---------------------- |
| 开启         | 存在标头 | 已验证网关 | 请求成功               |
| 开启         | 无标头   | 错误       | 由于缺少授权而请求失败 |
| 关闭         | 存在标头 | 未验证网关 | 请求成功               |
| 关闭         | 无标头   | 未验证网关 | 请求成功               |
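The expected-behavior table above can be read as a small decision function. The sketch below is only a model of that table, not gateway code; `GatewayOutcome` and `expectedOutcome` are hypothetical names, and "header present" is assumed to mean a `cf-aig-authorization` header carrying a valid token.

```typescript
interface GatewayOutcome {
	gatewayState: "authenticated" | "unauthenticated" | "error";
	succeeds: boolean;
}

// Mirrors the four rows of the table: only "authentication on, no header" fails.
function expectedOutcome(
	authEnabled: boolean,
	headerPresent: boolean,
): GatewayOutcome {
	if (!authEnabled) {
		// With authentication off, the header is not required.
		return { gatewayState: "unauthenticated", succeeds: true };
	}
	return headerPresent
		? { gatewayState: "authenticated", succeeds: true }
		: { gatewayState: "error", succeeds: false };
}
```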

---

# 缓存

URL: https://developers.cloudflare.com/ai-gateway/configuration/caching/

import { TabItem, Tabs } from "~/components";

AI 网关可以缓存来自您的 AI 模型提供商的响应，为相同的请求直接从 Cloudflare 的缓存提供服务。

## 使用缓存的好处

- **减少延迟：** 通过避免对重复请求向源 AI 提供商进行往返，为用户提供更快的响应。
- **成本节省：** 最小化向您的 AI 提供商发出的付费请求数量，特别是对于频繁访问或非动态内容。
- **增加吞吐量：** 从您的 AI 提供商卸载重复请求，使其能够更高效地处理独特请求。

:::note

目前缓存仅支持文本和图像响应，并且仅适用于相同的请求。

此配置适用于具有有限提示选项的用例。例如，询问"我可以如何帮助您？"并让用户从有限的选项集中选择答案的支持机器人，在当前缓存配置下运行良好。
我们计划在未来为缓存添加语义搜索以提高缓存命中率。
:::

## 默认配置

<Tabs syncKey="dashPlusAPI"> <TabItem label="仪表板">

要在仪表板中设置默认缓存配置：

1. 登录 [Cloudflare 仪表板](https://dash.cloudflare.com/) 并选择您的账户。
2. 选择 **AI** > **AI 网关**。
3. 选择 **设置**。
4. 启用 **缓存响应**。
5. 将默认缓存更改为您偏好的任何值。

</TabItem> <TabItem label="API">

要使用 API 设置默认缓存配置：

1. [创建 API 令牌](/fundamentals/api/get-started/create-token/)，具有以下权限：

- `AI Gateway - Read`
- `AI Gateway - Edit`

2. 获取您的[账户 ID](/fundamentals/account/find-account-and-zone-ids/)。
3. 使用该 API 令牌和账户 ID，发送 [`POST` 请求](/api/resources/ai_gateway/methods/create/)创建新网关并包含 `cache_ttl` 的值。

</TabItem> </Tabs>

此缓存行为将统一应用于所有支持缓存的请求。如果您需要为特定请求修改缓存设置，您可以灵活地在每个请求的基础上覆盖此设置。

To check whether a response was served from the cache, inspect the **cf-aig-cache-status** response header, which is set to `HIT` or `MISS`.

## 每个请求的缓存

虽然您网关的默认缓存设置提供了良好的基线，但您可能需要更精细的控制。这些情况可能是数据新鲜度、具有不同生命周期的内容，或动态或个性化响应。

为了满足这些需求，AI 网关允许您使用特定的 HTTP 标头在每个请求的基础上覆盖默认缓存行为。这为您提供了为单个 API 调用优化缓存的精确性。

以下标头允许您定义此每个请求的缓存行为：

:::note

以下标头已更新为新名称，尽管旧标头仍将起作用。我们建议更新为新标头以确保未来兼容性：

`cf-cache-ttl` 现在是 `cf-aig-cache-ttl`

`cf-skip-cache` 现在是 `cf-aig-skip-cache`

:::

### 跳过缓存 (cf-aig-skip-cache)

跳过缓存是指绕过缓存并直接从原始提供商获取请求，而不使用任何缓存副本。

您可以使用标头 **cf-aig-skip-cache** 绕过请求的缓存版本。

例如，当向 OpenAI 提交请求时，以以下方式包含标头：

```bash title="跳过缓存的请求"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header "Authorization: Bearer $TOKEN" \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-skip-cache: true' \
  --data '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "How do I build a wooden spoon in 3 short steps? Give as short an answer as possible."
      }
    ]
  }'
```

### 缓存 TTL (cf-aig-cache-ttl)

缓存 TTL，或生存时间，是缓存请求在过期并从原始源刷新之前保持有效的持续时间。您可以使用 **cf-aig-cache-ttl** 以秒为单位设置所需的缓存持续时间。最小 TTL 是 60 秒，最大 TTL 是一个月。

例如，如果您设置一小时的 TTL，这意味着请求在缓存中保存一小时。在该小时内，相同的请求将从缓存提供服务而不是原始 API。一小时后，缓存过期，请求将转到原始 API 获取新响应，该响应将为下一小时重新填充缓存。

例如，当向 OpenAI 提交请求时，以以下方式包含标头：

```bash title="要缓存一小时的请求"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header "Authorization: Bearer $TOKEN" \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-cache-ttl: 3600' \
  --data '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "How do I build a wooden spoon in 3 short steps? Give as short an answer as possible."
      }
    ]
  }'
```

### 自定义缓存键 (cf-aig-cache-key)

自定义缓存键让您覆盖默认缓存键，以便精确设置任何资源的可缓存性设置。要覆盖默认缓存键，您可以使用标头 **cf-aig-cache-key**。

当您第一次使用 **cf-aig-cache-key** 标头时，您将收到来自提供商的响应。具有相同标头的后续请求将返回缓存的响应。如果使用了 **cf-aig-cache-ttl** 标头，响应将根据指定的缓存生存时间进行缓存。否则，响应将根据仪表板中的缓存设置进行缓存。如果网关未启用缓存，响应将默认缓存 5 分钟。

例如，当向 OpenAI 提交请求时，以以下方式包含标头：

```bash title="具有自定义缓存键的请求"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header 'Authorization: Bearer {openai_token}' \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-cache-key: responseA' \
  --data '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "How do I build a wooden spoon in 3 short steps? Give as short an answer as possible."
      }
    ]
  }'
```

:::caution[AI 网关缓存行为]
AI 网关中的缓存是易失性的。如果同时发送两个相同的请求，第一个请求可能无法及时缓存供第二个请求使用，这可能导致第二个请求从原始源检索数据。
:::
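The three per-request headers described in this section can be assembled by one small helper. This is a minimal sketch: `GatewayCacheOptions` and `buildCacheHeaders` are hypothetical names, and "one month" is approximated as 30 days, which is an assumption rather than a documented value.

```typescript
// Limits taken from the text above: minimum 60 seconds, maximum one month.
// Assumption: "one month" is approximated here as 30 days.
const MIN_CACHE_TTL_SECONDS = 60;
const MAX_CACHE_TTL_SECONDS = 30 * 24 * 60 * 60;

interface GatewayCacheOptions {
	ttlSeconds?: number; // sent as cf-aig-cache-ttl
	skipCache?: boolean; // sent as cf-aig-skip-cache
	cacheKey?: string; // sent as cf-aig-cache-key
}

function buildCacheHeaders(
	options: GatewayCacheOptions,
): Record<string, string> {
	const headers: Record<string, string> = {};
	if (options.skipCache) {
		headers["cf-aig-skip-cache"] = "true";
	}
	if (options.ttlSeconds !== undefined) {
		const ttl = options.ttlSeconds;
		const isWholeNumber = Math.floor(ttl) === ttl;
		if (
			!isWholeNumber ||
			ttl < MIN_CACHE_TTL_SECONDS ||
			ttl > MAX_CACHE_TTL_SECONDS
		) {
			throw new RangeError(
				`cache TTL must be a whole number of seconds between ${MIN_CACHE_TTL_SECONDS} and ${MAX_CACHE_TTL_SECONDS}`,
			);
		}
		headers["cf-aig-cache-ttl"] = String(ttl);
	}
	if (options.cacheKey !== undefined) {
		headers["cf-aig-cache-key"] = options.cacheKey;
	}
	return headers;
}
```

The result can be spread into the `headers` of a `fetch` call or passed as per-request headers to an SDK. Validating the TTL locally is a design choice in this sketch.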

---

# 自定义成本

URL: https://developers.cloudflare.com/ai-gateway/configuration/custom-costs/

import { TabItem, Tabs } from "~/components";

AI 网关允许您在请求级别设置自定义成本。通过使用此功能，成本指标可以准确反映您的独特定价，覆盖默认或公共模型成本。

:::note[注意]

自定义成本仅适用于在其响应中传递令牌的请求。没有令牌信息的请求将不会计算成本。

:::

## 自定义成本

要为您的 API 请求添加自定义成本，请使用 `cf-aig-custom-cost` 标头。此标头使您能够为输入（发送的令牌）和输出（接收的令牌）指定每个令牌的成本。

- **per_token_in**：协商的输入令牌成本（每个令牌）。
- **per_token_out**：协商的输出令牌成本（每个令牌）。

您可以包含的小数位数没有限制，确保精确的成本计算，无论值有多小。

自定义成本将在日志中以下划线显示，使您可以轻松识别何时应用了自定义定价。

在此示例中，如果您的协商价格为每百万输入令牌 1 美元和每百万输出令牌 2 美元，请如下所示包含 `cf-aig-custom-cost` 标头。

```bash title="具有自定义成本的请求"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header "Authorization: Bearer $TOKEN" \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-custom-cost: {"per_token_in":0.000001,"per_token_out":0.000002}' \
  --data ' {
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "user",
            "content": "When is Cloudflare's Birthday Week?"
          }
        ]
      }'
```

:::note

如果响应从缓存提供（缓存命中），成本始终为 `0`，即使您指定了自定义成本。自定义成本仅在请求到达模型提供商时适用。
:::
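To make the arithmetic behind the header explicit, the sketch below converts a negotiated price per million tokens into the per-token values that `cf-aig-custom-cost` expects, and estimates the cost of a single request. The function names are hypothetical, and the estimate is only a model of the rules stated above (cache hits cost `0`), not the gateway's own calculation.

```typescript
interface CustomCost {
	per_token_in: number;
	per_token_out: number;
}

// $1 per million input tokens becomes 0.000001 per token, and so on.
function customCostFromPerMillion(
	usdPerMillionIn: number,
	usdPerMillionOut: number,
): CustomCost {
	return {
		per_token_in: usdPerMillionIn / 1e6,
		per_token_out: usdPerMillionOut / 1e6,
	};
}

// Serialize the cost object into the request header.
function customCostHeader(cost: CustomCost): Record<string, string> {
	return { "cf-aig-custom-cost": JSON.stringify(cost) };
}

// Estimated cost of one request; a cache hit is always 0.
function estimateRequestCost(
	tokensIn: number,
	tokensOut: number,
	cost: CustomCost,
	cacheHit: boolean,
): number {
	if (cacheHit) {
		return 0;
	}
	return tokensIn * cost.per_token_in + tokensOut * cost.per_token_out;
}
```

For example, with the prices above, a request that sends 1,000 tokens and receives 500 tokens is estimated at about $0.002 when it reaches the provider.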

---

# 回退

URL: https://developers.cloudflare.com/ai-gateway/configuration/fallbacks/

import { Render } from "~/components";

通过您的[通用端点](/ai-gateway/universal/)指定模型或提供商回退，以处理请求失败并确保可靠性。

Cloudflare 可以在响应[请求错误](#request-failures)或[预定的请求超时](/ai-gateway/configuration/request-handling#request-timeouts)时触发您的回退提供商。[响应标头 `cf-aig-step`](#response-headercf-aig-step) 指示哪个步骤成功处理了请求。

## 请求失败

默认情况下，如果模型请求返回错误，Cloudflare 会触发您的回退。

### 示例

在以下示例中，请求首先发送到 [Workers AI](/workers-ai/) 推理 API。如果请求失败，它会回退到 OpenAI。响应标头 `cf-aig-step` 指示哪个提供商成功处理了请求。

1. 向 Workers AI 推理 API 发送请求。
2. 如果该请求失败，继续发送到 OpenAI。

```mermaid
graph TD
    A[AI Gateway] --> B[Request to Workers AI Inference API]
    B -->|Success| C[Return Response]
    B -->|Failure| D[Request to OpenAI API]
    D --> E[Return Response]
```

<br />

您可以通过在数组中添加另一个对象来添加任意数量的回退。

<Render file="universal-gateway-example" />

## 响应标头(cf-aig-step)

在使用带有回退的[通用端点](/ai-gateway/universal/)时，响应标头 `cf-aig-step` 通过返回步骤编号指示哪个模型成功处理了请求。此标头提供了关于是否触发了回退以及哪个模型最终处理了响应的可见性。

- `cf-aig-step:0` – 成功使用了第一个（主要）模型。
- `cf-aig-step:1` – 请求回退到第二个模型。
- `cf-aig-step:2` – 请求回退到第三个模型。
- 后续步骤 – 每个回退将步骤编号递增 1。
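A client can turn the `cf-aig-step` value into a readable message for logging. The helper below is a minimal sketch; `describeStep` is a hypothetical name, and it takes the raw header string (or `null` when the header is missing) so it does not depend on any particular HTTP client.

```typescript
// Interpret the cf-aig-step response header: 0 is the primary model,
// 1 is the first fallback, 2 is the second fallback, and so on.
function describeStep(stepHeader: string | null): string {
	if (stepHeader === null) {
		return "no cf-aig-step header present";
	}
	const step = Number(stepHeader);
	const isValid =
		stepHeader.trim() !== "" && Math.floor(step) === step && step >= 0;
	if (!isValid) {
		return `unrecognized cf-aig-step value: ${stepHeader}`;
	}
	return step === 0
		? "handled by the primary model"
		: `handled by fallback #${step}`;
}
```

With a `fetch` response, this could be called as `describeStep(response.headers.get("cf-aig-step"))`.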

---

# 自定义元数据

URL: https://developers.cloudflare.com/ai-gateway/configuration/custom-metadata/

AI 网关中的自定义元数据允许您使用用户 ID 或其他标识符标记请求，从而更好地跟踪和分析您的请求。元数据值可以是字符串、数字或布尔值，并将出现在您的日志中，使您可以轻松搜索和过滤数据。

## 主要功能

- **自定义标记**：向您的请求添加用户 ID、团队名称、测试指示器和其他相关信息。
- **增强日志记录**：元数据出现在您的日志中，允许详细检查和故障排除。
- **搜索和过滤**：使用元数据高效搜索和过滤已记录的请求。

:::note

AI 网关允许您每个请求传递最多五个自定义元数据条目。如果提供超过五个条目，只有前五个将被保存；额外的条目将被忽略。确保您的自定义元数据限制为五个条目，以避免未处理或丢失的数据。

:::

## 支持的元数据类型

- 字符串
- 数字
- 布尔值

:::note

不支持对象作为元数据值。

:::
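Because extra entries are dropped and object values are not supported, it can help to validate metadata before sending it. The sketch below is one way to do that; `buildMetadataHeader` is a hypothetical helper, and rejecting oversized input (instead of letting the gateway keep only the first five entries) is a design choice of this example.

```typescript
const MAX_METADATA_ENTRIES = 5;

// Returns the value for the cf-aig-metadata header, or throws if the
// metadata would be truncated or contains an unsupported value type.
function buildMetadataHeader(metadata: Record<string, unknown>): string {
	const keys = Object.keys(metadata);
	if (keys.length > MAX_METADATA_ENTRIES) {
		throw new RangeError(
			`at most ${MAX_METADATA_ENTRIES} metadata entries are kept per request`,
		);
	}
	for (const key of keys) {
		const valueType = typeof metadata[key];
		if (
			valueType !== "string" &&
			valueType !== "number" &&
			valueType !== "boolean"
		) {
			throw new TypeError(
				`metadata value for "${key}" must be a string, number, or boolean`,
			);
		}
	}
	return JSON.stringify(metadata);
}
```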

## 实现

### 使用 cURL

要在使用 cURL 的请求中包含自定义元数据：

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header 'Authorization: Bearer {api_token}' \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-metadata: {"team": "AI", "user": 12345, "test":true}' \
  --data '{"model": "gpt-4o", "messages": [{"role": "user", "content": "What should I eat for lunch?"}]}'
```

### 使用 SDK

要在使用 OpenAI SDK 的请求中包含自定义元数据：

```javascript
import OpenAI from "openai";

export default {
	async fetch(request, env, ctx) {
		const openai = new OpenAI({
			apiKey: env.OPENAI_API_KEY,
			baseURL:
				"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai",
		});

		try {
			const chatCompletion = await openai.chat.completions.create(
				{
					model: "gpt-4o",
					messages: [{ role: "user", content: "What should I eat for lunch?" }],
					max_tokens: 50,
				},
				{
					headers: {
						"cf-aig-metadata": JSON.stringify({
							user: "JaneDoe",
							team: 12345,
							test: true,
						}),
					},
				},
			);

			const response = chatCompletion.choices[0].message;
			return new Response(JSON.stringify(response));
		} catch (e) {
			console.log(e);
			return new Response(String(e), { status: 500 });
		}
	},
};
```

### 使用绑定

要在使用[绑定](/workers/runtime-apis/bindings/)的请求中包含自定义元数据：

```javascript
export default {
	async fetch(request, env, ctx) {
		const aiResp = await env.AI.run(
			"@cf/mistral/mistral-7b-instruct-v0.1",
			{ prompt: "What should I eat for lunch?" },
			{
				gateway: {
					id: "gateway_id",
					metadata: { team: "AI", user: 12345, test: true },
				},
			},
		);

		return new Response(JSON.stringify(aiResp));
	},
};
```

---

# 配置

URL: https://developers.cloudflare.com/ai-gateway/configuration/

import { DirectoryListing } from "~/components";

使用多种选项和自定义设置配置您的 AI 网关。

<DirectoryListing />

---

# 管理网关

URL: https://developers.cloudflare.com/ai-gateway/configuration/manage-gateway/

import { Render } from "~/components";

您有多种不同的选项来管理 AI 网关。

## 创建网关

<Render file="create-gateway" />

## 编辑网关

<Render file="edit-gateway" />

:::note

有关可编辑设置的更多详细信息，请参阅[配置](/ai-gateway/configuration/)。

:::

## 删除网关

删除您的网关是永久性的，无法撤销。

<Render file="delete-gateway" />

---

# 速率限制

URL: https://developers.cloudflare.com/ai-gateway/configuration/rate-limiting/

import { TabItem, Tabs } from "~/components";

速率限制控制到达您应用的流量，可以防止昂贵的账单和可疑活动。

## 参数

您可以将速率限制定义为在特定时间范围内发送的请求数量。例如，您可以将应用限制为每 60 秒 100 个请求。

您还可以选择是否希望使用**固定**或**滑动**速率限制技术。使用速率限制时，我们允许在时间窗口内发送一定数量的请求。例如，如果是固定速率，窗口基于时间，因此在十分钟窗口内不会超过 `x` 个请求。如果是滑动速率，在过去十分钟内不会超过 `x` 个请求。

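To illustrate, suppose the limit is ten requests per ten minutes, starting at 12:00. The fixed windows are then 12:00-12:10, 12:10-12:20, and so on. If you send ten requests at 12:09 and ten more at 12:11, all 20 requests succeed under the fixed-window technique, because the two batches fall into different windows. Under the sliding-window technique, the ten requests sent at 12:11 are rejected, because ten requests were already made within the preceding ten minutes.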
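The difference between the two techniques can be checked with a small simulation of the 12:09 / 12:11 scenario. This is an illustrative model only, not how AI Gateway implements rate limiting; the function names are hypothetical, and times are expressed as minutes after 12:00.

```typescript
// Fixed window: count requests per aligned window (0-10, 10-20, ...).
function simulateFixedWindow(
	times: number[],
	limit: number,
	windowMinutes: number,
): boolean[] {
	const counts: Record<number, number> = {};
	return times.map((t) => {
		const windowIndex = Math.floor(t / windowMinutes);
		const used = counts[windowIndex] || 0;
		if (used >= limit) {
			return false;
		}
		counts[windowIndex] = used + 1;
		return true;
	});
}

// Sliding window: count accepted requests in the last `windowMinutes`.
function simulateSlidingWindow(
	times: number[],
	limit: number,
	windowMinutes: number,
): boolean[] {
	const accepted: number[] = [];
	return times.map((t) => {
		const recent = accepted.filter((a) => t - a < windowMinutes).length;
		if (recent >= limit) {
			return false;
		}
		accepted.push(t);
		return true;
	});
}
```

Feeding in ten requests at minute 9 and ten at minute 11 with a limit of ten per ten minutes, the fixed-window model accepts all 20, while the sliding-window model accepts only the first ten.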

## 处理速率限制

当您的请求超过允许的速率时，您将遇到速率限制。这意味着服务器将以 `429 Too Many Requests` 状态码响应，您的请求不会被处理。

## 默认配置

<Tabs syncKey="dashPlusAPI"> <TabItem label="仪表板">

要在仪表板中设置默认速率限制配置：

1. 登录 [Cloudflare 仪表板](https://dash.cloudflare.com/) 并选择您的账户。
2. 转到 **AI** > **AI 网关**。
3. 转到 **设置**。
4. 启用 **速率限制**。
5. 根据需要调整速率、时间周期和速率限制方法。

</TabItem> <TabItem label="API">

要使用 API 设置默认速率限制配置：

1. [创建 API 令牌](/fundamentals/api/get-started/create-token/)，具有以下权限：

- `AI Gateway - Read`
- `AI Gateway - Edit`

2. 获取您的[账户 ID](/fundamentals/account/find-account-and-zone-ids/)。
3. 使用该 API 令牌和账户 ID，发送 [`POST` 请求](/api/resources/ai_gateway/methods/create/)创建新网关并包含 `rate_limiting_interval`、`rate_limiting_limit` 和 `rate_limiting_technique` 的值。

</TabItem> </Tabs>

此速率限制行为将统一应用于该网关的所有请求。

---

# 分析

URL: https://developers.cloudflare.com/ai-gateway/observability/analytics/

import { Render, TabItem, Tabs } from "~/components";

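Your AI Gateway dashboard shows metrics for requests, tokens, caching, errors, and cost, and you can filter these metrics by time. These analytics help you understand traffic patterns, token consumption, and potential issues across AI providers. You can view the following analytics: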

- **请求**：跟踪 AI 网关处理的请求总数。
- **令牌使用量**：分析请求中的令牌消耗，深入了解使用模式。
- **成本**：了解使用不同 AI 提供商的相关成本，让您能够跟踪支出、管理预算和优化资源。
- **错误**：监控网关中的错误数量，帮助识别和排除问题。
- **缓存响应**：查看从缓存提供服务的响应百分比，这可以帮助降低成本并提高速度。

## 查看分析

<Tabs> <TabItem label="仪表板">

<Render file="analytics-dashboard" />

</TabItem> <TabItem label="GraphQL">

您可以使用 GraphQL 在 AI 网关仪表板之外查询您的使用数据。请参阅下面的示例查询。您需要在发出请求时使用您的 Cloudflare 令牌，并将 `{account_id}` 更改为匹配您的账户标签。

```bash title="请求"
curl https://api.cloudflare.com/client/v4/graphql \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: application/json' \
  --data '{
    "query": "query{\n  viewer {\n	accounts(filter: { accountTag: \"{account_id}\" }) {\n	requests: aiGatewayRequestsAdaptiveGroups(\n    	limit: $limit\n    	filter: { datetimeHour_geq: $start, datetimeHour_leq: $end }\n    	orderBy: [datetimeMinute_ASC]\n  	) {\n    	count,\n    	dimensions {\n        	model,\n        	provider,\n        	gateway,\n        	ts: datetimeMinute\n    	}\n    	\n  	}\n    	\n	}\n  }\n}",
    "variables": {
   	 "limit": 1000,
   	 "start": "2023-09-01T10:00:00.000Z",
   	 "end": "2023-09-30T10:00:00.000Z",
   	 "orderBy": "date_ASC"
    }
}'
```

</TabItem> </Tabs>

---

# 请求处理

URL: https://developers.cloudflare.com/ai-gateway/configuration/request-handling/

import { Render, Aside } from "~/components";

您的 AI 网关支持不同的策略来处理对提供商的请求，这允许您有效管理 AI 交互并确保您的应用保持响应性和可靠性。

## 请求超时

请求超时允许您在提供商响应时间过长时触发回退或重试。

这些超时有助于：

- 通过防止用户等待响应时间过长来改善用户体验
- 通过检测无响应的提供商并触发回退选项来主动处理错误

请求超时可以在通用端点上设置，也可以直接在对任何提供商的请求上设置。

### 定义

超时以毫秒为单位设置。此外，超时基于响应的第一部分何时返回。只要响应的第一部分在指定的时间范围内返回 - 例如在流式传输响应时 - 您的网关将等待响应。

### 配置

#### 通用端点

如果在[通用端点](/ai-gateway/universal/)上设置，请求超时指定请求的超时持续时间并触发回退。

对于通用端点，通过在提供商特定的 `config` 对象中设置 `requestTimeout` 属性来配置超时值。每个提供商可以有不同的 `requestTimeout` 值进行精细自定义。

```bash title="提供商级别配置" {11-13} collapse={15-48}
curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \
	--header 'Content-Type: application/json' \
	--data '[
    {
        "provider": "workers-ai",
        "endpoint": "@cf/meta/llama-3.1-8b-instruct",
        "headers": {
            "Authorization": "Bearer {cloudflare_token}",
            "Content-Type": "application/json"
        },
        "config": {
            "requestTimeout": 1000
        },
        "query": {
            "messages": [
                {
                    "role": "system",
                    "content": "You are a friendly assistant"
                },
                {
                    "role": "user",
                    "content": "What is Cloudflare?"
                }
            ]
        }
    },
    {
        "provider": "workers-ai",
        "endpoint": "@cf/meta/llama-3.1-8b-instruct-fast",
        "headers": {
            "Authorization": "Bearer {cloudflare_token}",
            "Content-Type": "application/json"
        },
        "query": {
            "messages": [
                {
                    "role": "system",
                    "content": "You are a friendly assistant"
                },
                {
                    "role": "user",
                    "content": "What is Cloudflare?"
                }
            ]
        },
        "config": {
            "requestTimeout": 3000
        }
    }
]'
```
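When the same payload is built in code, generating it from typed objects avoids hand-editing JSON. The sketch below is one possible way to do that; `ChatMessage`, `UniversalStep`, and `workersAiStep` are hypothetical names, and `{cloudflare_token}` remains a placeholder to be replaced with a real token.

```typescript
interface ChatMessage {
	role: "system" | "user" | "assistant";
	content: string;
}

interface UniversalStep {
	provider: string;
	endpoint: string;
	headers: Record<string, string>;
	config?: { requestTimeout?: number };
	query: { messages: ChatMessage[] };
}

// Build one Workers AI step, optionally with its own timeout in milliseconds.
function workersAiStep(
	endpoint: string,
	apiToken: string,
	messages: ChatMessage[],
	requestTimeoutMs?: number,
): UniversalStep {
	const step: UniversalStep = {
		provider: "workers-ai",
		endpoint,
		headers: {
			Authorization: `Bearer ${apiToken}`,
			"Content-Type": "application/json",
		},
		query: { messages },
	};
	if (requestTimeoutMs !== undefined) {
		step.config = { requestTimeout: requestTimeoutMs };
	}
	return step;
}

const exampleMessages: ChatMessage[] = [
	{ role: "system", content: "You are a friendly assistant" },
	{ role: "user", content: "What is Cloudflare?" },
];

// The primary model gets 1 s to start responding; the fallback gets 3 s.
const universalBody = JSON.stringify([
	workersAiStep("@cf/meta/llama-3.1-8b-instruct", "{cloudflare_token}", exampleMessages, 1000),
	workersAiStep("@cf/meta/llama-3.1-8b-instruct-fast", "{cloudflare_token}", exampleMessages, 3000),
]);
```

`universalBody` can then be sent as the body of a `POST` request to the Universal Endpoint URL shown above.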

#### 直接提供商

如果在[提供商](/ai-gateway/providers/)请求上设置，请求超时指定请求的超时持续时间，如果超过则返回错误。

对于提供商特定端点，通过添加 `cf-aig-request-timeout` 标头来配置超时值。

```bash title="提供商特定端点示例" {4}
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
 --header 'Authorization: Bearer {cf_api_token}' \
 --header 'Content-Type: application/json' \
 --header 'cf-aig-request-timeout: 5000' \
 --data '{"prompt": "What is Cloudflare?"}'
```

---

## 请求重试

AI 网关还支持对失败请求的自动重试，最多五次重试尝试。

此功能提高了应用的弹性，确保您可以从临时问题中恢复，而无需手动干预。

Request retries can be set on the Universal Endpoint or directly on a request to any provider.

### 定义

使用请求重试，您可以调整三个属性的组合：

- 尝试次数（最多 5 次尝试）
- 重试前等待时间（以毫秒为单位，最多 5 秒）
- 退避方法（常量、线性或指数）

在最后一次重试尝试时，您的网关将等待直到请求完成，无论需要多长时间。

### 配置

#### 通用端点

如果在[通用端点](/ai-gateway/universal/)上设置，请求重试将在触发任何配置的回退之前自动重试失败的请求最多五次。

对于通用端点，在提供商特定的 `config` 中使用以下属性配置重试设置：

```ts
config: {
	maxAttempts?: number;
	retryDelay?: number;
	backoff?: "constant" | "linear" | "exponential";
}
```

与[请求超时](/ai-gateway/configuration/request-handling/#universal-endpoint)一样，每个提供商可以有不同的重试设置进行精细自定义。

```bash title="提供商级别配置" {11-15} collapse={16-55}
curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}' \
	--header 'Content-Type: application/json' \
	--data '[
    {
        "provider": "workers-ai",
        "endpoint": "@cf/meta/llama-3.1-8b-instruct",
        "headers": {
            "Authorization": "Bearer {cloudflare_token}",
            "Content-Type": "application/json"
        },
        "config": {
            "maxAttempts": 2,
            "retryDelay": 1000,
            "backoff": "constant"
        },
        "query": {
            "messages": [
                {
                    "role": "system",
                    "content": "You are a friendly assistant"
                },
                {
                    "role": "user",
                    "content": "What is Cloudflare?"
                }
            ]
        }
    },
    {
        "provider": "workers-ai",
        "endpoint": "@cf/meta/llama-3.1-8b-instruct-fast",
        "headers": {
            "Authorization": "Bearer {cloudflare_token}",
            "Content-Type": "application/json"
        },
        "query": {
            "messages": [
                {
                    "role": "system",
                    "content": "You are a friendly assistant"
                },
                {
                    "role": "user",
                    "content": "What is Cloudflare?"
                }
            ]
        },
        "config": {
            "maxAttempts": 4,
            "retryDelay": 1000,
            "backoff": "exponential"
        }
    }
]'
```

#### 直接提供商

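If set on a [provider](/ai-gateway/providers/) request, request retries automatically retry a failed request up to five times. On the final retry attempt, your gateway waits until the request completes, however long it takes.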

For a provider-specific endpoint, configure the retry settings by adding the following headers:

- `cf-aig-max-attempts` (number)
- `cf-aig-retry-delay` (number)
- `cf-aig-backoff` (`"constant" | "linear" | "exponential"`)
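These headers can be produced from a typed options object so that the limits described earlier (at most five attempts, at most five seconds of delay) are checked before the request is sent. This is a minimal sketch; `RetryOptions` and `buildRetryHeaders` are hypothetical names, and the client-side validation is an addition of this example.

```typescript
type Backoff = "constant" | "linear" | "exponential";

interface RetryOptions {
	maxAttempts: number; // 1 to 5
	retryDelayMs: number; // 0 to 5000 milliseconds
	backoff: Backoff;
}

function buildRetryHeaders(options: RetryOptions): Record<string, string> {
	const attempts = options.maxAttempts;
	if (Math.floor(attempts) !== attempts || attempts < 1 || attempts > 5) {
		throw new RangeError("maxAttempts must be a whole number from 1 to 5");
	}
	const delay = options.retryDelayMs;
	if (!(delay >= 0 && delay <= 5000)) {
		throw new RangeError("retryDelayMs must be between 0 and 5000");
	}
	return {
		"cf-aig-max-attempts": String(attempts),
		"cf-aig-retry-delay": String(delay),
		"cf-aig-backoff": options.backoff,
	};
}
```

The returned object can be merged into the headers of a request to any provider-specific endpoint.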

---

# 成本

URL: https://developers.cloudflare.com/ai-gateway/observability/costs/

成本指标仅适用于模型在其响应中返回令牌数据和模型名称的端点。

## 跟踪 AI 提供商的成本

AI 网关让您更容易监控和估算所有 AI 提供商基于令牌的成本。这可以帮助您：

- 了解和比较提供商之间的使用成本。
- 使用一致的指标监控趋势和估算支出。
- 应用自定义定价逻辑以匹配协商的费率。

:::note[注意]

成本指标是基于请求中发送和接收的令牌数量的**估算**。虽然此指标可以帮助您监控和预测成本趋势，但请参考您提供商的仪表板获取最**准确**的成本详情。

:::

:::caution[注意]

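Providers may introduce new models or change their pricing. If you notice outdated cost data or are using a model that our cost tracking does not yet support, [submit a request](https://forms.gle/8kRa73wRnvq7bxL48).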

:::

## 自定义成本

AI 网关允许用户在特殊定价协议或协商费率下设置自定义成本。自定义成本可以在请求级别应用，应用时将覆盖默认或公共模型成本。
有关自定义成本配置的更多信息，请访问[自定义成本](/ai-gateway/configuration/custom-costs/)配置页面。

---

# 可观察性

URL: https://developers.cloudflare.com/ai-gateway/observability/

import { DirectoryListing } from "~/components";

可观察性是指为系统添加监控工具以收集指标和日志的实践，从而实现更好的监控、故障排除和应用优化。

<DirectoryListing />

---

# 使用 Workers AI 创建您的第一个 AI 网关

URL: https://developers.cloudflare.com/ai-gateway/tutorials/create-first-aig-workers/

import { Render } from "~/components";

本教程将指导您使用 Cloudflare 仪表板上的 Workers AI 创建您的第一个 AI 网关。目标受众是刚接触 AI 网关和 Workers AI 的初学者。创建一个 AI 网关可以使用户高效地管理和保护 AI 请求，从而使他们能够利用 AI 模型执行内容生成、数据处理或预测分析等任务，并具有增强的控制和性能。

## 注册和登录

1. **注册**：如果您没有 Cloudflare 帐户，请[注册](https://cloudflare.com/sign-up)。
2. **登录**：通过登录[Cloudflare 仪表板](https://dash.cloudflare.com/login)访问 Cloudflare 仪表板。

## 创建网关

然后，创建一个新的 AI 网关。

<Render file="create-gateway" />

## 连接您的 AI 提供商

1. 在 AI 网关部分，选择您创建的网关。
2. 选择 **Workers AI** 作为您的提供商，以设置一个特定于 Workers AI 的端点。
   您将收到一个用于发送请求的端点 URL。

## 配置您的 Workers AI

1. 在 Cloudflare 仪表板中转到 **AI** > **Workers AI**。
2. 选择 **使用 REST API** 并按照步骤创建并复制 API 令牌和账户 ID。
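3. **Send a request to Workers AI**: Use the provided API endpoint. For example, you can run a model through the API with a curl command. Replace `{account_id}`, `{gateway_id}`, and `{cf_api_token}` with your actual account ID, gateway ID, and API token: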

   ```bash
   curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
   --header 'Authorization: Bearer {cf_api_token}' \
   --header 'Content-Type: application/json' \
   --data '{"prompt": "What is Cloudflare?"}'
   ```

预期的输出将类似于：

```json
{"result":{"response":"I'd be happy to explain what Cloudflare is.\n\nCloudflare is a cloud-based service that provides a range of features to help protect and improve the performance, security, and reliability of websites, applications, and other online services. Think of it as a shield for your online presence!\n\nHere are some of the key things Cloudflare does:\n\n1. **Content Delivery Network (CDN)**: Cloudflare has a network of servers all over the world. When you visit a website that uses Cloudflare, your request is sent to the nearest server, which caches a copy of the website's content. This reduces the time it takes for the content to load, making your browsing experience faster.\n2. **DDoS Protection**: Cloudflare protects against Distributed Denial-of-Service (DDoS) attacks. This happens when a website is overwhelmed with traffic from multiple sources to make it unavailable. Cloudflare filters out this traffic, ensuring your site remains accessible.\n3. **Firewall**: Cloudflare acts as an additional layer of security, filtering out malicious traffic and hacking attempts, such as SQL injection or cross-site scripting (XSS) attacks.\n4. **SSL Encryption**: Cloudflare offers free SSL encryption, which secure sensitive information (like passwords, credit card numbers, and browsing data) with an HTTPS connection (the \"S\" stands for Secure).\n5. **Bot Protection**: Cloudflare has an AI-driven system that identifies and blocks bots trying to exploit vulnerabilities or scrape your content.\n6. **Analytics**: Cloudflare provides insights into website traffic, helping you understand your audience and make informed decisions.\n7. **Cybersecurity**: Cloudflare offers advanced security features, such as intrusion protection, DNS filtering, and Web Application Firewall (WAF) protection.\n\nOverall, Cloudflare helps protect against cyber threats, improves website performance, and enhances security for online businesses, bloggers, and individuals who need to establish a strong online presence.\n\nWould you like to know more about a specific aspect of Cloudflare?"},"success":true,"errors":[],"messages":[]}
```

## 查看分析

监控您的 AI 网关以查看使用指标。

1. 在仪表板中转到 **AI** > **AI 网关**。
2. 选择您的网关以查看请求计数、令牌使用、缓存效率、错误和预估成本等指标。您还可以开启额外的配置，如日志记录和速率限制。

## 可选 - 后续步骤

要使用 Workers 构建更多内容，请参阅[教程](/workers/tutorials/)。

如果您有任何问题、需要帮助或想分享您的项目，请加入 [Discord](https://discord.cloudflare.com) 上的 Cloudflare 开发者社区，与其他开发者和 Cloudflare 团队联系。

---

# 部署通过 AI 网关连接到 OpenAI 的 Worker

URL: https://developers.cloudflare.com/ai-gateway/tutorials/deploy-aig-worker/

import { Render, PackageManagers } from "~/components";

在本教程中，您将学习如何部署一个通过 AI 网关调用 OpenAI 的 Worker。AI 网关通过更多的分析、缓存、速率限制和日志记录，帮助您更好地观察和控制您的 AI 应用程序。

本教程使用最新的 v4 OpenAI node 库，这是 2023 年 8 月发布的更新。

## 开始之前

所有教程都假设您已经完成了[入门指南](/workers/get-started/guide/)，该指南帮助您设置 Cloudflare Workers 帐户、[C3](https://github.com/cloudflare/workers-sdk/tree/main/packages/create-cloudflare) 和 [Wrangler](/workers/wrangler/install-and-update/)。

## 1. 创建 AI 网关和 OpenAI API 密钥

在 Cloudflare 仪表板的 AI 网关页面上，通过单击右上角的加号按钮创建一个新的 AI 网关。您应该能够命名网关和端点。单击 API 端点按钮以复制端点。您可以从特定于提供商的端点中选择，如 OpenAI、HuggingFace 和 Replicate。或者您可以使用接受特定模式并支持模型回退和重试的通用端点。

在本教程中，我们将使用 OpenAI 特定于提供商的端点，因此在下拉菜单中选择 OpenAI 并复制新的端点。

本教程还需要一个 OpenAI 帐户和 API 密钥。如果您没有，请创建一个新的 OpenAI 帐户并创建一个 API 密钥以继续本教程。请务必将您的 API 密钥存放在安全的地方，以便以后使用。

## 2. 创建一个新的 Worker

在命令行中创建一个 Worker 项目：

<PackageManagers type="create" pkg="cloudflare@latest" args={"openai-aig"} />

<Render
	file="c3-post-run-steps"
	product="workers"
	params={{
		category: "hello-world",
		type: "Worker only",
		lang: "JavaScript",
	}}
/>

Change into your new Worker project directory:

```sh title="打开您的新项目目录"
cd openai-aig
```

在您新的 openai-aig 目录中，找到并打开 `src/index.js` 文件。在本教程的大部分时间里，您将配置此文件。

最初，您生成的 `index.js` 文件应如下所示：

```js
export default {
	async fetch(request, env, ctx) {
		return new Response("Hello World!");
	},
};
```

## 3. 在您的 Worker 中配置 OpenAI

创建 Worker 项目后，我们可以学习如何向 OpenAI 发出第一个请求。您将使用 OpenAI node 库与 OpenAI API 进行交互。使用 `npm` 安装 OpenAI node 库：

<PackageManagers pkg="openai" />

在您的 `src/index.js` 文件中，在 `export default` 上方添加 `openai` 的导入：

```js
import OpenAI from "openai";
```

In your `fetch` function, set up the configuration and instantiate your `OpenAI` client with the AI Gateway endpoint you created:

```js null {5-8}
import OpenAI from "openai";

export default {
	async fetch(request, env, ctx) {
		const openai = new OpenAI({
			apiKey: env.OPENAI_API_KEY,
			baseURL:
				"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai", // 在此处粘贴您的 AI 网关端点
		});
	},
};
```

要使其正常工作，您需要使用 [`wrangler secret put`](/workers/wrangler/commands/#put) 来设置您的 `OPENAI_API_KEY`。这会将 API 密钥保存到您的环境中，以便您的 Worker 在部署时可以访问它。此密钥是您之前在 OpenAI 仪表板中创建的 API 密钥：

<PackageManagers type="exec" pkg="wrangler" args="secret put OPENAI_API_KEY" />

为了在本地开发中使其正常工作，请在您的 Worker 项目中创建一个新文件 `.dev.vars` 并添加此行。请确保将 `OPENAI_API_KEY` 替换为您自己的 OpenAI API 密钥：

```txt title="在本地保存您的 API 密钥"
OPENAI_API_KEY = "<YOUR_OPENAI_API_KEY_HERE>"
```

## 4. 发出 OpenAI 请求

现在我们可以向 OpenAI [聊天完成 API](https://platform.openai.com/docs/guides/gpt/chat-completions-api) 发出请求。

您可以指定您想要的模型、角色和提示，以及您希望在总请求中使用的最大令牌数。

```js null {10-22}
import OpenAI from "openai";

export default {
	async fetch(request, env, ctx) {
		const openai = new OpenAI({
			apiKey: env.OPENAI_API_KEY,
			baseURL:
				"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai",
		});

		try {
			const chatCompletion = await openai.chat.completions.create({
				model: "gpt-4o-mini",
				messages: [{ role: "user", content: "What is a neuron?" }],
				max_tokens: 100,
			});

			const response = chatCompletion.choices[0].message;

			return new Response(JSON.stringify(response));
		} catch (e) {
			return new Response(String(e), { status: 500 });
		}
	},
};
```

## 5. 部署您的 Worker 应用程序

To deploy your Worker application, run the `npx wrangler deploy` command:

<PackageManagers type="exec" pkg="wrangler" args="deploy" />

您现在可以在 \<YOUR_WORKER>.\<YOUR_SUBDOMAIN>.workers.dev 上预览您的 Worker。

## 6. 查看您的 AI 网关

当您在 Cloudflare 仪表板中转到 AI 网关时，您应该会看到您最近的请求被记录下来。您还可以[调整您的设置](/ai-gateway/configuration/)来管理您的日志、缓存和速率限制设置。

---

# 教程

URL: https://developers.cloudflare.com/ai-gateway/tutorials/

import { GlossaryTooltip, ListTutorials, YouTubeVideos } from "~/components";

查看<GlossaryTooltip term="tutorial">教程</GlossaryTooltip>以帮助您开始使用 AI 网关。

## 文档

<ListTutorials />

## 视频

<YouTubeVideos products={["AI Gateway"]} />

---

# 审计日志

URL: https://developers.cloudflare.com/ai-gateway/reference/audit-logs/

[审计日志](/fundamentals/account/account-security/review-audit-logs/)提供您的 Cloudflare 账户内所做更改的全面摘要，包括对 AI 网关中的网关所做的更改。此功能在所有计划类型中均可用，免费提供，并且默认启用。

## 查看审计日志

要查看 AI 网关的审计日志：

1. 登录 [Cloudflare 仪表板](https://dash.cloudflare.com/login) 并选择您的账户。
2. 转到**管理账户** > **审计日志**。

有关如何访问和使用审计日志的更多信息，请参阅[查看审计日志文档](/fundamentals/account/account-security/review-audit-logs/)。

## 记录的操作

以下配置操作会被记录：

| 操作            | 描述           |
| --------------- | -------------- |
| gateway created | 创建新网关。   |
| gateway deleted | 删除现有网关。 |
| gateway updated | 编辑现有网关。 |

## 示例日志条目

以下是显示创建新网关的审计日志条目示例：

```json
{
	"action": {
		"info": "gateway created",
		"result": true,
		"type": "create"
	},
	"actor": {
		"email": "<ACTOR_EMAIL>",
		"id": "3f7b730e625b975bc1231234cfbec091",
		"ip": "fe32:43ed:12b5:526::1d2:13",
		"type": "user"
	},
	"id": "5eaeb6be-1234-406a-87ab-1971adc1234c",
	"interface": "UI",
	"metadata": {},
	"newValue": "",
	"newValueJson": {
		"cache_invalidate_on_update": false,
		"cache_ttl": 0,
		"collect_logs": true,
		"id": "test",
		"rate_limiting_interval": 0,
		"rate_limiting_limit": 0,
		"rate_limiting_technique": "fixed"
	},
	"oldValue": "",
	"oldValueJson": {},
	"owner": {
		"id": "1234d848c0b9e484dfc37ec392b5fa8a"
	},
	"resource": {
		"id": "89303df8-1234-4cfa-a0f8-0bd848e831ca",
		"type": "ai_gateway.gateway"
	},
	"when": "2024-07-17T14:06:11.425Z"
}
```

---

# 限制

URL: https://developers.cloudflare.com/ai-gateway/reference/limits/

import { Render } from "~/components";

以下限制适用于 Cloudflare 平台中的网关配置、日志和相关功能。

| 功能                                                                      | 限制                            |
| ------------------------------------------------------------------------- | ------------------------------- |
| [可缓存请求大小](/ai-gateway/configuration/caching/)                      | 每个请求 25 MB                  |
| [缓存 TTL](/ai-gateway/configuration/caching/#cache-ttl-cf-aig-cache-ttl) | 1 个月                          |
| [自定义元数据](/ai-gateway/configuration/custom-metadata/)                | 每个请求 5 个条目               |
| [数据集](/ai-gateway/evaluations/set-up-evaluations/)                     | 每个网关 10 个                  |
| 网关免费计划                                                              | 每个账户 10 个                  |
| 网关付费计划                                                              | 每个账户 20 个                  |
| 网关名称长度                                                              | 64 个字符                       |
| 日志存储速率限制                                                          | 每个网关每秒 500 条日志         |
| 日志存储[付费计划](/ai-gateway/reference/pricing/)                        | 每个网关 1000 万条 <sup>1</sup> |
| 日志存储[免费计划](/ai-gateway/reference/pricing/)                        | 每个账户 10 万条 <sup>2</sup>   |
| [日志大小存储](/ai-gateway/observability/logging/)                        | 每条日志 10 MB <sup>3</sup>     |
| [Logpush 作业](/ai-gateway/observability/logging/logpush/)                | 每个账户 4 个                   |
| [Logpush 大小限制](/ai-gateway/observability/logging/logpush/)            | 每条日志 1MB                    |

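<sup>1</sup> If you have reached the limit of 10 million logs stored per gateway, new logs stop being saved. To continue saving logs, you must delete older logs in that gateway to free up space or create a new gateway. Refer to [Auto Log Cleanup](/ai-gateway/observability/logging/#auto-log-cleanup) for more details on how to delete logs automatically.

<sup>2</sup> If you have reached the limit of 100,000 logs stored per account across all gateways, new logs stop being saved. To continue saving logs, you must delete older logs. Refer to [Auto Log Cleanup](/ai-gateway/observability/logging/#auto-log-cleanup) for more details on how to delete logs automatically.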

<sup>3</sup> 大于 10 MB 的日志将不会被存储。
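Some of these limits can be checked on the client before a request is sent. The following is a minimal sketch, not part of any Cloudflare SDK: the function name, the constant names, and the assumption that "25 MB" means binary megabytes are all illustrative.

```python
# Limits copied from the table above. All names here are hypothetical.
MAX_GATEWAY_NAME_LENGTH = 64                      # characters
MAX_METADATA_ENTRIES = 5                          # custom metadata entries per request
MAX_CACHEABLE_REQUEST_BYTES = 25 * 1024 * 1024    # 25 MB, assuming binary megabytes


def check_request(gateway_name: str, metadata: dict, body: bytes) -> list[str]:
    """Return a list of limit violations; the list is empty when none are found."""
    problems = []
    if len(gateway_name) > MAX_GATEWAY_NAME_LENGTH:
        problems.append("gateway name exceeds 64 characters")
    if len(metadata) > MAX_METADATA_ENTRIES:
        problems.append("more than 5 custom metadata entries")
    if len(body) > MAX_CACHEABLE_REQUEST_BYTES:
        problems.append("request body too large to be cached")
    return problems
```

A check like this only mirrors the documented numbers; the gateway itself remains the source of truth for enforcement.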

<Render file="limits-increase" product="ai-gateway" />

---

# Platform

URL: https://developers.cloudflare.com/ai-gateway/reference/

import { DirectoryListing } from "~/components";

<DirectoryListing />

---

# Pricing

URL: https://developers.cloudflare.com/ai-gateway/reference/pricing/

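AI Gateway is available on all plans.

The core features of AI Gateway that are available today are free. All you need is a Cloudflare account and one line of code to [get started](/ai-gateway/get-started/). Core features include dashboard analytics, caching, and rate limiting.

We will continue to build and expand AI Gateway. Some new features may be additional free core features, while others may be part of a premium plan. We will announce these as they become available.

You can monitor your usage in the AI Gateway dashboard.

## Persistent logs

:::note[Note]

Billing for persistent logs has not yet started. During this period, users on the paid plan can store more than the free allocation of 200,000 logs per month without being charged. (Users on the free plan remain limited to their plan's cap of 100,000 logs.) We will provide ample advance notice before we begin charging for persistent log storage.

:::

Persistent logs are available on all plans, with a free allocation for both the free and paid plans. Charges for additional logs beyond those limits are based on the number of logs stored per month.

### Free allocation and overage pricing

| Plan         | Free logs stored   | Overage pricing                      |
| ------------ | ------------------ | ------------------------------------ |
| Workers Free | 100,000 logs total | N/A - upgrade to Workers Paid        |
| Workers Paid | 200,000 logs total | $8 per 100,000 logs stored per month |

Allocations are based on the total number of logs stored across all gateways. For guidance on managing or deleting logs, refer to our [documentation](/ai-gateway/observability/logging).

For example, if you are a Workers Paid plan user storing 300,000 logs, you will be charged for the excess 100,000 logs (300,000 total logs - 200,000 free logs), which comes to $8 per month.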
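The arithmetic in the example above can be written out as a small calculation. This is a sketch under one stated assumption: the pricing table does not say how a partial block of 100,000 logs is billed, so linear proration is assumed here. The function and constant names are illustrative.

```python
# Figures taken from the pricing table above.
FREE_LOGS = {"free": 100_000, "paid": 200_000}
OVERAGE_USD_PER_100K = 8.0


def monthly_overage_usd(plan: str, logs_stored: int) -> float:
    """Estimate the monthly persistent-log overage charge in USD."""
    if plan == "free":
        # The free plan has no overage pricing; logs stop being saved at the cap.
        return 0.0
    excess = max(0, logs_stored - FREE_LOGS[plan])
    # Assumption: partial blocks of 100,000 logs are prorated linearly.
    return excess / 100_000 * OVERAGE_USD_PER_100K


print(monthly_overage_usd("paid", 300_000))  # the worked example: 100,000 excess logs
```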

## Logpush

Logpush is only available on the Workers Paid plan.

|          | Paid plan                            |
| -------- | ------------------------------------ |
| Requests | 10 million / month, +$0.05 / million |
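As a rough illustration of the Logpush row above, the sketch below estimates the charge for requests beyond the included 10 million per month. Linear proration of partial millions is an assumption, and the function name is hypothetical.

```python
INCLUDED_REQUESTS = 10_000_000   # included per month on the paid plan
USD_PER_EXTRA_MILLION = 0.05     # price for each additional million requests


def logpush_overage_usd(requests_per_month: int) -> float:
    """Estimate the monthly Logpush overage charge in USD."""
    extra = max(0, requests_per_month - INCLUDED_REQUESTS)
    # Assumption: partial millions are prorated linearly.
    return extra / 1_000_000 * USD_PER_EXTRA_MILLION
```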

## Fine print

Prices are subject to change. If you are an Enterprise customer, talk to your account team to confirm pricing details.

---

# Anthropic

URL: https://developers.cloudflare.com/ai-gateway/providers/anthropic/

import { Render } from "~/components";

[Anthropic](https://www.anthropic.com/) helps build reliable, interpretable, and steerable AI systems.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/anthropic
```

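## Prerequisites

When making requests to Anthropic, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Anthropic API token.
- The name of the Anthropic model you want to use.

## Examples

### cURL

```bash title="Example fetch request"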
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/anthropic/v1/messages \
 --header 'x-api-key: {anthropic_api_key}' \
 --header 'anthropic-version: 2023-06-01' \
 --header 'Content-Type: application/json' \
 --data  '{
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "What is Cloudflare?"}
    ]
  }'
```

### Use the Anthropic SDK with JavaScript

If you are using `@anthropic-ai/sdk`, you can set your endpoint like this:

```js title="JavaScript"
import Anthropic from "@anthropic-ai/sdk";

const apiKey = env.ANTHROPIC_API_KEY;
const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/anthropic`;

const anthropic = new Anthropic({
	apiKey,
	baseURL,
});

const model = "claude-3-opus-20240229";
const messages = [{ role: "user", content: "What is Cloudflare?" }];
const maxTokens = 1024;

const message = await anthropic.messages.create({
	model,
	messages,
	max_tokens: maxTokens,
});
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Anthropic",
		jsonexample: `
{
	"model": "anthropic/{model}"
}`

    }}

/>

---

# Azure OpenAI

URL: https://developers.cloudflare.com/ai-gateway/providers/azureopenai/

[Azure OpenAI](https://azure.microsoft.com/en-gb/products/ai-services/openai-service/) allows you to apply natural language algorithms on your data.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/azure-openai/{resource_name}/{deployment_name}
```

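## Prerequisites

When making requests to Azure OpenAI, you will need:

- AI Gateway account ID
- AI Gateway gateway name
- Azure OpenAI API key
- Azure OpenAI resource name
- Azure OpenAI deployment name (also known as the model name)

## URL structure

Your new base URL combines the data above in this structure: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/azure-openai/{resource_name}/{deployment_name}`. You can then append your endpoint and api-version to the end of the base URL, as in `.../chat/completions?api-version=2023-05-15`.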
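The URL assembly described above can be sketched as a small helper. The function name and its default arguments are illustrative; the `api-version` value is the one used in this page's examples and may need to be changed for your deployment.

```python
from urllib.parse import urlencode

GATEWAY_BASE = "https://gateway.ai.cloudflare.com/v1"


def azure_openai_url(
    account_id: str,
    gateway_id: str,
    resource_name: str,
    deployment_name: str,
    endpoint: str = "chat/completions",
    api_version: str = "2023-05-15",
) -> str:
    """Build the gateway URL for an Azure OpenAI deployment."""
    base = (
        f"{GATEWAY_BASE}/{account_id}/{gateway_id}"
        f"/azure-openai/{resource_name}/{deployment_name}"
    )
    # The endpoint and the api-version query parameter go after the base URL.
    return f"{base}/{endpoint}?{urlencode({'api-version': api_version})}"
```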

## Examples

### cURL

```bash title="Example fetch request"
curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/azure-openai/{resource_name}/{deployment_name}/chat/completions?api-version=2023-05-15' \
  --header 'Content-Type: application/json' \
  --header 'api-key: {azure_api_key}' \
  --data '{
  "messages": [
    {
      "role": "user",
      "content": "What is Cloudflare?"
    }
  ]
}'
```

### Use `openai-node` with JavaScript

If you are using the `openai-node` library, you can set your endpoint like this:

```js title="JavaScript"
import OpenAI from "openai";

const resource = "xxx";
const model = "xxx";
const apiVersion = "xxx";
const apiKey = env.AZURE_OPENAI_API_KEY;
const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/azure-openai/${resource}/${model}`;

const azure_openai = new OpenAI({
	apiKey,
	baseURL,
	defaultQuery: { "api-version": apiVersion },
	defaultHeaders: { "api-key": apiKey },
});
```

---

# Amazon Bedrock

URL: https://developers.cloudflare.com/ai-gateway/providers/bedrock/

[Amazon Bedrock](https://aws.amazon.com/bedrock/) allows you to build and scale generative AI applications with foundation models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/aws-bedrock
```

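## Prerequisites

When making requests to Amazon Bedrock, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Amazon Bedrock API token.
- The name of the Amazon Bedrock model you want to use.

## Make a request

When making requests to Amazon Bedrock, replace `https://bedrock-runtime.us-east-1.amazonaws.com/` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/aws-bedrock/bedrock-runtime/us-east-1/`, then add the model you want to run at the end of the URL.

With Bedrock, you need to sign the URL before you make requests to AI Gateway. You can try using the [`aws4fetch`](https://github.com/mhart/aws4fetch) SDK.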
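The URL replacement described above can be sketched as a pure string transformation. The helper name is hypothetical, and it assumes the standard `bedrock-runtime.<region>.amazonaws.com` host shape. It only rewrites the URL and does not sign anything.

```python
from urllib.parse import urlsplit


def bedrock_gateway_url(bedrock_url: str, account_id: str, gateway_id: str) -> str:
    """Rewrite a bedrock-runtime URL so the request goes through AI Gateway."""
    parts = urlsplit(bedrock_url)
    labels = (parts.hostname or "").split(".")
    # Expected host shape: bedrock-runtime.<region>.amazonaws.com
    if len(labels) != 4 or labels[0] != "bedrock-runtime" or labels[2:] != ["amazonaws", "com"]:
        raise ValueError(f"not a bedrock-runtime URL: {bedrock_url}")
    region = labels[1]
    return (
        f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}"
        f"/aws-bedrock/bedrock-runtime/{region}{parts.path}"
    )
```

The signature is computed over the original Bedrock URL, and the signed headers are then sent to the rewritten URL, which is the order the `aws4fetch` example on this page follows.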

## Examples

### Use the `aws4fetch` SDK with TypeScript

```typescript
import { AwsClient } from "aws4fetch";

interface Env {
	accessKey: string;
	secretAccessKey: string;
}

export default {
	async fetch(
		request: Request,
		env: Env,
		ctx: ExecutionContext,
	): Promise<Response> {
		// Replace with your configuration
		const cfAccountId = "{account_id}";
		const gatewayName = "{gateway_id}";
		const region = "us-east-1";

		// Added as secrets (https://developers.cloudflare.com/workers/configuration/secrets/)
		const accessKey = env.accessKey;
		const secretKey = env.secretAccessKey;

		const awsClient = new AwsClient({
			accessKeyId: accessKey,
			secretAccessKey: secretKey,
			region: region,
			service: "bedrock",
		});

		const requestBodyString = JSON.stringify({
			inputText: "What does ethereal mean?",
		});

		const stockUrl = new URL(
			`https://bedrock-runtime.${region}.amazonaws.com/model/amazon.titan-embed-text-v1/invoke`,
		);

		const headers = {
			"Content-Type": "application/json",
		};

		// Sign the original request
		const presignedRequest = await awsClient.sign(stockUrl.toString(), {
			method: "POST",
			headers: headers,
			body: requestBodyString,
		});

		// Gateway URL
		const gatewayUrl = new URL(
			`https://gateway.ai.cloudflare.com/v1/${cfAccountId}/${gatewayName}/aws-bedrock/bedrock-runtime/${region}/model/amazon.titan-embed-text-v1/invoke`,
		);

		// Make the request through the gateway URL
		const response = await fetch(gatewayUrl, {
			method: "POST",
			headers: presignedRequest.headers,
			body: requestBodyString,
		});

		if (
			response.ok &&
			response.headers.get("content-type")?.includes("application/json")
		) {
			const data = await response.json();
			return new Response(JSON.stringify(data));
		}

		return new Response("Invalid response", { status: 500 });
	},
};
```

---

# Cartesia

URL: https://developers.cloudflare.com/ai-gateway/providers/cartesia/

[Cartesia](https://docs.cartesia.ai/) provides advanced text-to-speech services with customizable voice models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cartesia
```

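## URL structure

When making requests to Cartesia, replace `https://api.cartesia.ai/v1` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cartesia`.

## Prerequisites

When making requests to Cartesia, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Cartesia API token.
- The model ID and voice ID for the Cartesia voice model you want to use.

## Examples

### cURL

```bash title="Request"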
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cartesia/tts/bytes \
  --header 'Content-Type: application/json' \
  --header 'Cartesia-Version: 2024-06-10' \
  --header 'X-API-Key: {cartesia_api_token}' \
  --data '{
    "transcript": "Welcome to Cloudflare - AI Gateway!",
    "model_id": "sonic-english",
    "voice": {
        "mode": "id",
        "id": "694f9389-aac1-45b6-b726-9d9369183238"
    },
    "output_format": {
        "container": "wav",
        "encoding": "pcm_f32le",
        "sample_rate": 44100
    }
}'
```

---

# Cerebras

URL: https://developers.cloudflare.com/ai-gateway/providers/cerebras/

import { Render } from "~/components";

[Cerebras](https://inference-docs.cerebras.ai/) offers developers a low-latency solution for AI model inference.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cerebras-ai
```

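## Prerequisites

When making requests to Cerebras, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Cerebras API token.
- The name of the Cerebras model you want to use.

## Examples

### cURL

```bash title="Example fetch request"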
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cerebras/chat/completions \
 --header 'content-type: application/json' \
 --header 'Authorization: Bearer CEREBRAS_TOKEN' \
 --data '{
    "model": "llama3.1-8b",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Cerebras",
		jsonexample: `
{
	"model": "cerebras/{model}"
}`

    }}

/>

---

# Cohere

URL: https://developers.cloudflare.com/ai-gateway/providers/cohere/

import { Render } from "~/components";

[Cohere](https://cohere.com/) builds AI models designed to solve real-world business challenges.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cohere
```

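## URL structure

When making requests to [Cohere](https://cohere.com/), replace `https://api.cohere.ai/v1` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cohere`.

## Prerequisites

When making requests to Cohere, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Cohere API token.
- The name of the Cohere model you want to use.

## Examples

### cURL

```bash title="Request"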
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cohere/v1/chat \
  --header 'Authorization: Token {cohere_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{
  "chat_history": [
    {"role": "USER", "message": "Who discovered gravity?"},
    {"role": "CHATBOT", "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton"}
  ],
  "message": "What year was he born?",
  "connectors": [{"id": "web-search"}]
}'
```

### Use the Cohere SDK with Python

If you are using [`cohere-python-sdk`](https://github.com/cohere-ai/cohere-python), set your endpoint like this:

```python title="Python"

import cohere
import os

api_key = os.getenv('API_KEY')
account_id = '{account_id}'
gateway_id = '{gateway_id}'
base_url = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/cohere/v1"

co = cohere.Client(
  api_key=api_key,
  base_url=base_url,
)

message = "hello world!"
model = "command-r-plus"

chat = co.chat(
  message=message,
  model=model
)

print(chat)

```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Cohere",
		jsonexample: `
{
	"model": "cohere/{model}"
}`

    }}

/>

---

# DeepSeek

URL: https://developers.cloudflare.com/ai-gateway/providers/deepseek/

import { Render } from "~/components";

[DeepSeek](https://www.deepseek.com/) helps you build quickly with DeepSeek's advanced AI models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/deepseek
```

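## Prerequisites

When making requests to DeepSeek, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active DeepSeek AI API token.
- The name of the DeepSeek AI model you want to use.

## URL structure

Your new base URL combines the data above in this structure:

`https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/deepseek/`.

You can then append the endpoint you want to hit, for example: `chat/completions`.

So your final URL will come together as:

`https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/deepseek/chat/completions`.

## Examples

### cURL

```bash title="Example fetch request"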
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/deepseek/chat/completions \
 --header 'content-type: application/json' \
 --header 'Authorization: Bearer DEEPSEEK_TOKEN' \
 --data '{
    "model": "deepseek-chat",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'
```

### Use DeepSeek with JavaScript

If you are using the OpenAI SDK, you can set your endpoint like this:

```js title="JavaScript"
import OpenAI from "openai";

const openai = new OpenAI({
	apiKey: env.DEEPSEEK_TOKEN,
	baseURL:
		"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/deepseek",
});

try {
	const chatCompletion = await openai.chat.completions.create({
		model: "deepseek-chat",
		messages: [{ role: "user", content: "What is Cloudflare?" }],
	});

	const response = chatCompletion.choices[0].message;

	return new Response(JSON.stringify(response));
} catch (e) {
	return new Response(e);
}
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "DeepSeek",
		jsonexample: `
{
	"model": "deepseek/{model}"
}`

    }}

/>

---

# ElevenLabs

URL: https://developers.cloudflare.com/ai-gateway/providers/elevenlabs/

[ElevenLabs](https://elevenlabs.io/) offers advanced text-to-speech services, with high-quality voice synthesis in multiple languages.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/elevenlabs
```

## Prerequisites

When making requests to ElevenLabs, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active ElevenLabs API token.
- The model ID of the ElevenLabs voice model you want to use.

## Examples

### cURL

```bash title="Request"
curl 'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/elevenlabs/v1/text-to-speech/JBFqnCBsd6RMkjVDRZzb?output_format=mp3_44100_128' \
  --header 'Content-Type: application/json' \
  --header 'xi-api-key: {elevenlabs_api_token}' \
  --data '{
    "text": "Welcome to Cloudflare - AI Gateway!",
    "model_id": "eleven_multilingual_v2"
}'
```

---

# Google AI Studio

URL: https://developers.cloudflare.com/ai-gateway/providers/google-ai-studio/

import { Render } from "~/components";

[Google AI Studio](https://ai.google.dev/aistudio) helps you build quickly with Google Gemini models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-ai-studio
```

## Prerequisites

When making requests to Google AI Studio, you will need:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Google AI Studio API token.
- The name of the Google AI Studio model you want to use.

## URL structure

Your new base URL combines the data above in this structure: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-ai-studio/`.

You can then append the endpoint you want to hit, for example: `v1/models/{model}:{generative_ai_rest_resource}`

So your final URL will come together as: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-ai-studio/v1/models/{model}:{generative_ai_rest_resource}`.
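The URL pieces above can be combined with a short helper. This is a sketch: the function name and the default `generateContent` resource are illustrative choices, not part of any Google or Cloudflare SDK.

```python
def google_ai_studio_url(
    account_id: str,
    gateway_id: str,
    model: str,
    resource: str = "generateContent",
) -> str:
    """Build the gateway URL for a Google AI Studio model and REST resource."""
    base = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-ai-studio"
    # The endpoint has the form v1/models/{model}:{generative_ai_rest_resource}.
    return f"{base}/v1/models/{model}:{resource}"
```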

## Examples

### cURL

```bash title="Example fetch request"
curl "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-ai-studio/v1/models/gemini-1.0-pro:generateContent" \
 --header 'content-type: application/json' \
 --header 'x-goog-api-key: {google_studio_api_key}' \
 --data '{
      "contents": [
          {
            "role":"user",
            "parts": [
              {"text":"What is Cloudflare?"}
            ]
          }
        ]
      }'
```

### Use `@google/generative-ai` with JavaScript

If you are using the `@google/generative-ai` package, you can set your endpoint like this:

```js title="JavaScript example"
import { GoogleGenerativeAI } from "@google/generative-ai";

const api_token = env.GOOGLE_AI_STUDIO_TOKEN;
const account_id = "{account_id}";
const gateway_name = "{gateway_id}";

const genAI = new GoogleGenerativeAI(api_token);
const model = genAI.getGenerativeModel(
	{ model: "gemini-1.5-flash" },
	{
		baseUrl: `https://gateway.ai.cloudflare.com/v1/${account_id}/${gateway_name}/google-ai-studio`,
	},
);

await model.generateContent(["What is Cloudflare?"]);
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Google AI Studio",
		jsonexample: `
{
	"model": "google-ai-studio/{model}"
}`

    }}

/>

---

# Grok

URL: https://developers.cloudflare.com/ai-gateway/providers/grok/

import { Render } from "~/components";

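[Grok](https://docs.x.ai/docs#getting-started) is a general-purpose model that can be used for a variety of tasks, including generating and understanding text, code, and function calling.

## Endpoint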

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok
```

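## URL structure

When making requests to [Grok](https://docs.x.ai/docs#getting-started), replace `https://api.x.ai/v1` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok`.

## Prerequisites

When making requests to Grok, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Grok API token.
- The name of the Grok model you want to use.

## Examples

### cURL

```bash title="Request"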
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok/v1/chat/completions \
  --header 'content-type: application/json' \
  --header 'Authorization: Bearer {grok_api_token}' \
  --data '{
    "model": "grok-beta",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'
```

### Use the OpenAI SDK with JavaScript

If you are using the OpenAI SDK with JavaScript, you can set your endpoint like this:

```js title="JavaScript"
import OpenAI from "openai";

const openai = new OpenAI({
	apiKey: "<api key>",
	baseURL:
		"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok",
});

const completion = await openai.chat.completions.create({
	model: "grok-beta",
	messages: [
		{
			role: "system",
			content:
				"You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
		},
		{
			role: "user",
			content: "What is the meaning of life, the universe, and everything?",
		},
	],
});

console.log(completion.choices[0].message);
```

### Use the OpenAI SDK with Python

If you are using the OpenAI SDK with Python, you can set your endpoint like this:

```python title="Python"
import os
from openai import OpenAI

XAI_API_KEY = os.getenv("XAI_API_KEY")
client = OpenAI(
    api_key=XAI_API_KEY,
    base_url="https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok",
)

completion = client.chat.completions.create(
    model="grok-beta",
    messages=[
        {"role": "system", "content": "You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy."},
        {"role": "user", "content": "What is the meaning of life, the universe, and everything?"},
    ],
)

print(completion.choices[0].message)
```

### Use the Anthropic SDK with JavaScript

If you are using the Anthropic SDK with JavaScript, you can set your endpoint like this:

```js title="JavaScript"
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
	apiKey: "<api key>",
	baseURL:
		"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok",
});

const msg = await anthropic.messages.create({
	model: "grok-beta",
	max_tokens: 128,
	system:
		"You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
	messages: [
		{
			role: "user",
			content: "What is the meaning of life, the universe, and everything?",
		},
	],
});

console.log(msg);
```

### Use the Anthropic SDK with Python

If you are using the Anthropic SDK with Python, you can set your endpoint like this:

```python title="Python"
import os
from anthropic import Anthropic

XAI_API_KEY = os.getenv("XAI_API_KEY")
client = Anthropic(
    api_key=XAI_API_KEY,
    base_url="https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/grok",
)

message = client.messages.create(
    model="grok-beta",
    max_tokens=128,
    system="You are Grok, a chatbot inspired by the Hitchhiker's Guide to the Galaxy.",
    messages=[
        {
            "role": "user",
            "content": "What is the meaning of life, the universe, and everything?",
        },
    ],
)

print(message.content)
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Grok",
		jsonexample: `
{
	"model": "grok/{model}"
}`

    }}

/>

---

# Groq

URL: https://developers.cloudflare.com/ai-gateway/providers/groq/

import { Render } from "~/components";

[Groq](https://groq.com/) delivers high-speed processing and low-latency performance.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/groq
```

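## URL structure

When making requests to [Groq](https://groq.com/), replace `https://api.groq.com/openai/v1` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/groq`.

## Prerequisites

When making requests to Groq, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Groq API token.
- The name of the Groq model you want to use.

## Examples

### cURL

```bash title="Example fetch request"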
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/groq/chat/completions \
  --header 'Authorization: Bearer {groq_api_key}' \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "What is Cloudflare?"
      }
    ],
    "model": "llama3-8b-8192"
}'
```

### Use the Groq SDK with JavaScript

If you are using [`groq-sdk`](https://www.npmjs.com/package/groq-sdk), set your endpoint like this:

```js title="JavaScript"
import Groq from "groq-sdk";

const apiKey = env.GROQ_API_KEY;
const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/groq`;

const groq = new Groq({
	apiKey,
	baseURL,
});

const messages = [{ role: "user", content: "What is Cloudflare?" }];
const model = "llama3-8b-8192";

const chatCompletion = await groq.chat.completions.create({
	messages,
	model,
});
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Groq",
		jsonexample: `
{
	"model": "groq/{model}"
}`

    }}

/>

---

# Model providers

URL: https://developers.cloudflare.com/ai-gateway/providers/

Here is a quick list of the providers we support:

import { DirectoryListing } from "~/components";

<DirectoryListing />

---

# Mistral AI

URL: https://developers.cloudflare.com/ai-gateway/providers/mistral/

import { Render } from "~/components";

[Mistral AI](https://mistral.ai) helps you build quickly with Mistral's advanced AI models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/mistral
```

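## Prerequisites

When making requests to Mistral AI, you will need:

- AI Gateway account ID
- AI Gateway gateway name
- Mistral AI API token
- Mistral AI model name

## URL structure

Your new base URL combines the data above in this structure: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/mistral/`.

You can then append the endpoint you want to hit, for example: `v1/chat/completions`

So your final URL will come together as: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/mistral/v1/chat/completions`.

## Examples

### cURL

```bash title="Example fetch request"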
curl -X POST https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/mistral/v1/chat/completions \
 --header 'content-type: application/json' \
 --header 'Authorization: Bearer MISTRAL_TOKEN' \
 --data '{
    "model": "mistral-large-latest",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'
```

### Use the `@mistralai/mistralai` package with JavaScript

If you are using the `@mistralai/mistralai` package, you can set your endpoint like this:

```js title="JavaScript example"
import { Mistral } from "@mistralai/mistralai";

const client = new Mistral({
	apiKey: MISTRAL_TOKEN,
	serverURL: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/mistral`,
});

await client.chat.complete({
	model: "mistral-large-latest",
	messages: [
		{
			role: "user",
			content: "What is Cloudflare?",
		},
	],
});
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Mistral",
		jsonexample: `
{
	"model": "mistral/{model}"
}`

    }}

/>

---

# HuggingFace

URL: https://developers.cloudflare.com/ai-gateway/providers/huggingface/

[HuggingFace](https://huggingface.co/) helps users build, deploy, and train machine learning models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/huggingface
```

## URL structure

When making requests to the HuggingFace Inference API, replace `https://api-inference.huggingface.co/models/` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/huggingface`. Note that the model you are trying to access should come right after, for example `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/huggingface/bigcode/starcoder`.
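The prefix replacement above can be sketched as follows. The helper name is hypothetical; it simply swaps the Inference API prefix for the gateway prefix and keeps the model path.

```python
HF_PREFIX = "https://api-inference.huggingface.co/models/"


def huggingface_gateway_url(hf_url: str, account_id: str, gateway_id: str) -> str:
    """Rewrite a HuggingFace Inference API URL so it goes through AI Gateway."""
    if not hf_url.startswith(HF_PREFIX):
        raise ValueError(f"not a HuggingFace Inference API URL: {hf_url}")
    model = hf_url[len(HF_PREFIX):]  # for example "bigcode/starcoder"
    return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/huggingface/{model}"
```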

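## Prerequisites

When making requests to HuggingFace, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active HuggingFace API token.
- The name of the HuggingFace model you want to use.

## Examples

### cURL

```bash title="Request"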
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/huggingface/bigcode/starcoder \
  --header 'Authorization: Bearer {hf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{
    "inputs": "console.log"
}'
```

### Use the HuggingFace.js library with JavaScript

If you are using the HuggingFace.js library, you can set your inference endpoint like this:

```js title="JavaScript"
import { HfInferenceEndpoint } from "@huggingface/inference";

const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const model = "gpt2";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/huggingface/${model}`;
const apiToken = env.HF_API_TOKEN;

const hf = new HfInferenceEndpoint(baseURL, apiToken);
```

---

# OpenAI

URL: https://developers.cloudflare.com/ai-gateway/providers/openai/

[OpenAI](https://openai.com/about/) helps you build with ChatGPT.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai
```

### Chat completions endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions
```

### Responses endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/responses
```

## URL structure

When making requests to OpenAI, replace `https://api.openai.com/v1` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai`.
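The base URL swap above can be sketched as a string rewrite that leaves the rest of the path untouched. The helper name is illustrative and not part of the OpenAI SDK.

```python
OPENAI_BASE = "https://api.openai.com/v1"


def openai_gateway_url(openai_url: str, account_id: str, gateway_id: str) -> str:
    """Swap the OpenAI base URL for the AI Gateway base URL, keeping the path."""
    if not openai_url.startswith(OPENAI_BASE + "/"):
        raise ValueError(f"not an OpenAI API URL: {openai_url}")
    gateway_base = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai"
    return gateway_base + openai_url[len(OPENAI_BASE):]
```

With an SDK, the same effect usually comes from passing the gateway base URL as the client's base URL option instead of rewriting each request.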

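## Prerequisites

When making requests to OpenAI, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active OpenAI API token.
- The name of the OpenAI model you want to use.

## Chat completions endpoint

### cURL example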

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
--header 'Authorization: Bearer {openai_token}' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": "What is Cloudflare?"
    }
  ]
}'
```

### JavaScript SDK example

```js
import OpenAI from "openai";

const apiKey = "my api key"; // or process.env["OPENAI_API_KEY"]
const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`;

const openai = new OpenAI({
	apiKey,
	baseURL,
});

try {
	const model = "gpt-3.5-turbo-0613";
	const messages = [{ role: "user", content: "What is a neuron?" }];
	const maxTokens = 100;
	const chatCompletion = await openai.chat.completions.create({
		model,
		messages,
		max_tokens: maxTokens,
	});
	const response = chatCompletion.choices[0].message;
	console.log(response);
} catch (e) {
	console.error(e);
}
```

## OpenAI Responses endpoint

### cURL example

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/responses \
--header 'Authorization: Bearer {openai_token}' \
--header 'Content-Type: application/json' \
--data '{
  "model": "gpt-4.1",
  "input": [
    {
      "role": "user",
      "content": "Write a one-sentence bedtime story about a unicorn."
    }
  ]
}'
```

### JavaScript SDK example

```js
import OpenAI from "openai";

const apiKey = "my api key"; // or process.env["OPENAI_API_KEY"]
const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/openai`;

const openai = new OpenAI({
	apiKey,
	baseURL,
});

try {
	const model = "gpt-4.1";
	const input = [
		{
			role: "user",
			content: "Write a one-sentence bedtime story about a unicorn.",
		},
	];
	const response = await openai.responses.create({
		model,
		input,
	});
	console.log(response.output_text);
} catch (e) {
	console.error(e);
}
```

---

# OpenRouter

URL: https://developers.cloudflare.com/ai-gateway/providers/openrouter/

[OpenRouter](https://openrouter.ai/) is a platform that provides a unified interface for accessing and using large language models (LLMs).

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openrouter
```

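## URL structure

When making requests to [OpenRouter](https://openrouter.ai/), replace `https://openrouter.ai/api/v1/chat/completions` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openrouter`.

## Prerequisites

When making requests to OpenRouter, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active OpenRouter API token or a token from the original model provider.
- The name of the OpenRouter model you want to use.

## Examples

### cURL

```bash title="Request"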
curl -X POST https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/openrouter/v1/chat/completions \
 --header 'content-type: application/json' \
 --header 'Authorization: Bearer OPENROUTER_TOKEN' \
 --data '{
    "model": "openai/gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "What is Cloudflare?"
        }
    ]
}'

```

### Use the OpenAI SDK with JavaScript

If you are using the OpenAI SDK with JavaScript, you can set your endpoint like this:

```js title="JavaScript"
import OpenAI from "openai";

const openai = new OpenAI({
	apiKey: env.OPENROUTER_TOKEN,
	baseURL:
		"https://gateway.ai.cloudflare.com/v1/ACCOUNT_TAG/GATEWAY/openrouter",
});

try {
	const chatCompletion = await openai.chat.completions.create({
		model: "openai/gpt-3.5-turbo",
		messages: [{ role: "user", content: "What is Cloudflare?" }],
	});

	const response = chatCompletion.choices[0].message;

	return new Response(JSON.stringify(response));
} catch (e) {
	return new Response(e);
}
```

---

# Perplexity

URL: https://developers.cloudflare.com/ai-gateway/providers/perplexity/

import { Render } from "~/components";

[Perplexity](https://www.perplexity.ai/) is an AI-powered answer engine.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/perplexity-ai
```

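## Prerequisites

When making requests to Perplexity, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Perplexity API token.
- The name of the Perplexity model you want to use.

## Examples

### cURL

```bash title="Example fetch request"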
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/perplexity-ai/chat/completions \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --header 'Authorization: Bearer {perplexity_token}' \
     --data '{
      "model": "mistral-7b-instruct",
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }'
```

### Use Perplexity through the OpenAI SDK with JavaScript

Perplexity does not have its own SDK, but it is compatible with the OpenAI SDK. You can use the OpenAI SDK to make a Perplexity call through AI Gateway as follows:

```js title="JavaScript"
import OpenAI from "openai";

const apiKey = env.PERPLEXITY_API_KEY;
const accountId = "{account_id}";
const gatewayId = "{gateway_id}";
const baseURL = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/perplexity-ai`;

const perplexity = new OpenAI({
	apiKey,
	baseURL,
});

const model = "mistral-7b-instruct";
const messages = [{ role: "user", content: "What is Cloudflare?" }];
const maxTokens = 20;

const chatCompletion = await perplexity.chat.completions.create({
	model,
	messages,
	max_tokens: maxTokens,
});
```

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Perplexity",
		jsonexample: `
{
	"model": "perplexity/{model}"
}`

    }}

/>

---

# Replicate

URL: https://developers.cloudflare.com/ai-gateway/providers/replicate/

[Replicate](https://replicate.com/) runs and fine-tunes open-source models.

## Endpoint

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/replicate
```

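## URL structure

When making requests to Replicate, replace `https://api.replicate.com/v1` in the URL you are currently using with `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/replicate`.

## Prerequisites

When making requests to Replicate, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Replicate API token.
- The name of the Replicate model you want to use.

## Examples

### cURL

```bash title="Request"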
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/replicate/predictions \
  --header 'Authorization: Token {replicate_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{
    "input":
      {
        "prompt": "What is Cloudflare?"
      }
    }'
```

---

# Google Vertex AI

URL: https://developers.cloudflare.com/ai-gateway/providers/vertex/

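[Google Vertex AI](https://cloud.google.com/vertex-ai) enables developers to easily build and deploy enterprise-ready generative AI experiences.

Below is a quick guide to setting up your Google Cloud account:

1. Google Cloud Platform (GCP) account

   - Sign up for a [GCP account](https://cloud.google.com/vertex-ai). New users may be eligible for credits (valid for 90 days).

2. Enable the Vertex AI API

   - Navigate to [Enable Vertex AI API](https://console.cloud.google.com/marketplace/product/google/aiplatform.googleapis.com) and activate the API for your project.

3. Apply for access to the models you need.

## Endpoint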

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-vertex-ai
```

## Prerequisites

When making requests to Google Vertex, you will need:

- AI Gateway account tag
- AI Gateway gateway name
- Google Vertex API key
- Google Vertex project name
- Google Vertex region (for example, us-east4)
- Google Vertex model

## URL structure

Your new base URL is built from the data above in this structure: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-vertex-ai/v1/projects/{project_name}/locations/{region}`.

You can then append the endpoint you want to reach, for example: `/publishers/google/models/{model}:{generative_ai_rest_resource}`

Your final URL then comes together as: `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-vertex-ai/v1/projects/{project_name}/locations/{region}/publishers/google/models/gemini-1.0-pro-001:generateContent`
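
The URL assembly above can be sketched as follows. The helper name `buildVertexUrl` and its parameter names are hypothetical, and the sample values are placeholders.

```javascript
// Hypothetical helper: assembles the AI Gateway URL for a Vertex AI model call.
function buildVertexUrl({ accountId, gatewayId, projectName, region, model, resource }) {
	const base = `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/google-vertex-ai`;
	const location = `/v1/projects/${projectName}/locations/${region}`;
	const endpoint = `/publishers/google/models/${model}:${resource}`;
	return base + location + endpoint;
}

console.log(
	buildVertexUrl({
		accountId: "my-account-id", // placeholder values
		gatewayId: "my-gateway",
		projectName: "my-project",
		region: "us-east4",
		model: "gemini-1.0-pro-001",
		resource: "generateContent",
	}),
);
```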

## Example

### cURL

```bash title="Example request"
curl "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/google-vertex-ai/v1/projects/{project_name}/locations/{region}/publishers/google/models/gemini-1.0-pro-001:generateContent" \
    -H "Authorization: Bearer {vertex_api_key}" \
    -H 'Content-Type: application/json' \
    -d '{
        "contents": {
          "role": "user",
          "parts": [
            {
              "text": "Tell me more about Cloudflare"
            }
          ]
        }
      }'
```

---

# Workers AI

URL: https://developers.cloudflare.com/ai-gateway/providers/workersai/

import { Render } from "~/components";

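Use AI Gateway for analytics, caching, and security on requests to [Workers AI](/workers-ai/). Workers AI integrates seamlessly with AI Gateway, allowing you to execute AI inference via API requests or through an environment binding for Workers scripts. The binding simplifies the process by routing requests through your AI Gateway with minimal setup.

## Prerequisites

When making requests to Workers AI, ensure you have the following:

- Your AI Gateway account ID.
- Your AI Gateway gateway name.
- An active Workers AI API token.
- The name of the Workers AI model you want to use.

## REST API

To interact with the REST API, update the URL used for your requests:

- **Previous**:

```txt
https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model_id}
```

- **New**:

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/{model_id}
```

For these parameters:

- `{account_id}` is your Cloudflare [account ID](/workers-ai/get-started/rest-api/#1-get-api-token-and-account-id).
- `{gateway_id}` refers to the name of your existing [AI Gateway](/ai-gateway/get-started/#create-gateway).
- `{model_id}` refers to the model ID of the [Workers AI model](/workers-ai/models/).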
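
As a rough sketch, the new URL and a matching `fetch()` call can be assembled like this. The helper `buildWorkersAiRequest` is hypothetical, and it assumes Bearer-token authorization with a JSON request body; the account ID, gateway name, and token are placeholders.

```javascript
// Hypothetical helper: builds the URL and fetch() options for a Workers AI call routed through AI Gateway.
function buildWorkersAiRequest({ accountId, gatewayId, modelId, apiToken, input }) {
	return {
		url: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/workers-ai/${modelId}`,
		init: {
			method: "POST",
			headers: {
				Authorization: `Bearer ${apiToken}`,
				"Content-Type": "application/json",
			},
			body: JSON.stringify(input),
		},
	};
}

const { url, init } = buildWorkersAiRequest({
	accountId: "my-account-id", // placeholder values
	gatewayId: "my-gateway",
	modelId: "@cf/meta/llama-3.1-8b-instruct",
	apiToken: "CF_API_TOKEN",
	input: { prompt: "What is Cloudflare?" },
});
console.log(url);
// To send the request: const response = await fetch(url, init);
```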

## Examples

First, generate an [API token](/fundamentals/api/get-started/create-token/) with `Workers AI Read` access and use it in your request.

```bash title="Request to Workers AI llama model"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/meta/llama-3.1-8b-instruct \
 --header 'Authorization: Bearer {cf_api_token}' \
 --header 'Content-Type: application/json' \
 --data '{"prompt": "What is Cloudflare?"}'
```

```bash title="Request to Workers AI text classification model"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/@cf/huggingface/distilbert-sst-2-int8 \
  --header 'Authorization: Bearer {cf_api_token}' \
  --header 'Content-Type: application/json' \
  --data '{ "text": "Cloudflare docs are amazing!" }'
```

### OpenAI compatible endpoints

<Render file="openai-compatibility" product="workers-ai" /> <br />

```bash title="Request to OpenAI compatible endpoint"
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/v1/chat/completions \
 --header 'Authorization: Bearer {cf_api_token}' \
 --header 'Content-Type: application/json' \
 --data '{
      "model": "@cf/meta/llama-3.1-8b-instruct",
      "messages": [
        {
          "role": "user",
          "content": "What is Cloudflare?"
        }
      ]
    }
'
```

## Workers Binding

You can integrate Workers AI with AI Gateway using an environment binding. To include an AI Gateway within your Worker, add the gateway as an object in your Workers AI request.

```ts
export interface Env {
	AI: Ai;
}

export default {
	async fetch(request: Request, env: Env): Promise<Response> {
		const response = await env.AI.run(
			"@cf/meta/llama-3.1-8b-instruct",
			{
				prompt: "Why should you use Cloudflare for your AI inference?",
			},
			{
				gateway: {
					id: "{gateway_id}",
					skipCache: false,
					cacheTtl: 3360,
				},
			},
		);
		return new Response(JSON.stringify(response));
	},
} satisfies ExportedHandler<Env>;
```

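For a detailed step-by-step guide on integrating Workers AI with AI Gateway using a binding, refer to [Integrations in AI Gateway](/ai-gateway/integrations/aig-workers-ai-binding/).

Workers AI supports the following parameters for AI gateways:

- `id` string
  - Name of your existing [AI Gateway](/ai-gateway/get-started/#create-gateway). Must be in the same account as your Worker.
- `skipCache` boolean (default: false)
  - Controls whether the request should [skip the cache](/ai-gateway/configuration/caching/#skip-cache-cf-aig-skip-cache).
- `cacheTtl` number
  - Controls the [Cache TTL](/ai-gateway/configuration/caching/#cache-ttl-cf-aig-cache-ttl).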
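
One way to keep these parameters consistent across calls is a small validating builder. This is only a sketch: `buildGatewayOptions` is a hypothetical helper, and the rule that `cacheTtl` must be a non-negative number is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: validates and builds the options object passed as the third argument to env.AI.run().
function buildGatewayOptions({ id, skipCache = false, cacheTtl } = {}) {
	if (typeof id !== "string" || id.length === 0) {
		throw new TypeError("gateway.id must be a non-empty string");
	}
	if (typeof skipCache !== "boolean") {
		throw new TypeError("gateway.skipCache must be a boolean");
	}
	const gateway = { id, skipCache };
	if (cacheTtl !== undefined) {
		// Assumption: a TTL is a finite, non-negative number.
		if (!Number.isFinite(cacheTtl) || cacheTtl < 0) {
			throw new TypeError("gateway.cacheTtl must be a non-negative number");
		}
		gateway.cacheTtl = cacheTtl;
	}
	return { gateway };
}

console.log(buildGatewayOptions({ id: "my-gateway", cacheTtl: 3360 }));
```

Inside a Worker, the result could be passed straight through, for example `env.AI.run(model, input, buildGatewayOptions({ id: "my-gateway" }))`.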

<Render
	file="chat-completions-providers"
	product="ai-gateway"
	params={{
		name: "Workers AI",
		jsonexample: `
{
	"model": "workers-ai/{model}"
}`

    }}

/>

---

# WebSockets API

URL: https://developers.cloudflare.com/ai-gateway/websockets-api/

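The AI Gateway WebSockets API provides a persistent connection for AI interactions, eliminating repeated handshakes and reducing latency. This API is divided into two categories:

- **Realtime APIs** - Designed for AI providers that offer low-latency, multimodal interactions over WebSockets.
- **Non-Realtime APIs** - Supports standard WebSocket communication for AI providers, including those that do not natively support WebSockets.

## When to use WebSockets

WebSockets are long-lived TCP connections that enable bi-directional, real-time and non-real-time communication between client and server. Unlike HTTP connections, which require repeated handshakes for each request, WebSockets maintain the connection, supporting continuous data exchange with reduced overhead. WebSockets are ideal for applications needing low-latency, real-time data, such as voice assistants.

## Key benefits

- **Reduced overhead**: Avoid the overhead of repeated handshakes and TLS negotiations by maintaining a single, persistent connection.
- **Provider compatibility**: Works with all AI providers in AI Gateway. Even if your chosen provider does not support WebSockets, Cloudflare handles it for you, managing the requests to your preferred AI provider.

## Key differences

| Feature                 | Realtime APIs                                                                                                      | Non-Realtime APIs                                                                                |
| :---------------------- | :----------------------------------------------------------------------------------------------------------------- | :----------------------------------------------------------------------------------------------- |
| **Purpose**             | Enables real-time, multimodal AI interactions for providers that offer dedicated WebSocket endpoints.              | Supports WebSocket-based AI interactions with providers that do not natively support WebSockets. |
| **Use case**            | Streaming responses for voice, video, and live interactions.                                                       | Text-based queries and responses, such as LLM requests.                                          |
| **AI provider support** | [Limited to providers offering real-time WebSocket APIs.](/ai-gateway/websockets-api/realtime-api/#supported-providers) | [All AI providers in AI Gateway.](/ai-gateway/providers/)                                        |
| **Streaming support**   | Providers natively support real-time data streaming.                                                               | AI Gateway handles streaming via WebSockets.                                                     |

For details about implementation, refer to the following sections:

- [Realtime WebSockets API](/ai-gateway/websockets-api/realtime-api/)
- [Non-Realtime WebSockets API](/ai-gateway/websockets-api/non-realtime-api/)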

---

# Non-realtime WebSockets API

URL: https://developers.cloudflare.com/ai-gateway/websockets-api/non-realtime-api/

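The Non-realtime WebSockets API allows you to establish persistent connections for AI requests without repeated handshakes. This approach is ideal for applications that do not require real-time interactions but still benefit from reduced latency and continuous communication.

## Set up WebSockets API

1. Generate an AI Gateway token with the appropriate AI Gateway Run permission and opt in to using an authenticated gateway.
2. Modify your Universal Endpoint URL by replacing `https://` with `wss://` to initiate a WebSocket connection:
   ```txt
   wss://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}
   ```
3. Open a WebSocket connection authenticated with a Cloudflare token that has the AI Gateway Run permission.

:::note
Alternatively, we also support authentication via the `sec-websocket-protocol` header if you are using a browser WebSocket.
:::

## Example request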

```javascript
import WebSocket from "ws";

const ws = new WebSocket(
	"wss://gateway.ai.cloudflare.com/v1/my-account-id/my-gateway/",
	{
		headers: {
			"cf-aig-authorization": "Bearer AI_GATEWAY_TOKEN",
		},
	},
);

// Wait for the connection to open before sending; with `ws`, calling send() earlier throws.
ws.on("open", () => {
	ws.send(
		JSON.stringify({
			type: "universal.create",
			request: {
				eventId: "my-request",
				provider: "workers-ai",
				endpoint: "@cf/meta/llama-3.1-8b-instruct",
				headers: {
					Authorization: "Bearer WORKERS_AI_TOKEN",
					"Content-Type": "application/json",
				},
				query: {
					prompt: "tell me a joke",
				},
			},
		}),
	);
});

ws.on("message", function incoming(message) {
	console.log(message.toString());
});
```

## Example response

```json
{
	"type": "universal.created",
	"metadata": {
		"cacheStatus": "MISS",
		"eventId": "my-request",
		"logId": "01JC3R94FRD97JBCBX3S0ZAXKW",
		"step": "0",
		"contentType": "application/json"
	},
	"response": {
		"result": {
			"response": "Why was the math book sad? Because it had too many problems. Would you like to hear another one?"
		},
		"success": true,
		"errors": [],
		"messages": []
	}
}
```

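## Example streaming request

For streaming requests, AI Gateway sends an initial message containing the request metadata, which indicates that the stream has started: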

```json
{
	"type": "universal.created",
	"metadata": {
		"cacheStatus": "MISS",
		"eventId": "my-request",
		"logId": "01JC40RB3NGBE5XFRZGBN07572",
		"step": "0",
		"contentType": "text/event-stream"
	}
}
```

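After this initial message, all streaming chunks are relayed in real time to the WebSocket connection as they arrive from the inference provider. Only the `eventId` field is included in the metadata for these streaming chunks. The `eventId` lets AI Gateway include a client-defined ID with each message, even in a streaming WebSocket environment.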

```json
{
	"type": "universal.stream",
	"metadata": {
		"eventId": "my-request"
	},
	"response": {
		"response": "would"
	}
}
```

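Once all chunks for a request have been streamed, AI Gateway sends a final message to signal that the request is complete. For added flexibility, this message includes all the metadata again, even though it was initially provided at the start of the stream.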

```json
{
	"type": "universal.done",
	"metadata": {
		"cacheStatus": "MISS",
		"eventId": "my-request",
		"logId": "01JC40RB3NGBE5XFRZGBN07572",
		"step": "0",
		"contentType": "text/event-stream"
	}
}
```
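
The three message types above can be handled with a small collector that groups chunks by `eventId`. This is a minimal sketch: `createStreamCollector` is a hypothetical helper, it only covers streaming requests, and it assumes each chunk carries its text in `response.response`, as in the Workers AI example above. Other providers may use a different chunk shape.

```javascript
// Hypothetical helper: collects streamed chunks per eventId until universal.done arrives.
function createStreamCollector(onComplete) {
	const buffers = new Map(); // eventId -> array of text chunks

	return function handleMessage(raw) {
		const message = JSON.parse(raw);
		const eventId = message.metadata?.eventId;

		switch (message.type) {
			case "universal.created":
				// Start of a stream: prepare an empty buffer for this request.
				buffers.set(eventId, []);
				break;
			case "universal.stream":
				if (!buffers.has(eventId)) buffers.set(eventId, []);
				// Assumption: the chunk text lives in response.response.
				buffers.get(eventId).push(message.response?.response ?? "");
				break;
			case "universal.done":
				// End of the stream: hand over the joined text and the full metadata.
				onComplete(eventId, (buffers.get(eventId) ?? []).join(""), message.metadata);
				buffers.delete(eventId);
				break;
		}
	};
}

// Usage with the `ws` client from the example request:
// const handle = createStreamCollector((id, text) => console.log(id, text));
// ws.on("message", (data) => handle(data.toString()));
```

Keying the buffers by `eventId` lets several requests share one connection without their chunks being mixed together.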

---

# Realtime WebSockets API

URL: https://developers.cloudflare.com/ai-gateway/websockets-api/realtime-api/

Some AI providers support real-time, low-latency interactions over WebSockets. AI Gateway allows seamless integration with these APIs, supporting multimodal interactions such as text, audio, and video.

## Supported providers

- [OpenAI](https://platform.openai.com/docs/guides/realtime-websocket)
- [Google AI Studio](https://ai.google.dev/gemini-api/docs/multimodal-live)
- [Cartesia](https://docs.cartesia.ai/api-reference/tts/tts)
- [ElevenLabs](https://elevenlabs.io/docs/conversational-ai/api-reference/conversational-ai/websocket)

## Authentication

For real-time WebSockets, authentication can be done using:

- Headers (for non-browser environments)
- `sec-websocket-protocol` (for browsers)
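
For the browser case, the credentials travel as WebSocket subprotocols. The sketch below builds that list, assuming the `name.value` format used in the provider examples on this page (for example `cf-aig-authorization.<cloudflare_token>`). The helper `buildAuthProtocols` is hypothetical.

```javascript
// Hypothetical helper: builds the subprotocol list passed as the second argument to a browser WebSocket.
function buildAuthProtocols(cloudflareToken, providerCredentials = {}) {
	// Each provider credential becomes a "name.value" entry, e.g. "xi-api-key.<key>".
	const protocols = Object.entries(providerCredentials).map(
		([name, value]) => `${name}.${value}`,
	);
	protocols.push(`cf-aig-authorization.${cloudflareToken}`);
	return protocols;
}

console.log(buildAuthProtocols("CLOUDFLARE_TOKEN", { "xi-api-key": "ELEVENLABS_KEY" }));
// In a browser: new WebSocket(url, buildAuthProtocols(token));
```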

## Examples

### OpenAI

```javascript
import WebSocket from "ws";

const url =
	"wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/openai?model=gpt-4o-realtime-preview-2024-12-17";
const ws = new WebSocket(url, {
	headers: {
		"cf-aig-authorization": process.env.CLOUDFLARE_API_KEY,
		Authorization: "Bearer " + process.env.OPENAI_API_KEY,
		"OpenAI-Beta": "realtime=v1",
	},
});

ws.on("open", () => {
	console.log("Connected to server.");
	// Send only after the connection is open; with `ws`, calling send() earlier throws.
	ws.send(
		JSON.stringify({
			type: "response.create",
			response: { modalities: ["text"], instructions: "Tell me a joke" },
		}),
	);
});
ws.on("message", (message) => console.log(JSON.parse(message.toString())));
```

### Google AI Studio

```javascript
const ws = new WebSocket(
	"wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/google?api_key=<google_api_key>",
	["cf-aig-authorization.<cloudflare_token>"],
);

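// Browser WebSocket API: listen with addEventListener and read the payload from event.data.
ws.addEventListener("open", () => {
	console.log("Connected to server.");
	// Send the setup message only once the connection is open.
	ws.send(
		JSON.stringify({
			setup: {
				model: "models/gemini-2.0-flash-exp",
				generationConfig: { responseModalities: ["TEXT"] },
			},
		}),
	);
});
ws.addEventListener("message", (event) => console.log(event.data));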
```

### Cartesia

```javascript
const ws = new WebSocket(
	"wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/cartesia?cartesia_version=2024-06-10&api_key=<cartesia_api_key>",
	["cf-aig-authorization.<cloudflare_token>"],
);

// Browser WebSocket API: listen with addEventListener and read the payload from event.data.
ws.addEventListener("open", () => {
	console.log("Connected to server.");
	// Send the request only once the connection is open.
	ws.send(
		JSON.stringify({
			model_id: "sonic",
			transcript: "Hello, world! I'm generating audio on ",
			voice: { mode: "id", id: "a0e99841-438c-4a64-b679-ae501e7d6091" },
			language: "en",
			context_id: "happy-monkeys-fly",
			output_format: {
				container: "raw",
				encoding: "pcm_s16le",
				sample_rate: 8000,
			},
			add_timestamps: true,
			continue: true,
		}),
	);
});

ws.addEventListener("message", (event) => {
	console.log(event.data);
});
```

### ElevenLabs

```javascript
const ws = new WebSocket(
	"wss://gateway.ai.cloudflare.com/v1/<account_id>/<gateway>/elevenlabs?agent_id=<elevenlabs_agent_id>",
	[
		"xi-api-key.<elevenlabs_api_key>",
		"cf-aig-authorization.<cloudflare_token>",
	],
);

// Browser WebSocket API: listen with addEventListener and read the payload from event.data.
ws.addEventListener("open", () => {
	console.log("Connected to server.");
	// Send the request only once the connection is open.
	ws.send(
		JSON.stringify({
			text: "This is a sample text ",
			voice_settings: { stability: 0.8, similarity_boost: 0.8 },
			generation_config: { chunk_length_schedule: [120, 160, 250, 290] },
		}),
	);
});

ws.addEventListener("message", (event) => {
	console.log(event.data);
});
```

---

# Workers Logpush

URL: https://developers.cloudflare.com/ai-gateway/observability/logging/logpush/

import { Render, Tabs, TabItem } from "~/components";

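AI Gateway allows you to securely export logs to an external storage location, where you can decrypt and process them.
You can turn Workers Logpush on and off in the [Cloudflare dashboard](https://dash.cloudflare.com) settings. This product is available on the Workers Paid plan. For pricing information, refer to [Pricing](/ai-gateway/reference/pricing).

This guide explains how to set up Workers Logpush for AI Gateway, generate an RSA key pair for encryption, and decrypt the logs once they are received.

You can store up to 10 million logs per gateway. If your limit is reached, new logs will stop being saved and will not be exported through Workers Logpush. To continue saving and exporting logs, you must delete older logs to free up space for new logs. Workers Logpush has a limit of 4 jobs and a maximum request size of 1 MB per log.

:::note[Note]

To export logs using Workers Logpush, you must have logs turned on for the gateway.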

:::

<Render file="limits-increase" product="ai-gateway" />

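## How logs are encrypted

We employ a hybrid encryption model for efficiency and security. First, an AES key is generated for each log. This AES key encrypts the bulk of your data; AES is chosen for its speed and security when handling large datasets.

To share this AES key securely, we use RSA encryption. The AES key, although lightweight, needs to be transmitted securely to the recipient, so we encrypt it with the recipient's RSA public key. This step leverages RSA's strength in secure key distribution, ensuring that only someone with the corresponding RSA private key can decrypt and use the AES key.

Once encrypted, the AES-encrypted data and the RSA-encrypted AES key are sent together. Upon arrival, the recipient's system uses the RSA private key to decrypt the AES key. With access to the AES key, decrypting the main data payload is straightforward.

This method combines the best of both worlds: the efficiency of AES for data encryption with the secure key exchange capabilities of RSA, ensuring that data integrity, confidentiality, and performance are all maintained throughout the data lifecycle.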
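
The hybrid scheme described above can be illustrated with Node.js's built-in `crypto` module. This is a self-contained sketch of the general technique (AES-256-GCM for the payload, RSA-OAEP with SHA-256 to wrap the AES key), not AI Gateway's actual implementation or wire format. The function names and the envelope shape are hypothetical, and a 2048-bit key is used only to keep the demo fast.

```javascript
const crypto = require("node:crypto");

// Recipient's RSA key pair (2048 bits only to keep this demo fast).
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
	modulusLength: 2048,
});

function hybridEncrypt(plaintext, rsaPublicKey) {
	const aesKey = crypto.randomBytes(32); // fresh AES-256 key for this payload
	const iv = crypto.randomBytes(12); // 96-bit nonce, the usual size for GCM
	const cipher = crypto.createCipheriv("aes-256-gcm", aesKey, iv);
	const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
	const tag = cipher.getAuthTag();
	// Wrap the AES key with RSA-OAEP so only the private key holder can recover it.
	const key = crypto.publicEncrypt({ key: rsaPublicKey, oaepHash: "sha256" }, aesKey);
	return { key, iv, data, tag };
}

function hybridDecrypt({ key, iv, data, tag }, rsaPrivateKey) {
	// Unwrap the AES key with the RSA private key, then decrypt the payload.
	const aesKey = crypto.privateDecrypt({ key: rsaPrivateKey, oaepHash: "sha256" }, key);
	const decipher = crypto.createDecipheriv("aes-256-gcm", aesKey, iv);
	decipher.setAuthTag(tag);
	return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

const envelope = hybridEncrypt('{"prompt":"What is Cloudflare?"}', publicKey);
console.log(hybridDecrypt(envelope, privateKey));
```

Only the small AES key goes through the slow RSA operation; the payload itself is handled by AES, which is why the scheme scales to large log bodies.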

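## Setting up Workers Logpush

To configure Workers Logpush for AI Gateway, follow these steps:

## 1. Generate an RSA key pair locally

You need to generate a key pair to encrypt and decrypt the logs. The script below outputs your RSA private and public keys. Keep the private key secure, as it is used to decrypt the logs. The following examples generate the keys with Node.js or OpenSSL.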

<Tabs syncKey="JSPlusSSL"> <TabItem label="JavaScript">

```js title="JavaScript"
const crypto = require("crypto");

const { privateKey, publicKey } = crypto.generateKeyPairSync("rsa", {
	modulusLength: 4096,
	publicKeyEncoding: {
		type: "spki",
		format: "pem",
	},
	privateKeyEncoding: {
		type: "pkcs8",
		format: "pem",
	},
});

console.log(publicKey);
console.log(privateKey);
```

Run the script by executing the command below in your terminal. Replace `file name` with the name of your JavaScript file.

```bash
node {file name}
```

</TabItem> <TabItem label="OpenSSL">

1. Generate the private key:
   Use the following command to generate an RSA private key:

   ```bash
   openssl genpkey -algorithm RSA -out private_key.pem -pkeyopt rsa_keygen_bits:4096
   ```

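2. Generate the public key:
   After generating the private key, you can extract the corresponding public key with the following command: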

   ```bash
   openssl rsa -pubout -in private_key.pem -out public_key.pem
   ```

</TabItem> </Tabs>

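## 2. Upload the public key to your gateway settings

Once you have generated the key pair, upload the public key to your AI Gateway settings. This key is used to encrypt your logs. To enable Workers Logpush, you must also have logging enabled for that gateway.

## 3. Set up Logpush

To set up Logpush, refer to the [Logpush Get Started](/logs/get-started/) guide.

## 4. Receive encrypted logs

After you configure Workers Logpush, logs are sent encrypted with the public key you uploaded. To access the data, you need to decrypt it with your private key. The logs are sent to the object storage provider of your choice.

## 5. Decrypt logs

To decrypt the encrypted log bodies and metadata from AI Gateway, you can use the following Node.js script or OpenSSL:

<Tabs syncKey="JSPlusSSL"> <TabItem label="JavaScript">

To decrypt the encrypted log bodies and metadata from AI Gateway, download the log file to a folder. In this example, the file is named `my_log.log.gz`.

Then copy this JavaScript file into the same folder and place your private key in the variable at the top.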

```js title="JavaScript"
const privateKeyStr = `-----BEGIN RSA PRIVATE KEY-----
....
-----END RSA PRIVATE KEY-----`;

const crypto = require("crypto");
const privateKey = crypto.createPrivateKey(privateKeyStr);

const fs = require("fs");
const zlib = require("zlib");
const readline = require("readline");

async function importAESGCMKey(keyBuffer) {
	try {
		// Ensure the actual key length is valid for AES (128, 192, or 256 bits)
		const keyBits = keyBuffer.length * 8;
		if ([128, 192, 256].includes(keyBits)) {
			return await crypto.webcrypto.subtle.importKey(
				"raw",
				keyBuffer,
				{
					name: "AES-GCM",
					length: keyBits,
				},
				true, // Whether the key is extractable (true here, so it can be exported later if needed)
				["encrypt", "decrypt"], // Usable for both encryption and decryption
			);
		} else {
			throw new Error("Invalid AES key length. Must be 128, 192, or 256 bits.");
		}
	} catch (error) {
		console.error("Failed to import key:", error);
		throw error;
	}
}

async function decryptData(encryptedData, aesKey, iv) {
	// Use the same WebCrypto entry point as importAESGCMKey
	const decryptedData = await crypto.webcrypto.subtle.decrypt(
		{ name: "AES-GCM", iv: iv },
		aesKey,
		encryptedData,
	);
	return new TextDecoder().decode(decryptedData);
}

async function decryptBase64(privateKey, data) {
	if (data.key === undefined) {
		return data;
	}

	const aesKeyBuf = crypto.privateDecrypt(
		{
			key: privateKey,
			oaepHash: "SHA256",
		},
		Buffer.from(data.key, "base64"),
	);
	const aesKey = await importAESGCMKey(aesKeyBuf);

	const decryptedData = await decryptData(
		Buffer.from(data.data, "base64"),
		aesKey,
		Buffer.from(data.iv, "base64"),
	);

	return decryptedData.toString();
}

async function run() {
	let lineReader = readline.createInterface({
		input: fs.createReadStream("my_log.log.gz").pipe(zlib.createGunzip()),
	});

	lineReader.on("line", async (line) => {
		line = JSON.parse(line);

		const { Metadata, RequestBody, ResponseBody, ...remaining } = line;

		console.log({
			...remaining,
			Metadata: await decryptBase64(privateKey, Metadata),
			RequestBody: await decryptBase64(privateKey, RequestBody),
			ResponseBody: await decryptBase64(privateKey, ResponseBody),
		});
		console.log("--");
	});
}

run();
```

Run the script by executing the command below in your terminal. Replace `file name` with the name of your JavaScript file.

```bash
node {file name}
```

The script reads the encrypted log file (`my_log.log.gz`), decrypts the metadata, request body, and response body, and prints the decrypted data.
Ensure you replace the value of the `privateKeyStr` variable with the RSA private key that you generated in step 1.

</TabItem> <TabItem label="OpenSSL">

1. Decrypt the encrypted log file using the private key.

Assuming that the logs were encrypted with the public key (for example `public_key.pem`), you can use the private key (`private_key.pem`) to decrypt the log file.

For example, if the encrypted logs are in a file named `encrypted_logs.bin`, you can decrypt it like this:

```bash
openssl rsautl -decrypt -inkey private_key.pem -in encrypted_logs.bin -out decrypted_logs.txt
```

- `-decrypt` tells OpenSSL that we want to decrypt the file.
- `-inkey private_key.pem` specifies the private key that will be used to decrypt the logs.
- `-in encrypted_logs.bin` is the encrypted log file.
- `-out decrypted_logs.txt` is the file where the decrypted logs are saved.

2. View the decrypted logs
   Once decrypted, you can view the logs by simply running:

```bash
cat decrypted_logs.txt
```

This command will output the decrypted logs to the terminal.

</TabItem> </Tabs>

---

# Logging

URL: https://developers.cloudflare.com/ai-gateway/observability/logging/

import { Render } from "~/components";

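Logging is a fundamental building block of application development. Logs provide insights during the early stages of development and are often critical to understanding issues that occur in production.

Your AI Gateway dashboard shows logs of individual requests, including the user prompt, model response, provider, timestamp, request status, token usage, cost, and duration. These logs persist, giving you the flexibility to store them for your preferred duration and do more with valuable request data.

By default, each gateway can store up to 10 million logs. You can customize this limit per gateway in your gateway settings to align with your specific requirements. If your storage limit is reached, new logs will stop being saved. To continue saving logs, you must delete older logs to free up space for new logs.
To learn more about your plan limits, refer to [Limits](/ai-gateway/reference/limits/).

We recommend using an authenticated gateway when storing logs to prevent unauthorized access and to protect against invalid requests that can inflate log storage usage and make it harder to find the data you need. Learn more about setting up an [authenticated gateway](/ai-gateway/configuration/authentication/).

## Default configuration

Logs, which include metrics as well as request and response data, are enabled by default for each gateway. This logging behavior is applied uniformly to all requests in the gateway. If you are concerned about privacy or compliance and want to turn log collection off, you can go to settings and opt out of logs. If you need to modify the log settings for specific requests, you can override this setting on a per-request basis.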

<Render file="logging" />

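## Per-request logging

To override the default logging behavior set in the settings tab, you can define headers on a per-request basis.

### Collect logs (`cf-aig-collect-log`)

The `cf-aig-collect-log` header allows you to bypass the default log setting for the gateway. If the gateway is configured to save logs, the header excludes the log for that specific request. Conversely, if logging is disabled at the gateway level, this header saves the log for that request.

In the example below, we use `cf-aig-collect-log` to bypass the default setting and avoid saving the log.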

```bash
curl https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions \
  --header "Authorization: Bearer $TOKEN" \
  --header 'Content-Type: application/json' \
  --header 'cf-aig-collect-log: false' \
  --data ' {
        "model": "gpt-4o-mini",
        "messages": [
          {
            "role": "user",
            "content": "What is the email address and phone number of user123?"
          }
        ]
      }
'
```
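
The same override can be applied from JavaScript by adding the header to an existing header set. This is a small sketch: `withLogOverride` is a hypothetical helper, and it assumes the header accepts `true` as well as the `false` value shown in the cURL example.

```javascript
// Hypothetical helper: returns a copy of the headers with a per-request logging override added.
function withLogOverride(headers, collectLog) {
	// Header values are strings, so the boolean is serialized as "true" or "false".
	return { ...headers, "cf-aig-collect-log": String(collectLog) };
}

const headers = withLogOverride(
	{
		Authorization: "Bearer PROVIDER_TOKEN", // placeholder token
		"Content-Type": "application/json",
	},
	false, // skip logging for this request only
);
console.log(headers);
```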

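## Managing log storage

To manage your log storage effectively, you can:

- Set storage limits: Configure a limit on the number of logs stored per gateway in your gateway settings to ensure you only pay for what you need.
- Enable automatic log deletion: Activate the automatic log deletion feature in your gateway settings to automatically delete the oldest logs once the log limit you have set, or the default storage limit of 10 million logs, is reached. This ensures new logs are always saved without manual intervention.

## How to delete logs

To manage your log storage effectively and ensure continuous logging, you can delete logs using the following methods:

### Automatic log deletion

To maintain continuous logging within your gateway's storage constraints, enable automatic log deletion in your gateway settings. This feature automatically deletes the oldest logs once the log limit you have set, or the default storage limit of 10 million logs, is reached, so that new logs are saved without manual intervention.

### Manual deletion

To manually delete logs through the dashboard, navigate to the Logs tab in the dashboard. Use the available filters, such as status, cache, provider, cost, or any other option in the dropdown, to refine the logs you want to delete. Once filtered, select Delete logs to complete the action.

See the full list of available filters and their descriptions below:

| Filter category | Filter options                                               | Filter by description                     |
| --------------- | ------------------------------------------------------------ | ----------------------------------------- |
| Status          | Error, status                                                | Error type or status.                     |
| Cache           | Cached, not cached                                           | Whether the request was cached.           |
| Provider        | Specific providers                                           | The selected AI provider.                 |
| AI Models       | Specific models                                              | The selected AI model.                    |
| Cost            | Less than, greater than                                      | Cost, specifying a threshold.             |
| Request type    | Universal, Workers AI Binding, WebSockets                    | The type of the request.                  |
| Tokens          | Total tokens, Tokens In, Tokens Out                          | Token count (less than or greater than).  |
| Duration        | Less than, greater than                                      | Request duration.                         |
| Feedback        | Equals, does not equal (thumbs up, thumbs down, no feedback) | Feedback type.                            |
| Metadata Key    | Equals, does not equal                                       | Specific metadata keys.                   |
| Metadata Value  | Equals, does not equal                                       | Specific metadata values.                 |
| Log ID          | Equals, does not equal                                       | A specific Log ID.                        |
| Event ID        | Equals, does not equal                                       | A specific Event ID.                      |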

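### API deletion

You can programmatically delete logs using the AI Gateway API. For more comprehensive information on the `DELETE` logs endpoint, check out the [Cloudflare API documentation](/api/resources/ai_gateway/subresources/logs/methods/delete/).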

---