Appendix E.5: Unified API Design for App and Web Content

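1. Why a Unified API Design Is Needed

In the era of dual-engine optimization, a split between App and Web content is a common problem. Generative search engines (such as Google SGE, Perplexity, and Doubao) and traditional search engines both need to crawl and understand your content. If the App and the Web use different data structures, API endpoints, or presentation formats, the consequences are:

  • Inconsistent crawling: AI crawlers may only reach the Web version, so high-quality content inside the App never gets indexed
  • Duplicate and conflicting content: the same content appears in different formats on Web and App, which confuses search engines
  • Doubled maintenance cost: two content production and publishing pipelines have to be maintained
  • Weaker GEO results: generative engines cannot extract structured answers from the App

The goal of a unified API design is one data source with multiple presentations, serving the Web, the App, and generative engines at the same time.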

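2. Core Principles of a Unified API

2.1 A Unified Data Layer

  • Single source of truth: all content (text, images, video, structured data) is stored in one data layer
  • Version control: content changes are managed through API versioning, so Web and App always read the same version
  • Multi-format output: the same API can return different formats (JSON, HTML, XML) to suit different clients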
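
The single-source idea can be sketched as follows. This is a minimal illustration, not the book's implementation: the `RECORD` dictionary and the `render` helper are hypothetical names, and the HTML and XML shapes are assumptions chosen only to show one stored record feeding several output formats.

```python
import html
import json

# Hypothetical single source of truth: one record per content ID.
RECORD = {
    "id": "content_12345",
    "title": "How to Optimize Core Web Vitals",
    "body": "Core Web Vitals are a set of page-experience metrics.",
    "version": 3,
}


def render(record: dict, fmt: str) -> str:
    """Render the same stored record as JSON, HTML, or XML (illustrative shapes)."""
    if fmt == "json":
        return json.dumps(record, ensure_ascii=False)
    # Escape once so both markup formats are safe to embed.
    cid = html.escape(record["id"])
    title = html.escape(record["title"])
    body = html.escape(record["body"])
    if fmt == "html":
        return f"<article><h1>{title}</h1><p>{body}</p></article>"
    if fmt == "xml":
        return f'<content id="{cid}"><title>{title}</title><body>{body}</body></content>'
    raise ValueError(f"unsupported format: {fmt}")
```

Because every format is derived from the same record, a change to `RECORD` (and its `version`) reaches all clients at once.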

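2.2 Structured Content

  • Semantic fields: every content unit carries a title, summary, body, author, publish time, category, tags, and so on
  • Entity linking: entities in the content (people, places, products, concepts) are tagged and linked to a knowledge graph
  • Multimodal support: text, images, video, tables, and code blocks are independent content blocks that can each be referenced on their own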
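
One way to picture independently referenceable blocks and tagged entities is the sketch below. The class names (`ContentBlock`, `Entity`, `StructuredContent`) and the `content_id#block_id` reference scheme are assumptions made for illustration; the original text does not prescribe them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentBlock:
    block_id: str
    kind: str      # e.g. "text", "image", "video", "table", "code"
    payload: str   # text, or a URL for media blocks


@dataclass
class Entity:
    name: str
    type: str              # e.g. "person", "place", "product", "concept"
    wikidata_id: str = ""  # left empty when no knowledge-graph match is known


@dataclass
class StructuredContent:
    id: str
    title: str
    blocks: List[ContentBlock] = field(default_factory=list)
    entities: List[Entity] = field(default_factory=list)

    def block_ref(self, block_id: str) -> str:
        """Return a fragment-style reference to a single block (hypothetical scheme)."""
        if not any(b.block_id == block_id for b in self.blocks):
            raise KeyError(block_id)
        return f"{self.id}#{block_id}"
```

A client or crawler could then cite `content_12345#b2` rather than the whole article.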

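2.3 Access Control

  • Crawler-friendly: API endpoints are open to AI crawlers, so key content is reachable without logging in
  • Rate limiting: crawlers and regular users get different rate-limit policies
  • Authentication: App users authenticate via OAuth; Web users via Cookie/Session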
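
Differentiated rate limiting is often built on a token bucket. The sketch below assumes two client classes with made-up budgets in `LIMITS`; the numbers and names are hypothetical, and a production gateway would normally keep this state in a shared store rather than in process memory.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical budgets per client class: (burst capacity, refill rate per second).
LIMITS = {"crawler": (5, 1.0), "user": (20, 10.0)}


@dataclass
class TokenBucket:
    capacity: float
    refill_per_sec: float
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Refill according to elapsed time, then spend one token if available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def bucket_for(client_class: str) -> TokenBucket:
    capacity, rate = LIMITS[client_class]
    return TokenBucket(capacity, rate)
```

The optional `now` argument exists only so the behavior can be tested with a fixed clock.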

3. Unified API Architecture Design

3.1 Overall Architecture

┌──────────────────────────────────────────────────────────────────────┐
│ Client Layer                                                         │
│   Web App | iOS App | Android | AI Crawler                           │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ API Gateway Layer                                                    │
│   Routing & Load Balancing | Authentication | Rate Limiting          │
│   Request Transformation | Caching Strategy | Logging & Monitoring   │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ Business Logic Layer                                                 │
│   Content Service | User Service | Search Service                    │
│   Recommendation Service | Analytics Service | Notification Service  │
└──────────────────────────────────┬───────────────────────────────────┘
                                   │
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│ Data Layer                                                           │
│   Relational DB | Document DB | Cache | Search Engine                │
│   Object Storage | Message Queue | Data Warehouse | Knowledge Graph  │
└──────────────────────────────────────────────────────────────────────┘

3.2 API Endpoint Design

3.2.1 Content API

# Content API endpoints
openapi: 3.0.0
info:
  title: Unified Content API
  version: 1.0.0
paths:
  /api/v1/content/{id}:
    get:
      summary: Get content details
      parameters:
        - name: id
          in: path
          required: true
          schema:
            type: string
        - name: format
          in: query
          schema:
            type: string
            enum: [json, html, markdown, structured]
            default: json  # default belongs inside the schema object
        - name: client
          in: query
          schema:
            type: string
            enum: [web, ios, android, crawler]
            default: web
      responses:
        '200':
          description: Content returned successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ContentResponse'
  
  /api/v1/content/list:
    get:
      summary: Get content list
      parameters:
        - name: category
          in: query
          schema:
            type: string
        - name: tags
          in: query
          schema:
            type: array
            items:
              type: string
        - name: page
          in: query
          schema:
            type: integer
            default: 1
        - name: limit
          in: query
          schema:
            type: integer
            default: 20
      responses:
        '200':
          description: Content list returned successfully

components:
  schemas:
    ContentResponse:
      type: object
      properties:
        id:
          type: string
        title:
          type: string
        summary:
          type: string
        body:
          type: string
        structured_data:
          type: object
          properties:
            schema_type:
              type: string
              example: Article
            json_ld:
              type: object
        author:
          type: object
          properties:
            name:
              type: string
            url:
              type: string
        publish_date:
          type: string
          format: date-time
        last_modified:
          type: string
          format: date-time
        category:
          type: string
        tags:
          type: array
          items:
            type: string
        entities:
          type: array
          items:
            type: object
            properties:
              name:
                type: string
              type:
                type: string
                enum: [person, place, product, concept, organization]
              wikidata_id:
                type: string
        media:
          type: array
          items:
            type: object
            properties:
              type:
                type: string
                enum: [image, video, audio, document]
              url:
                type: string
              alt_text:
                type: string
        related_content:
          type: array
          items:
            type: object
            properties:
              id:
                type: string
              title:
                type: string
              url:
                type: string
        seo_metadata:
          type: object
          properties:
            canonical_url:
              type: string
            meta_description:
              type: string
            open_graph:
              type: object
            twitter_card:
              type: object

3.2.2 Search API

# Search API endpoints
paths:
  /api/v1/search:
    get:
      summary: Unified search
      parameters:
        - name: q
          in: query
          required: true
          schema:
            type: string
        - name: type
          in: query
          schema:
            type: string
            enum: [all, article, product, video, question]
            default: all  # default belongs inside the schema object
        - name: source
          in: query
          schema:
            type: string
            enum: [web, app, both]
            default: both
        - name: page
          in: query
          schema:
            type: integer
            default: 1
      responses:
        '200':
          description: Search results
          content:
            application/json:
              schema:
                type: object
                properties:
                  total_results:
                    type: integer
                  results:
                    type: array
                    items:
                      $ref: '#/components/schemas/SearchResult'
                  facets:
                    type: object
                  suggestions:
                    type: array
                    items:
                      type: string

components:
  schemas:
    SearchResult:
      type: object
      properties:
        id:
          type: string
        title:
          type: string
        snippet:
          type: string
        url:
          type: string
        app_deep_link:
          type: string
        source:
          type: string
          enum: [web, app]
        score:
          type: number
        type:
          type: string
        thumbnail:
          type: string
        publish_date:
          type: string
          format: date-time

3.3 Data Structure Design

3.3.1 Content Unit Model

# Unified content unit model
from dataclasses import dataclass, field
from typing import List, Optional, Dict, Any
from datetime import datetime

@dataclass
class ContentUnit:
    """Unified content unit."""
    id: str
    title: str
    summary: str
    body: str  # Body in Markdown
    body_html: str  # Body in HTML
    body_structured: Dict[str, Any]  # Structured body (for AI parsing)
    
    # Metadata
    author: Dict[str, str]
    publish_date: datetime
    last_modified: datetime
    category: str
    tags: List[str]
    
    # Entities
    entities: List[Dict[str, str]] = field(default_factory=list)
    
    # Media
    media: List[Dict[str, str]] = field(default_factory=list)
    
    # Related content
    related_content: List[Dict[str, str]] = field(default_factory=list)
    
    # SEO metadata
    seo_metadata: Dict[str, Any] = field(default_factory=dict)
    
    # Structured data (JSON-LD)
    json_ld: Dict[str, Any] = field(default_factory=dict)
    
    # Platform-specific data
    platform_data: Dict[str, Any] = field(default_factory=dict)
    # Example: {"web": {"canonical_url": "..."}, "app": {"deep_link": "..."}}
    
    # Version information
    version: int = 1
    status: str = "published"  # draft, published, archived

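3.3.2 Response Format Examples

JSON format (default):

{
  "id": "content_12345",
  "title": "How to Optimize Core Web Vitals",
  "summary": "This article explains how to optimize LCP, INP, and CLS...",
  "body": "## Introduction\nCore Web Vitals are an important Google ranking factor...\n\n### Optimizing LCP\n...",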
  "structured_data": {
    "schema_type": "Article",
    "json_ld": {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "How to Optimize Core Web Vitals",
      "datePublished": "2024-01-15T10:00:00Z",
      "author": {
        "@type": "Person",
        "name": "Zhang San"
      }
    }
  },
  "entities": [
    {"name": "Core Web Vitals", "type": "concept", "wikidata_id": "Q123456"},
    {"name": "LCP", "type": "concept", "wikidata_id": "Q789012"}
  ],
  "platform_data": {
    "web": {
      "canonical_url": "https://example.com/article/core-web-vitals",
      "meta_description": "A complete guide to optimizing Core Web Vitals"
    },
    "app": {
      "deep_link": "myapp://article/core-web-vitals",
      "screen": "ArticleDetail"
    }
  }
}

Structured format (for AI crawlers):

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to Optimize Core Web Vitals",
  "author": {
    "@type": "Person",
    "name": "Zhang San"
  },
  "hasPart": [
    {
      "@type": "WebPageElement",
      "name": "Introduction",
      "text": "Core Web Vitals are an important Google ranking factor..."
    },
    {
      "@type": "WebPageElement",
      "name": "Optimizing LCP",
      "text": "LCP (Largest Contentful Paint)..."
    }
  ],
  "mainEntity": {
    "@type": "Question",
    "name": "How to Optimize Core Web Vitals",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Optimizing Core Web Vitals means focusing on three metrics: LCP, INP, and CLS..."
    }
  }
}
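
The `hasPart` array in the structured format mirrors the headings of the Markdown body, so it can be derived rather than written by hand. The following is a rough sketch under that assumption; `markdown_to_parts` is a hypothetical helper, it only recognizes ATX-style `#` headings, and any text before the first heading is dropped.

```python
import re
from typing import Dict, List


def markdown_to_parts(body: str) -> List[Dict[str, str]]:
    """Split a Markdown body into one WebPageElement per heading."""
    parts: List[Dict[str, str]] = []
    current = None
    for line in body.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # A new heading starts a new section.
            current = {"@type": "WebPageElement", "name": match.group(2).strip(), "text": ""}
            parts.append(current)
        elif current is not None and line.strip():
            # Non-empty lines are appended to the current section's text.
            current["text"] = (current["text"] + " " + line.strip()).strip()
    return parts
```

For a body such as `"## Introduction\nCore Web Vitals matter."`, the function returns a single element whose `name` is `Introduction`.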

4. Implementation Examples

4.1 Python (FastAPI) Implementation

# app.py
from fastapi import FastAPI, Query, HTTPException
from fastapi.responses import JSONResponse, HTMLResponse
from typing import Optional, List
from datetime import datetime
import json

app = FastAPI(title="Unified Content API")

# Mock data store
content_store = {
    "content_12345": {
        "id": "content_12345",
        "title": "How to Optimize Core Web Vitals",
        "summary": "This article explains how to optimize LCP, INP, and CLS...",
        "body": "## Introduction\nCore Web Vitals are an important Google ranking factor...\n\n### Optimizing LCP\n...",
        "body_html": "<h2>Introduction</h2><p>Core Web Vitals are an important Google ranking factor...</p><h3>Optimizing LCP</h3><p>...</p>",
        "author": {"name": "Zhang San", "url": "https://example.com/author/zhangsan"},
        "publish_date": "2024-01-15T10:00:00Z",
        "last_modified": "2024-02-20T14:30:00Z",
        "category": "SEO",
        "tags": ["Core Web Vitals", "LCP", "INP", "CLS"],
        "entities": [
            {"name": "Core Web Vitals", "type": "concept", "wikidata_id": "Q123456"},
            {"name": "LCP", "type": "concept", "wikidata_id": "Q789012"}
        ],
        "media": [
            {"type": "image", "url": "https://example.com/images/cwv-chart.png", "alt_text": "Comparison chart of CWV metrics"}
        ],
        "related_content": [
            {"id": "content_12346", "title": "INP Optimization Best Practices", "url": "https://example.com/article/inp-optimization"}
        ],
        "seo_metadata": {
            "canonical_url": "https://example.com/article/core-web-vitals",
            "meta_description": "A complete guide to optimizing Core Web Vitals",
            "open_graph": {
                "title": "How to Optimize Core Web Vitals",
                "description": "A complete guide to optimizing Core Web Vitals",
                "image": "https://example.com/images/og-cwv.png"
            }
        },
        "json_ld": {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": "How to Optimize Core Web Vitals",
            "datePublished": "2024-01-15T10:00:00Z",
            "author": {"@type": "Person", "name": "Zhang San"}
        },
        "platform_data": {
            "web": {
                "canonical_url": "https://example.com/article/core-web-vitals"
            },
            "app": {
                "deep_link": "myapp://article/core-web-vitals",
                "screen": "ArticleDetail"
            }
        },
        "version": 3,
        "status": "published"
    }
}

# Register the fixed /list path before the parameterized /{content_id} path.
# FastAPI matches routes in declaration order, so the reverse order would
# route /api/v1/content/list to get_content with content_id="list".
@app.get("/api/v1/content/list")
async def list_content(
    category: Optional[str] = None,
    tags: Optional[List[str]] = Query(None),
    page: int = Query(1, ge=1),
    limit: int = Query(20, ge=1, le=100)
):
    """Get the content list."""
    # Filtering
    filtered = list(content_store.values())
    if category:
        filtered = [c for c in filtered if c["category"] == category]
    if tags:
        filtered = [c for c in filtered if any(tag in c["tags"] for tag in tags)]
    
    # Pagination
    start = (page - 1) * limit
    end = start + limit
    results = filtered[start:end]
    
    return JSONResponse(content={
        "total": len(filtered),
        "page": page,
        "limit": limit,
        "results": [
            {
                "id": c["id"],
                "title": c["title"],
                "summary": c["summary"],
                "category": c["category"],
                "tags": c["tags"],
                "publish_date": c["publish_date"],
                "url": c["seo_metadata"]["canonical_url"],
                "app_deep_link": c["platform_data"]["app"]["deep_link"]
            }
            for c in results
        ]
    })

@app.get("/api/v1/content/{content_id}")
async def get_content(
    content_id: str,
    format: str = Query("json", enum=["json", "html", "markdown", "structured"]),
    client: str = Query("web", enum=["web", "ios", "android", "crawler"])
):
    """Get content details."""
    if content_id not in content_store:
        raise HTTPException(status_code=404, detail="Content not found")
    
    content = content_store[content_id]
    
    if format == "json":
        return JSONResponse(content=content)
    
    if format == "html":
        return HTMLResponse(content=content["body_html"])
    
    if format == "markdown":
        return JSONResponse(content={"body": content["body"]})
    
    if format == "structured":
        # Structured data aimed at AI crawlers
        structured_response = {
            "@context": "https://schema.org",
            "@type": "Article",
            "headline": content["title"],
            # Add the schema.org type to the stored author fields
            "author": {"@type": "Person", **content["author"]},
            "datePublished": content["publish_date"],
            # Placeholder section; a real service would derive this from the body
            "hasPart": [
                {
                    "@type": "WebPageElement",
                    "name": "Introduction",
                    "text": "Core Web Vitals are an important Google ranking factor..."
                }
            ],
            "mainEntity": {
                "@type": "Question",
                "name": content["title"],
                "acceptedAnswer": {
                    "@type": "Answer",
                    "text": content["summary"]
                }
            }
        }
        return JSONResponse(content=structured_response)
    
    # Reject anything outside the documented formats instead of returning null
    raise HTTPException(status_code=400, detail="Unsupported format")

@app.get("/api/v1/search")
async def search_content(
    q: str = Query(..., min_length=1),
    type: str = Query("all", enum=["all", "article", "product", "video", "question"]),
    source: str = Query("both", enum=["web", "app", "both"]),
    page: int = Query(1, ge=1)
):
    """Unified search."""
    # Mock search logic
    results = [
        {
            "id": "content_12345",
            "title": "How to Optimize Core Web Vitals",
            "snippet": "Core Web Vitals are an important Google ranking factor...",
            "url": "https://example.com/article/core-web-vitals",
            "app_deep_link": "myapp://article/core-web-vitals",
            "source": "web",
            "score": 0.95,
            "type": "article",
            "thumbnail": "https://example.com/images/cwv-thumb.png",
            "publish_date": "2024-01-15T10:00:00Z"
        }
    ]
    
    return JSONResponse(content={
        "total_results": len(results),
        "results": results,
        "facets": {
            "types": {"article": 1, "video": 0},
            "sources": {"web": 1, "app": 0}
        },
        "suggestions": ["Core Web Vitals optimization", "LCP optimization methods"]
    })

4.2 Node.js (Express) Implementation

// server.js
const express = require('express');
const app = express();

// Mock data
const contentStore = {
  'content_12345': {
    id: 'content_12345',
    title: 'How to Optimize Core Web Vitals',
    summary: 'This article explains how to optimize LCP, INP, and CLS...',
    body: '## Introduction\nCore Web Vitals are an important Google ranking factor...\n\n### Optimizing LCP\n...',
    body_html: '<h2>Introduction</h2><p>Core Web Vitals are an important Google ranking factor...</p><h3>Optimizing LCP</h3><p>...</p>',
    author: { name: 'Zhang San', url: 'https://example.com/author/zhangsan' },
    publish_date: '2024-01-15T10:00:00Z',
    last_modified: '2024-02-20T14:30:00Z',
    category: 'SEO',
    tags: ['Core Web Vitals', 'LCP', 'INP', 'CLS'],
    entities: [
      { name: 'Core Web Vitals', type: 'concept', wikidata_id: 'Q123456' },
      { name: 'LCP', type: 'concept', wikidata_id: 'Q789012' }
    ],
    media: [
      { type: 'image', url: 'https://example.com/images/cwv-chart.png', alt_text: 'Comparison chart of CWV metrics' }
    ],
    related_content: [
      { id: 'content_12346', title: 'INP Optimization Best Practices', url: 'https://example.com/article/inp-optimization' }
    ],
    seo_metadata: {
      canonical_url: 'https://example.com/article/core-web-vitals',
      meta_description: 'A complete guide to optimizing Core Web Vitals',
      open_graph: {
        title: 'How to Optimize Core Web Vitals',
        description: 'A complete guide to optimizing Core Web Vitals',
        image: 'https://example.com/images/og-cwv.png'
      }
    },
    json_ld: {
      '@context': 'https://schema.org',
      '@type': 'Article',
      headline: 'How to Optimize Core Web Vitals',
      datePublished: '2024-01-15T10:00:00Z',
      author: { '@type': 'Person', name: 'Zhang San' }
    },
    platform_data: {
      web: { canonical_url: 'https://example.com/article/core-web-vitals' },
      app: { deep_link: 'myapp://article/core-web-vitals', screen: 'ArticleDetail' }
    },
    version: 3,
    status: 'published'
  }
};

// Content list API
// Registered before '/api/v1/content/:id': Express matches routes in
// registration order, so ':id' would otherwise capture the literal "list".
app.get('/api/v1/content/list', (req, res) => {
  const { category, tags } = req.query;
  // Query-string values arrive as strings; parse and bound them explicitly.
  const page = Math.max(parseInt(req.query.page, 10) || 1, 1);
  const limit = Math.min(Math.max(parseInt(req.query.limit, 10) || 20, 1), 100);
  
  let filtered = Object.values(contentStore);
  
  if (category) {
    filtered = filtered.filter(c => c.category === category);
  }
  if (tags) {
    const tagArray = Array.isArray(tags) ? tags : [tags];
    filtered = filtered.filter(c => 
      tagArray.some(tag => c.tags.includes(tag))
    );
  }
  
  const start = (page - 1) * limit;
  const results = filtered.slice(start, start + limit);
  
  res.json({
    total: filtered.length,
    page,
    limit,
    results: results.map(c => ({
      id: c.id,
      title: c.title,
      summary: c.summary,
      category: c.category,
      tags: c.tags,
      publish_date: c.publish_date,
      url: c.seo_metadata.canonical_url,
      app_deep_link: c.platform_data.app.deep_link
    }))
  });
});

// Content detail API
app.get('/api/v1/content/:id', (req, res) => {
  const { id } = req.params;
  const format = req.query.format || 'json';
  const client = req.query.client || 'web';
  
  const content = contentStore[id];
  if (!content) {
    return res.status(404).json({ error: 'Content not found' });
  }
  
  if (format === 'json') {
    res.json(content);
  } else if (format === 'html') {
    res.send(content.body_html);
  } else if (format === 'markdown') {
    res.json({ body: content.body });
  } else if (format === 'structured') {
    res.json({
      '@context': 'https://schema.org',
      '@type': 'Article',
      headline: content.title,
      // Add the schema.org type to the stored author fields
      author: { '@type': 'Person', ...content.author },
      datePublished: content.publish_date,
      // Placeholder section; a real service would derive this from the body
      hasPart: [
        {
          '@type': 'WebPageElement',
          name: 'Introduction',
          text: 'Core Web Vitals are an important Google ranking factor...'
        }
      ],
      mainEntity: {
        '@type': 'Question',
        name: content.title,
        acceptedAnswer: {
          '@type': 'Answer',
          text: content.summary
        }
      }
    });
  } else {
    // Without this branch an unknown format would leave the request hanging
    res.status(400).json({ error: 'Unsupported format' });
  }
});

// Search API
app.get('/api/v1/search', (req, res) => {
  const { q, type = 'all', source = 'both', page = 1 } = req.query;
  
  // Mock search results
  const results = [
    {
      id: 'content_12345',
      title: 'How to Optimize Core Web Vitals',
      snippet: 'Core Web Vitals are an important Google ranking factor...',
      url: 'https://example.com/article/core-web-vitals',
      app_deep_link: 'myapp://article/core-web-vitals',
      source: 'web',
      score: 0.95,
      type: 'article',
      thumbnail: 'https://example.com/images/cwv-thumb.png',
      publish_date: '2024-01-15T10:00:00Z'
    }
  ];
  
  res.json({
    total_results: results.length,
    results,
    facets: {
      types: { article: 1, video: 0 },
      sources: { web: 1, app: 0 }
    },
    suggestions: ['Core Web Vitals optimization', 'LCP optimization methods']
  });
});

app.listen(3000, () => {
  console.log('Unified Content API running at http://localhost:3000');
});

5. Integrating with Generative Engines

5.1 Optimizing API Responses for AI Crawlers

# Middleware that detects AI crawlers
from fastapi import Request

AI_CRAWLERS = {
    "GPTBot": "OpenAI",
    "GoogleOther": "Google",
    "CCBot": "CommonCrawl",
    "ClaudeBot": "Anthropic",
    "Bytespider": "ByteDance",
    "DeepSeek-Bot": "DeepSeek",
    "Amazonbot": "Amazon"
}

@app.middleware("http")
async def detect_ai_crawler(request: Request, call_next):
    user_agent = request.headers.get("user-agent", "").lower()
    
    # Check whether the request comes from an AI crawler
    is_ai_crawler = any(
        crawler.lower() in user_agent 
        for crawler in AI_CRAWLERS.keys()
    )
    
    if is_ai_crawler:
        # Default AI crawlers to the structured format
        request.state.format = "structured"
        request.state.client = "crawler"
    else:
        request.state.format = request.query_params.get("format", "json")
        request.state.client = request.query_params.get("client", "web")
    
    # Route handlers must read request.state.format / request.state.client
    # for this override to take effect; query parameters alone bypass it.
    response = await call_next(request)
    return response
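
The decision made by the middleware can also be kept in a plain function, which makes it testable without starting the HTTP stack. This is a sketch under assumptions: `resolve_format` is a hypothetical helper, the crawler tokens are copied from the list above, and the user-agent strings in the usage note are simplified samples. User-agent headers can be spoofed, so this is a routing hint, not an access-control check.

```python
from typing import Mapping, Tuple

# Crawler tokens copied from the middleware example above.
AI_CRAWLERS = {
    "GPTBot": "OpenAI",
    "GoogleOther": "Google",
    "CCBot": "CommonCrawl",
    "ClaudeBot": "Anthropic",
    "Bytespider": "ByteDance",
    "DeepSeek-Bot": "DeepSeek",
    "Amazonbot": "Amazon",
}


def resolve_format(user_agent: str, query: Mapping[str, str]) -> Tuple[str, str]:
    """Return (format, client): crawlers get the structured format, others their query values."""
    ua = user_agent.lower()
    if any(token.lower() in ua for token in AI_CRAWLERS):
        return "structured", "crawler"
    return query.get("format", "json"), query.get("client", "web")
```

For example, `resolve_format("Mozilla/5.0 (compatible; GPTBot/1.0)", {"format": "html"})` yields the structured format even though the query asked for HTML.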

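5.2 Deep Link Generation

# Deep link generator
from typing import Dict

class DeepLinkGenerator:
    """Generate unified deep links."""
    
    @staticmethod
    def generate_deep_links(content_id: str, platform_data: Dict) -> Dict:
        """Generate deep links for each platform."""
        return {
            "web": platform_data.get("web", {}).get("canonical_url", f"/content/{content_id}"),
            "ios": f"myapp://content/{content_id}",
            "android": f"myapp://content/{content_id}",
            "universal": f"https://example.com/content/{content_id}"
        }
    
    @staticmethod
    def generate_app_links(content: Dict) -> Dict:
        """Generate Universal Links association data.

        Note: both keys below use the iOS apple-app-site-association shape.
        Android App Links are declared in a separate assetlinks.json file
        with a different structure, which this example does not produce.
        """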
        return {
            "applinks": {
                "apps": [],
                "details": [
                    {
                        "appID": "ABCD1234.com.example.app",
                        "paths": [f"/content/{content['id']}"]
                    }
                ]
            },
            "apple-app-site-association": {
                "applinks": {
                    "apps": [],
                    "details": [
                        {
                            "appID": "ABCD1234.com.example.app",
                            "paths": [f"/content/{content['id']}"]
                        }
                    ]
                }
            }
        }

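6. Deployment and Monitoring

6.1 Docker Compose Configuration

# docker-compose.yml
version: '3.8'

services:
  api:
    build: .
    # Expose the port on the internal network only; nginx is the public entry point.
    # Publishing a fixed host port ("8000:8000") can conflict when Compose
    # starts the 3 replicas below outside Swarm mode.
    expose:
      - "8000"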
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/contentdb
      - REDIS_URL=redis://cache:6379
      - ENVIRONMENT=production
    depends_on:
      - db
      - cache
    volumes:
      - ./logs:/app/logs
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '0.5'
          memory: 512M

  db:
    image: postgres:15
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=contentdb
    volumes:
      - postgres_data:/var/lib/postgresql/data

  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    depends_on:
      - api

volumes:
  postgres_data:
  redis_data:

6.2 Monitoring Metrics

# Example monitoring metrics
from prometheus_client import Counter, Histogram, Gauge

# API request count
api_requests_total = Counter(
    'api_requests_total',
    'Total API requests',
    ['endpoint', 'method', 'status']
)

# Response time
api_response_time = Histogram(
    'api_response_time_seconds',
    'API response time in seconds',
    ['endpoint'],
    buckets=[0.1, 0.25, 0.5, 1, 2.5, 5, 10]
)

# Cache hit ratio
cache_hit_ratio = Gauge(
    'cache_hit_ratio',
    'Cache hit ratio',
    ['cache_type']
)

# Content version distribution
content_version_gauge = Gauge(
    'content_version',
    'Content version distribution',
    ['version']
)

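7. Best Practices and Caveats

7.1 Performance Optimization

  • Caching strategy: cache frequently accessed content in Redis with a sensible TTL
  • CDN acceleration: serve static assets (images, CSS, JS) through a CDN
  • Database indexes: index the fields used in common queries (id, category, tags)
  • Asynchronous processing: handle content updates asynchronously through a message queue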
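
The caching item can be illustrated with a small in-process TTL cache. This is a stand-in sketch, not a Redis client: `TTLCache` and its `clock` parameter are hypothetical, and the hit and miss counters are included only because they are the inputs a cache hit ratio metric would need.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        """Return the cached value, or call loader(key) and cache its result."""
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(key)
        self._items[key] = (now + self._ttl, value)
        return value
```

The hit ratio is then `hits / (hits + misses)`, computed whenever at least one lookup has happened.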

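7.2 Security Considerations

  • API authentication: protect sensitive operations with an API key or OAuth
  • Rate limiting: apply different rate limits to AI crawlers and regular users
  • Data validation: validate and sanitize all input data
  • HTTPS: require HTTPS for all traffic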
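7.3 SEO/GEO Considerations

  • Canonical URLs: make sure Web and App use the same canonical URL
  • Consistent structured data: keep the JSON-LD returned to Web and App identical
  • Deep link verification: check regularly that App deep links still work
  • Crawler-friendly: make sure AI crawlers can reach the key API endpoints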
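
A consistency check for the JSON-LD served to Web and App can be as simple as comparing the two payloads key by key. The sketch below assumes both payloads are already parsed into dictionaries; `jsonld_mismatches` is a hypothetical helper and only inspects top-level keys.

```python
from typing import Any, Dict, List

_MISSING = object()  # distinguishes "key absent" from an explicit null


def jsonld_mismatches(web: Dict[str, Any], app: Dict[str, Any]) -> List[str]:
    """Return the top-level keys whose values differ between the two payloads."""
    keys = sorted(set(web) | set(app))
    # Dict comparison ignores key order, so nested objects compare by content.
    return [k for k in keys if web.get(k, _MISSING) != app.get(k, _MISSING)]
```

An empty list means the two payloads agree; any returned key is a candidate for an alert or a failed release check.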

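7.4 Common Pitfalls

Pitfall                            Solution
---------------------------------  -------------------------------------------------------------------
Web and App content out of sync    Use a single data source and control changes through API versioning
Deep links stop working            Run automated deep-link tests on a regular schedule
API responses too slow             Apply a caching strategy and optimize database queries
Crawlers get throttled             Set a dedicated rate-limit policy for AI crawlers
Incomplete structured data         Check the JSON-LD with a Schema validation tool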
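
Automated deep-link testing can start with a cheap format check before any on-device test runs. The sketch below assumes the `myapp://<screen>/<identifier>` shape used in this appendix's examples; `deep_link_problems` is a hypothetical helper, and passing it does not prove the App actually opens the link.

```python
from typing import List
from urllib.parse import urlparse


def deep_link_problems(link: str, expected_scheme: str = "myapp") -> List[str]:
    """Return a list of format problems; an empty list means the link looks well-formed."""
    parsed = urlparse(link)
    problems: List[str] = []
    if parsed.scheme != expected_scheme:
        problems.append(f"unexpected scheme: {parsed.scheme or '(none)'}")
    if not parsed.netloc:
        problems.append("missing screen segment")
    if not parsed.path.strip("/"):
        problems.append("missing content identifier")
    return problems
```

Running this over every `app_deep_link` returned by the list endpoint gives a quick first signal of malformed links.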

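8. Summary

A unified API design is the key infrastructure that connects Web and App content while serving both traditional and generative search engines. It rests on five practices:

  1. Single data source: all content is stored in the same data layer
  2. Structured output: support JSON, HTML, Markdown, and structured formats
  3. Platform adaptation: return content optimized for the client type
  4. Crawler-friendly: provide a dedicated structured response for AI crawlers
  5. Monitoring and optimization: continuously monitor API performance and content consistency

Together these practices improve content discoverability, help generative engines cite your content accurately, and lower maintenance cost.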

Last Updated: 5/9/26, 5:13 PM