Generate Remotion code using LLMs

This guide shows an example of how to generate Remotion component code from natural language prompts using the Vercel AI SDK.

Installation

npm i --save-exact ai @ai-sdk/openai zod

Basic Example

This simple system prompt relies entirely on the model's existing knowledge of Remotion. That is not ideal, but it is a good starting point.

generate.ts
import {generateText} from 'ai';
import {openai} from '@ai-sdk/openai';

const systemPrompt = `
You are a Remotion component generator.
Generate a single React component that uses Remotion.

Rules:
- Export the component as a named export called "MyComposition"
- Use useCurrentFrame() and useVideoConfig() from "remotion"
- Use spring() for animations
- Only output the code, no markdown or explanations
`.trim();

const {text: code, usage} = await generateText({
  model: openai('gpt-5.2'),
  system: systemPrompt,
  prompt: 'Create a countdown from 5 to 1 with spring animations',
});

console.log(code); // "import {useCurrentFrame, ...} export const MyComposition ..."
console.log(usage); // { inputTokens: 89, outputTokens: 205, totalTokens: 294 }

The result includes usage metadata for tracking token consumption.
The text property contains the generated code as a string. Formatted, it looks like this:

import {useCurrentFrame, useVideoConfig, spring, AbsoluteFill} from 'remotion';

export const MyComposition: React.FC = () => {
  const frame = useCurrentFrame();
  const {fps} = useVideoConfig();
  // ... rest of the component
};

Note that the raw output often includes Markdown code fences (like ```jsx), which require additional prompting or sanitization to remove.
For an example implementation of postprocessing, see the Prompt to Motion Graphics template.
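As a rough illustration of the sanitization step mentioned above, the following sketch strips a single surrounding Markdown fence from a model response. The helper name stripCodeFences is hypothetical and is not part of Remotion, the AI SDK, or the template. It assumes the whole response is wrapped in at most one fence.

```typescript
// Hypothetical helper (an assumption, not an API of Remotion or the AI SDK).
// Three backticks, built programmatically so this example stays easy to embed.
const FENCE = '`'.repeat(3);

// Opening fence with optional language tag, the body, then a closing fence.
const FENCED_BLOCK = new RegExp(
  `^${FENCE}[\\w-]*[ \\t]*\\r?\\n([\\s\\S]*?)\\r?\\n?${FENCE}$`,
);

const stripCodeFences = (raw: string): string => {
  const trimmed = raw.trim();
  const match = trimmed.match(FENCED_BLOCK);
  // If the response is not fenced, return it unchanged (apart from trimming).
  return match ? match[1].trim() : trimmed;
};
```

You could call stripCodeFences(code) on the text returned by generateText() before compiling it. Responses that mix prose and several fenced blocks would need a more careful parser.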

Structured Output with Zod

Most major LLM providers now offer a structured output mode for their endpoints, so you can define exactly which properties you expect.

For more control over the output, use generateText() with the output property to get structured output with validation.

Often this is enough for the model to understand that you expect stringified code in the code property, but you can still add a note about it in the system prompt.

generate-structured.ts
import {generateText, Output} from 'ai';
import {openai} from '@ai-sdk/openai';
import {z} from 'zod';

const systemPrompt = `
You are a Remotion component generator.
Generate a React component that uses Remotion.
For the code property, ALWAYS directly output the code as string without any markdown tags.

Rules:
- The component should be a named export called "MyComposition"
- Use useCurrentFrame() and useVideoConfig() from "remotion"
- Use spring() for smooth animations
- Keep it self-contained with no external dependencies
`.trim();

const {output} = await generateText({
  model: openai('gpt-5.2'),
  system: systemPrompt,
  prompt: 'Create a text animation that types out "Hello World" letter by letter',
  maxRetries: 3,
  output: Output.object({
    schema: z.object({
      code: z.string().describe('The complete React component code'),
      title: z.string().describe('A short title for this composition'),
      durationInFrames: z.number().describe('Recommended duration in frames'),
      fps: z.number().min(1).max(120).describe('Recommended frames per second'),
    }),
  }),
});

console.log(output.code);
console.log(`Recommended: ${output.fps}fps`);

The structured output guarantees you receive valid JSON with all required fields, making it easier to configure the Player or rendering pipeline.
If there is a schema mismatch, the AI SDK will automatically retry up to maxRetries times (default is 2).
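To illustrate how the structured fields might feed into a Player or render configuration, here is a minimal sketch. The type names and the toPlayerConfig helper are hypothetical. The sketch assumes you want to normalize the model's suggestions before use, since the schema above types durationInFrames only as a number, while Remotion expects a positive integer.

```typescript
// Shape of the structured output defined by the Zod schema above.
type GeneratedComposition = {
  code: string;
  title: string;
  durationInFrames: number;
  fps: number;
};

// Hypothetical subset of the values a Player or render call needs.
type PlayerConfig = {
  durationInFrames: number;
  fps: number;
};

// Hypothetical helper: normalize the model's suggestions defensively.
const toPlayerConfig = (output: GeneratedComposition): PlayerConfig => {
  // Round to a whole number of frames and never go below 1.
  const durationInFrames = Math.max(1, Math.round(output.durationInFrames));
  // Keep fps inside the same 1..120 range the schema asks for.
  const fps = Math.min(120, Math.max(1, output.fps));
  return {durationInFrames, fps};
};
```

The returned values could then be passed on as the durationInFrames and fps settings of your preview or render. Whether clamping or rejecting out-of-range values is the better choice depends on your pipeline.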

Finding the right system prompt

Finding the perfect system prompt is complex.
Depending on what you want to build, you could optimize for specific animation types or even your corporate branding.

However, a great starting point is Remotion's System Prompt, which teaches a general-purpose LLM Remotion's APIs and best practices.
From there you can iterate: generate a few components and tune the prompt to your needs.

Be careful of context rot: as the context grows, output quality and instruction following degrade.

Skills

Instead of one large system prompt, you can use Skills: modular knowledge units that get injected based on the user's request.

This approach keeps the base prompt lightweight and improves output quality for specialized domains (for example: charts, typography, transitions, timings, 3D).

Here's a minimal example of how skill detection works:

skill-detection.ts
import {generateText, Output} from 'ai';
import {openai} from '@ai-sdk/openai';
import {z} from 'zod';

const SKILL_NAMES = ['charts', 'typography', 'transitions', 'spring-physics', '3d'] as const;

// Step 1: Detect which skills are needed
const {output: detectedSkills} = await generateText({
  model: openai('gpt-5-mini'), // Use a fast, cheap model for classification
  prompt: 'Create a bouncy bar chart showing quarterly sales data',
  output: Output.object({
    schema: z.object({
      skills: z.array(z.enum(SKILL_NAMES)).describe('Matching skill categories'),
    }),
  }),
});

console.log(detectedSkills.skills); // ["charts", "spring-physics"]

// Step 2: Load only the relevant skill content
const skillContent = detectedSkills.skills
  .map((skill) => loadSkillMarkdown(skill)) // Your function to load skill files
  .join('\n\n');

// Step 3: Generate code with focused context
// baseSystemPrompt is your own lightweight base prompt, defined elsewhere
const {output} = await generateText({
  model: openai('gpt-5.2'),
  system: baseSystemPrompt + '\n\n' + skillContent,
  prompt: 'Create a bouncy bar chart showing quarterly sales data',
  output: Output.object({
    schema: z.object({
      code: z.string().describe('The complete React component code'),
      // metadata, your other properties, ...
    }),
  }),
});

See the Prompt to Motion Graphics template for a complete example implementation.
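The skill-detection example leaves loadSkillMarkdown() up to you. One possible shape for it, together with the prompt assembly step, is sketched below. Everything here is an assumption for illustration: the in-memory SKILLS map, its placeholder contents, and the buildSystemPrompt helper are hypothetical, and a real project would more likely read skill files from disk.

```typescript
const SKILL_NAMES = ['charts', 'typography', 'transitions', 'spring-physics', '3d'] as const;
type SkillName = (typeof SKILL_NAMES)[number];

// Hypothetical in-memory skill store with placeholder content.
const SKILLS: Partial<Record<SkillName, string>> = {
  charts: '## Charts\nScale bar heights from the data.',
  'spring-physics': '## Spring physics\nDrive motion with spring().',
};

// Returns the Markdown for a skill, or an empty string if none is registered.
const loadSkillMarkdown = (skill: SkillName): string => SKILLS[skill] ?? '';

// Joins the base prompt with the de-duplicated, non-empty skill sections.
const buildSystemPrompt = (basePrompt: string, skills: readonly SkillName[]): string => {
  const sections = Array.from(new Set(skills))
    .map(loadSkillMarkdown)
    .filter((section) => section.length > 0);
  return [basePrompt, ...sections].join('\n\n');
};
```

Skipping empty sections and duplicates keeps the final prompt small, which matters given the context rot concern raised earlier.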

Next Steps

After generating the code, you need to compile and render it.
See here for how to turn this code string into a live preview in the browser.

See Also