
How client-side rendering works

warning

Very experimental feature: expect bugs and breaking changes at any time. Track progress on GitHub and discuss in the #web-renderer channel on Discord.

The biggest challenge of client-side rendering is that it is not possible to capture the browser viewport.
Only certain HTML elements such as <canvas>, <img>, <video> or <svg> can be captured natively.

Unlike server-side rendering, where a pixel-perfect screenshot is made, client-side rendering has Remotion place all elements on a canvas based on how it believes they are positioned and appear in the DOM.

For this, Remotion has developed a sophisticated algorithm for calculating the placement of the elements on the canvas.
Of course, we cannot support all web features, so only a specific subset of elements and styles is supported.

Rendering process

Initialization

First, the component is mounted in the DOM in a place where it is not visible to the user.
Simultaneously, an empty canvas is initialized.

Frame capture process

For each frame that needs to be rendered, the renderer uses document.createTreeWalker() to find all elements and text nodes in the DOM. Nodes that have display: none are skipped, along with their children.
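The traversal rule can be illustrated with a small model. This is a minimal sketch, not Remotion's code: `SketchNode` and `collectNodes` are hypothetical names, and a plain object tree stands in for the DOM so the example runs outside a browser. In a browser, a TreeWalker whose filter returns NodeFilter.FILTER_REJECT for a node skips that node together with its descendants, which is the behavior modeled here.

```typescript
// Hypothetical stand-in for DOM nodes (an assumption, not a Remotion type).
type SketchNode =
  | { kind: "element"; tag: string; display: string; children: SketchNode[] }
  | { kind: "text"; text: string };

// Returns elements and text nodes in document order,
// skipping any element with display: none and its whole subtree.
function collectNodes(root: SketchNode): SketchNode[] {
  const result: SketchNode[] = [];
  const visit = (node: SketchNode): void => {
    if (node.kind === "element" && node.display === "none") {
      return; // skip the node and all of its children
    }
    result.push(node);
    if (node.kind === "element") {
      node.children.forEach(visit);
    }
  };
  visit(root);
  return result;
}

// Sample tree: the hidden <span> and its text child are never visited.
const sampleTree: SketchNode = {
  kind: "element",
  tag: "div",
  display: "block",
  children: [
    { kind: "text", text: "Hello" },
    {
      kind: "element",
      tag: "span",
      display: "none",
      children: [{ kind: "text", text: "hidden" }],
    },
    { kind: "element", tag: "img", display: "inline", children: [] },
  ],
};

const visited = collectNodes(sampleTree);
```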

For each capturable element, the renderer:

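  1. Walks up the DOM tree and resets all transform CSS properties to none.
  2. Gets the bounding box using .getBoundingClientRect(), as well as the bounding boxes of the parent elements.
  3. Adds up the transforms and positions to determine the original position of the element in the DOM.
  4. Gets the pixels of the element. For <svg>, <canvas> and <img> elements, these can be captured. For text nodes, the layout is recreated manually.
  5. Draws them to the canvas based on the calculated placement.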
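One way to picture "adding up the transforms" is to compose 2D affine matrices along the ancestor chain. The sketch below is an illustration under assumptions, not Remotion's algorithm: it handles only 2D transforms, and `AncestorTransform`, `aroundOrigin` and `accumulate` are hypothetical names. The origins are assumed to be measured in page coordinates after the transforms were reset to none.

```typescript
// 2D affine matrix in the argument order of
// CanvasRenderingContext2D.setTransform(a, b, c, d, e, f).
type Matrix = [number, number, number, number, number, number];

const translate = (x: number, y: number): Matrix => [1, 0, 0, 1, x, y];
const scale = (s: number): Matrix => [s, 0, 0, s, 0, 0];

// Returns m · n, the matrix that applies n first and then m.
function multiply(m: Matrix, n: Matrix): Matrix {
  return [
    m[0] * n[0] + m[2] * n[1],
    m[1] * n[0] + m[3] * n[1],
    m[0] * n[2] + m[2] * n[3],
    m[1] * n[2] + m[3] * n[3],
    m[0] * n[4] + m[2] * n[5] + m[4],
    m[1] * n[4] + m[3] * n[5] + m[5],
  ];
}

// Hypothetical record of one ancestor's transform and its transform-origin.
interface AncestorTransform {
  matrix: Matrix;
  originX: number;
  originY: number;
}

// Applies the transform around its origin instead of around (0, 0).
function aroundOrigin(t: AncestorTransform): Matrix {
  return multiply(
    translate(t.originX, t.originY),
    multiply(t.matrix, translate(-t.originX, -t.originY)),
  );
}

// Chain is ordered from the outermost ancestor to the element itself.
function accumulate(chain: AncestorTransform[]): Matrix {
  return chain.reduce<Matrix>(
    (acc, t) => multiply(acc, aroundOrigin(t)),
    [1, 0, 0, 1, 0, 0],
  );
}

function applyToPoint(m: Matrix, x: number, y: number): [number, number] {
  return [m[0] * x + m[2] * y + m[4], m[1] * x + m[3] * y + m[5]];
}
```

The accumulated matrix could then be passed to setTransform() before drawing the element's pixels at its untransformed bounding box.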

Audio capture

Audio from mounted <Audio> and <Video> elements is captured and mixed together, and added to the audio track of the video.
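Mixing, in its simplest form, means summing the samples of each source and keeping the result in the valid range. The following is a generic sketch, not Remotion's mixer: `mixTracks` is a hypothetical helper, and it assumes mono tracks at the same sample rate with samples in the range -1 to 1.

```typescript
// Sums tracks sample by sample; shorter tracks are treated as silent
// past their end. The result is clamped to [-1, 1] to avoid overflow.
function mixTracks(tracks: Float32Array[]): Float32Array {
  const length = tracks.reduce((max, t) => Math.max(max, t.length), 0);
  const mixed = new Float32Array(length);
  for (const track of tracks) {
    for (let i = 0; i < track.length; i++) {
      mixed[i] += track[i];
    }
  }
  for (let i = 0; i < mixed.length; i++) {
    mixed[i] = Math.max(-1, Math.min(1, mixed[i]));
  }
  return mixed;
}

const mixedSample = mixTracks([
  new Float32Array([0.5, 0.8, -0.25]),
  new Float32Array([0.25, 0.5]),
]);
```

Hard clamping is the simplest choice; a production mixer would more likely scale or limit the signal to avoid audible clipping.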

Encoding

Mediabunny is used to encode the frames and the processed audio into a video file.

Capturing pixels

For <svg>, <canvas> and <img> elements, the pixels can be captured natively using widely documented techniques.

For other types of elements, only a subset of properties is supported, such as background, border and border-radius. These styles are drawn to the canvas manually with the Canvas 2D API.
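To give a sense of what drawing these styles manually involves, here is a minimal sketch of a rounded box with a background and a border, built from arcTo() calls. It is an illustration under assumptions, not Remotion's drawing code: `Ctx2DLike`, `BoxStyle` and `drawBox` are hypothetical names, a single uniform radius is assumed, and the stroke is centered on the box edge, whereas a CSS border lies inside the border box.

```typescript
// The subset of CanvasRenderingContext2D used here, declared locally so
// the sketch also runs without a browser (a real 2D context fits this shape).
interface Ctx2DLike {
  fillStyle: string | object;
  strokeStyle: string | object;
  lineWidth: number;
  beginPath(): void;
  moveTo(x: number, y: number): void;
  arcTo(x1: number, y1: number, x2: number, y2: number, r: number): void;
  closePath(): void;
  fill(): void;
  stroke(): void;
}

interface Rect { x: number; y: number; width: number; height: number }

interface BoxStyle {
  background: string;
  borderColor: string;
  borderWidth: number;
  borderRadius: number;
}

function drawBox(ctx: Ctx2DLike, rect: Rect, style: BoxStyle): void {
  const { x, y, width: w, height: h } = rect;
  // A radius larger than half the box would make the corners overlap.
  const r = Math.min(style.borderRadius, w / 2, h / 2);

  ctx.beginPath();
  ctx.moveTo(x + r, y);
  ctx.arcTo(x + w, y, x + w, y + h, r); // top-right corner
  ctx.arcTo(x + w, y + h, x, y + h, r); // bottom-right corner
  ctx.arcTo(x, y + h, x, y, r); // bottom-left corner
  ctx.arcTo(x, y, x + w, y, r); // top-left corner
  ctx.closePath();

  ctx.fillStyle = style.background;
  ctx.fill();

  if (style.borderWidth > 0) {
    ctx.strokeStyle = style.borderColor;
    ctx.lineWidth = style.borderWidth;
    ctx.stroke();
  }
}
```

In a browser, the first argument would be the object returned by canvas.getContext("2d").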

Capturing text nodes

For text nodes, more layout calculations need to be made.

Normally, it is not possible to get the bounding box of a text node, but by wrapping a text node in a <span> element, we can call .getBoundingClientRect() on the span to get the bounding box and resolve the transforms as described under "Frame capture process".

Then Intl.Segmenter is used to split the text into words, and each token is again wrapped in a <span>. For each token, .getBoundingClientRect() is called and the token is drawn to the canvas.
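The word-splitting step can be tried on its own, since Intl.Segmenter is also available outside the browser. This is a minimal sketch: `Token` and `tokenize` are hypothetical names, and the "en" default locale is an assumption. Whether Remotion keeps whitespace and punctuation segments as separate tokens is not stated in this document; the sketch keeps every segment so the tokens can be joined back into the original text.

```typescript
// Hypothetical token shape for illustration.
interface Token {
  text: string;
  index: number; // offset of the token in the original string
  isWordLike: boolean; // false for whitespace and punctuation
}

function tokenize(text: string, locale: string = "en"): Token[] {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  return Array.from(segmenter.segment(text), (s) => ({
    text: s.segment,
    index: s.index,
    isWordLike: s.isWordLike ?? false,
  }));
}

const tokens = tokenize("Hello world");
```

In the renderer's setting, each token would be wrapped in its own <span> so that its bounding box can be measured.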

In the end, the DOM is reset to its original state.

Context isolation

Renders happen in the same browser environment as your app. This means CSS and Tailwind variables will automatically work, but you run the risk of conflicts with the host page.

See Limitations for more details to ensure your code works with client-side rendering.

Contributing

If you are interested in improving the web renderer, for example by adding new styles, see Contributing to client-side rendering.

See also