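JavaScript Macros in Bun

Jarred Sumner · May 31, 2023

Two weeks ago, we launched our new JavaScript bundler in Bun v0.6.0. Today we're releasing a new feature that highlights the tight integration between Bun's bundler and runtime: Bun Macros.

Macros are a mechanism for running JavaScript functions at bundle-time. The value returned from these functions is inlined directly into your bundle.

As a toy example, consider this simple function that returns a random number.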

export function random() {
  return Math.random();
}

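In our source code, we can import this function as a macro using import attribute syntax. If you haven't seen this syntax before, it's a Stage 3 TC39 proposal that lets you attach additional metadata to import statements.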

cli.tsx
import { random } from './random.ts' with { type: 'macro' };

console.log(`Your random number is ${random()}`);

Now we'll bundle this file with bun build. The bundled file is printed to stdout.

bun build ./cli.tsx
console.log(`Your random number is ${0.6805550949689833}`);

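As you can see, the source code of the random function appears nowhere in the bundle. Instead, it is executed during bundling, and the function call (random()) is replaced with the result of the function. Since the source code is never included in the bundle, macros can safely perform privileged operations such as reading from a database.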
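
To make the substitution concrete, here is a toy model of what inlining amounts to. It is only a sketch under stated assumptions: inlineMacroCall is a hypothetical helper invented for this illustration, it rewrites source text with plain string replacement, and it only handles calls without arguments. Bun's transpiler works on the AST, not on strings.

```typescript
// Toy model of bundle-time inlining. NOT Bun's implementation:
// `inlineMacroCall` is a hypothetical helper that rewrites source text,
// whereas Bun replaces call expressions in the AST.
type Macro = () => unknown;

function inlineMacroCall(source: string, name: string, macro: Macro): string {
  // Run the macro once, at "build" time, and serialize its result.
  const literal = JSON.stringify(macro());
  // Replace every zero-argument call site with the serialized value.
  return source.split(`${name}()`).join(literal);
}

const source = "console.log(`Your random number is ${random()}`);";
// A fixed value stands in for Math.random() so the result is reproducible.
const bundled = inlineMacroCall(source, "random", () => 0.25);

console.log(bundled);
```

The printed line contains the literal 0.25 where the call used to be, and the body of the macro never appears in it, which mirrors the bun build output above.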

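When to use macros

For small things where you would otherwise write a one-off build script, bundle-time code execution can be easier to maintain. It lives with the rest of your code, it runs with the rest of the build, it is automatically parallelized, and if it fails, the build fails too.

If you find yourself running a lot of code at bundle-time, though, consider running a server instead.

Let's look at some scenarios where macros can be useful.

Embed the latest git commit hash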

in-the-browser.ts
import { getGitCommitHash } from './getGitCommitHash.ts' with { type: 'macro' };

console.log(`The current Git commit hash is ${getGitCommitHash()}`);

getGitCommitHash.ts
export function getGitCommitHash() {
  const {stdout} = Bun.spawnSync({
    cmd: ["git", "rev-parse", "HEAD"],
    stdout: "pipe",
  });

  return stdout.toString();
}

When we build it, the call to getGitCommitHash is replaced with the result of calling the function.

output.js
console.log(`The current Git commit hash is 3ee3259104f`);

CLI
bun build --target=browser ./in-the-browser.ts
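
One practical detail: git rev-parse HEAD prints the full hash followed by a newline, so a macro that returns stdout.toString() unchanged would inline that newline as well. The following is a minimal sketch of a cleanup step the macro could apply before returning. normalizeCommitHash and its length parameter are hypothetical names chosen for this example, not part of Bun, and the sample hash is made up.

```typescript
// Hypothetical helper (not a Bun API): tidy raw `git rev-parse HEAD` output
// before a macro returns it.
function normalizeCommitHash(raw: string, length = 11): string {
  const hash = raw.trim(); // drop the trailing newline that git prints
  if (!/^[0-9a-f]+$/.test(hash) || hash.length < length) {
    // Throwing inside a macro makes the build fail, which is what we want here.
    throw new Error(`Unexpected git output: ${JSON.stringify(raw)}`);
  }
  return hash.slice(0, length); // abbreviate to the first `length` characters
}

// Example input: a made-up 40-character hash plus the newline git appends.
const sampleOutput = "3ee3259104f" + "0".repeat(29) + "\n";
console.log(normalizeCommitHash(sampleOutput));
```

Inside getGitCommitHash, this would amount to returning normalizeCommitHash(stdout.toString()) instead of the raw string.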

You might be thinking, "Why not just use process.env.GIT_COMMIT_HASH?" Well, you can do that too. But can you do this with an environment variable?

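Make `fetch()` requests at bundle-time

In this example, we make an outgoing HTTP request using fetch(), parse the HTML response using HTMLRewriter, and return an object containing the title and meta tags, all at bundle-time.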

in-the-browser.tsx
import { extractMetaTags } from './meta.ts' with { type: 'macro' };

export const Head = () => {
  const headTags = extractMetaTags("https://example.com");

  if (headTags.title !== "Example Domain") {
    throw new Error("Expected title to be 'Example Domain'");
  }

  return <head>
    <title>{headTags.title}</title>
    <meta name="viewport" content={headTags.viewport} />
  </head>;
};

meta.ts
export async function extractMetaTags(url: string) {
  const response = await fetch(url);
  const meta = {
    title: "",
  };
  new HTMLRewriter()
    .on("title", {
      text(element) {
        meta.title += element.text;
      },
    })
    .on("meta", {
      element(element) {
        const name =
          element.getAttribute("name") ||
          element.getAttribute("property") ||
          element.getAttribute("itemprop");

        if (name) meta[name] = element.getAttribute("content");
      },
    })
    .transform(response);

  return meta;
}

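The extractMetaTags function is erased at bundle-time and replaced with the result of the function call. This means that the fetch request happens at bundle-time, and the result is embedded in the bundle. The branch that throws the error is also eliminated, since it is unreachable.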

output.js
import { jsx, jsxs } from "react/jsx-runtime";
const Head = () => {
  return jsxs("head", {
    children: [
      jsx("title", {
        children: "Example Domain",
      }),
      jsx("meta", {
        name: "viewport",
        content: "width=device-width, initial-scale=1",
      }),
    ],
  });
};

export { Head };

CLI
bun build --target=browser --minify-syntax ./in-the-browser.tsx

How it works

Bun Macros are import statements annotated with the {type: 'macro'} import attribute.

import { myMacro } from './macro.ts' with { type: 'macro' }

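Import attributes are a Stage 3 ECMAScript proposal, which means they are very likely to be added to the JavaScript language as an official feature.

Bun also supports the import assertion syntax. Import assertions were an earlier incarnation of import attributes that has since been abandoned (but is already supported by a number of browsers and runtimes).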

import { myMacro } from "./macro.ts" assert { type: "macro" };

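When Bun's transpiler sees one of these special imports, it calls the function inside the transpiler using Bun's JavaScript runtime and converts the JavaScript return value into an AST node. These JavaScript functions are called at bundle-time, not at runtime.

Execution order

Bun Macros are executed synchronously in the transpiler during the visiting phase, before plugins and before the transpiler generates the AST. They are executed in the order they are called. The transpiler waits for a macro to finish executing before continuing, and it also awaits any Promise returned by a macro.

Bun's bundler is multi-threaded. As such, macros execute in parallel inside multiple spawned JavaScript "workers".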
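
The ordering guarantee within a single file can be pictured as a loop that awaits each macro call before starting the next one. The sketch below models only that guarantee. runMacroCalls is a hypothetical stand-in rather than Bun's transpiler code, and it does not model the parallel workers used across files.

```typescript
// Sketch of the ordering guarantee only; `runMacroCalls` is a hypothetical
// stand-in, not Bun's transpiler code.
type MacroCall = () => unknown;

async function runMacroCalls(calls: MacroCall[]): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const call of calls) {
    // Each call finishes, including any Promise it returns, before the next starts.
    results.push(await call());
  }
  return results;
}

const log: string[] = [];
const done = runMacroCalls([
  async () => {
    log.push("first:start");
    await Promise.resolve(); // simulate asynchronous work
    log.push("first:end");
    return 1;
  },
  () => {
    log.push("second");
    return 2;
  },
]);

done.then((results) => console.log(results, log));
```

Because the first call is awaited to completion, "second" is logged only after "first:end", even though the first macro yields in the middle.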

Dead code elimination

The bundler performs dead code elimination after running and inlining macros. So given the following macro:

returnFalse.ts
export function returnFalse() {
  return false;
}

...then bundling the following file will produce an empty bundle.

import {returnFalse} from './returnFalse.ts' with { type: 'macro' };

if (returnFalse()) {
  console.log("This code is eliminated");
}

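Security considerations

Macros must be explicitly imported with { type: "macro" } in order to be executed at bundle-time. These imports have no effect if they are not called, unlike regular JavaScript imports, which may have side effects.

You can disable macros entirely by passing the --no-macros flag to Bun. Any macro call then produces a build error like this: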

error: Macros are disabled

foo();
^
./hello.js:3:1 53

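Macros are disabled in node_modules

To reduce the potential attack surface for malicious packages, macros cannot be invoked from inside node_modules/**/*. If a package attempts to invoke a macro, you'll see an error like this: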

error: For security reasons, macros cannot be run from node_modules.

beEvil();
^
node_modules/evil/index.js:3:1 50

Your application code can still import macros from node_modules and invoke them.

import {macro} from "some-package" with { type: "macro" };

macro();
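
Put differently, what matters is where the call lives, not where the macro is defined. A toy model of that rule, assuming a simple check on path segments. canInvokeMacros is a hypothetical function written for this illustration and is not how Bun implements the check.

```typescript
// Toy model of the rule above; `canInvokeMacros` is hypothetical, not a Bun API.
// The decision depends on the file that contains the call site, not on the
// location of the module that defines the macro.
function canInvokeMacros(callerPath: string): boolean {
  const segments = callerPath.split(/[\\\/]/); // handle both / and \ separators
  return !segments.includes("node_modules");
}

console.log(canInvokeMacros("src/app.ts")); // true
console.log(canInvokeMacros("node_modules/evil/index.js")); // false
```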

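Limitations

Some things to know.

The result of the macro must be serializable!

Bun's transpiler needs to be able to serialize the result of the macro so it can be inlined into the AST. All JSON-compatible data structures are supported: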

macro.ts
export function getObject() {
  return {
    foo: "bar",
    baz: 123,
    array: [ 1, 2, { nested: "value" }],
  };
}

Macros can be async, or return Promise instances. Bun's transpiler will automatically await the Promise and inline the result.

macro.ts
export async function getText() {
  return "async value";
}

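The transpiler implements special logic for serializing common data formats like Response, Blob, and TypedArray.

TypedArray: Resolves to a base64-encoded string.
Response: Bun reads the Content-Type and serializes accordingly; for instance, a Response with type application/json is automatically parsed into an object, and text/plain is inlined as a string. Responses with an unrecognized or undefined type are base64-encoded.
Blob: As with Response, the serialization depends on the type property.

The result of fetch is Promise<Response>, so it can be returned directly.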

macro.ts
export function getObject() {
  return fetch("https://bun.net.cn")
}
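
The Content-Type rules above can be sketched as a small decision function. This is an approximation of the described behavior, not Bun's serializer: inlineValueFor is a hypothetical name, only the two content types mentioned in the text are handled, and everything else falls back to base64.

```typescript
// Sketch of the Content-Type rules described above. `inlineValueFor` is a
// hypothetical helper; Bun's real serializer may differ in details.
function inlineValueFor(contentType: string | undefined, body: Uint8Array): unknown {
  // Ignore parameters such as "; charset=utf-8".
  const mime = (contentType ?? "").split(";")[0].trim().toLowerCase();
  if (mime === "application/json") {
    return JSON.parse(new TextDecoder().decode(body)); // inlined as an object
  }
  if (mime === "text/plain") {
    return new TextDecoder().decode(body); // inlined as a string
  }
  // Unknown or missing type: fall back to a base64 string.
  let binary = "";
  for (let i = 0; i < body.length; i++) binary += String.fromCharCode(body[i]);
  return btoa(binary);
}

const encoder = new TextEncoder();
const asObject = inlineValueFor("application/json; charset=utf-8", encoder.encode('{"ok":true}'));
const asText = inlineValueFor("text/plain", encoder.encode("hello"));
const asBase64 = inlineValueFor(undefined, new Uint8Array([1, 2, 3]));

console.log(asObject, asText, asBase64);
```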

Functions and instances of most classes (except those mentioned above) are not serializable.

export function getText(url: string) {
  // this doesn't work!
  return () => {};
}
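
When a macro builds a larger object, it can help to check the result before returning it. The sketch below is a deliberately conservative pre-check that accepts only plain JSON-style data and reports the path of the first offending value. findUnserializable is a hypothetical helper, not part of Bun, and it ignores the Response, Blob, and TypedArray special cases described above.

```typescript
// Hypothetical pre-check (not part of Bun): return the path of the first value
// that plain JSON cannot represent, or null if the whole result is fine.
function findUnserializable(value: unknown, path = "result"): string | null {
  if (value === null || typeof value === "string" || typeof value === "boolean") return null;
  if (typeof value === "number") return Number.isFinite(value) ? null : path;
  if (Array.isArray(value)) {
    for (let i = 0; i < value.length; i++) {
      const bad = findUnserializable(value[i], `${path}[${i}]`);
      if (bad !== null) return bad;
    }
    return null;
  }
  if (typeof value === "object") {
    const proto = Object.getPrototypeOf(value);
    if (proto !== Object.prototype && proto !== null) return path; // class instance
    for (const key of Object.keys(value)) {
      const child = (value as Record<string, unknown>)[key];
      const bad = findUnserializable(child, `${path}.${key}`);
      if (bad !== null) return bad;
    }
    return null;
  }
  return path; // function, symbol, bigint, or undefined
}

const plain = findUnserializable({ foo: "bar", baz: 123, array: [1, 2, { nested: "value" }] });
const withFunction = findUnserializable({ ok: true, callback: () => {} });
const withInstance = findUnserializable([1, new Date()]);

console.log(plain, withFunction, withInstance);
```

A macro could throw when the check returns a path, so that a non-serializable result surfaces as a build failure with a pointer to the offending field.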

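Input arguments must be statically analyzable.

Macros can accept inputs, but only in limited cases. The value must be statically known. For example, the following is not allowed: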

import {getText} from './getText.ts' with { type: 'macro' };

export function howLong() {
  // the value of `foo` cannot be statically known
  const foo = Math.random() ? "foo" : "bar";

  const text = getText(`https://example.com/${foo}`);
  console.log("The page is ", text.length, " characters long");
}

However, if the value of foo is known at bundle-time (say, if it's a constant or the result of another macro), then it's allowed:

import {getText} from './getText.ts' with { type: 'macro' };
import {getFoo} from './getFoo.ts' with { type: 'macro' };

export function howLong() {
  // this works because getFoo() is statically known
  const foo = getFoo();
  const text = getText(`https://example.com/${foo}`);
  console.log("The page is", text.length, "characters long");
}

This outputs:

function howLong() {
  console.log("The page is", 1322, "characters long");
}
export { howLong };