WHATWG API
The WHATWG URL Standard takes a more selective and fine-grained approach to choosing which characters to encode than the Legacy API does.
The WHATWG algorithm defines four "percent-encode sets" that describe ranges of characters that must be percent-encoded:
- The C0 control percent-encode set includes code points in the range U+0000 to U+001F (inclusive) and all code points greater than U+007E.
- The fragment percent-encode set includes the C0 control percent-encode set and code points U+0020, U+0022, U+003C, U+003E, and U+0060.
- The path percent-encode set includes the C0 control percent-encode set and code points U+0020, U+0022, U+0023, U+003C, U+003E, U+003F, U+0060, U+007B, and U+007D.
- The userinfo percent-encode set includes the path percent-encode set and code points U+002F, U+003A, U+003B, U+003D, U+0040, U+005B, U+005C, U+005D, U+005E, and U+007C.
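The four sets nest inside one another, which is easier to see in code. The following is a minimal sketch that transcribes the sets exactly as listed above; the helper names (`inC0ControlSet`, `inFragmentSet`, `inPathSet`, `inUserinfoSet`, `percentEncode`) are hypothetical and are not part of the Node.js API, and the current WHATWG URL Standard may define the sets differently.

```javascript
// Hypothetical helpers: a direct transcription of the four sets listed above.
const inC0ControlSet = (cp) => cp <= 0x1f || cp > 0x7e;

const fragmentExtras = new Set([0x20, 0x22, 0x3c, 0x3e, 0x60]);
const pathExtras = new Set([0x20, 0x22, 0x23, 0x3c, 0x3e, 0x3f, 0x60, 0x7b, 0x7d]);
const userinfoExtras = new Set([0x2f, 0x3a, 0x3b, 0x3d, 0x40, 0x5b, 0x5c, 0x5d, 0x5e, 0x7c]);

const inFragmentSet = (cp) => inC0ControlSet(cp) || fragmentExtras.has(cp);
const inPathSet = (cp) => inC0ControlSet(cp) || pathExtras.has(cp);
const inUserinfoSet = (cp) => inPathSet(cp) || userinfoExtras.has(cp);

// Percent-encode every code point of `input` that belongs to the given set,
// using the UTF-8 bytes of that code point.
function percentEncode(input, inSet) {
  const encoder = new TextEncoder();
  let out = '';
  for (const ch of input) {
    if (!inSet(ch.codePointAt(0))) {
      out += ch;
      continue;
    }
    for (const byte of encoder.encode(ch)) {
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode('a b#c', inFragmentSet));     // a%20b#c
console.log(percentEncode('a b#c', inPathSet));         // a%20b%23c
console.log(percentEncode('user:p@ss', inUserinfoSet)); // user%3Ap%40ss
console.log(percentEncode('π', inC0ControlSet));        // %CF%80
```

The same input yields different output depending on the set: `#` survives in a fragment but is escaped in a path, while `:` and `@` are escaped only by the userinfo set.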
The userinfo percent-encode set is used exclusively for usernames and passwords encoded within the URL. The path percent-encode set is used for the path of most URLs. The fragment percent-encode set is used for URL fragments. The C0 control percent-encode set is used for the host and path under certain specific conditions, and for all other cases.
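As a rough illustration of how each component gets its own set, the sketch below assigns raw strings through the `URL` setters and reads back the encoded values. The URL and the input strings are made up for the example.

```javascript
const exampleURL = new URL('https://example.com/');

// Username: ':' and '@' are in the userinfo percent-encode set.
exampleURL.username = 'us:er@';
console.log(exampleURL.username); // us%3Aer%40

// Path: the space, '{' and '}' are in the path percent-encode set.
exampleURL.pathname = '/a b/{x}';
console.log(exampleURL.pathname); // /a%20b/%7Bx%7D

// Fragment: the space is in the fragment percent-encode set.
exampleURL.hash = 'sec tion';
console.log(exampleURL.hash); // #sec%20tion
```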
When non-ASCII characters appear within a host name, the host name is encoded using the Punycode algorithm. Note, however, that a host name may contain both Punycode-encoded and percent-encoded characters:
const myURL = new URL('https://%CF%80.example.com/foo');
console.log(myURL.href);
// Prints https://xn--1xa.example.com/foo
console.log(myURL.origin);
// Prints https://xn--1xa.example.com
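A short sketch of the same behavior from the other direction: a host written with a literal non-ASCII character and the same host written with percent-encoded UTF-8 bytes should normalize to the same Punycode form. The variable names are illustrative only.

```javascript
// The same host, once with a literal 'π' and once with its
// percent-encoded UTF-8 bytes (%CF%80).
const literalHost = new URL('https://π.example.com/');
const escapedHost = new URL('https://%CF%80.example.com/');

console.log(literalHost.hostname);
// Prints xn--1xa.example.com
console.log(escapedHost.hostname);
// Prints xn--1xa.example.com
console.log(literalHost.hostname === escapedHost.hostname);
// Prints true
```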