The Node.js Event Loop

What is the Event Loop?

The event loop is what allows Node.js to perform non-blocking I/O operations, even though a single JavaScript thread is used by default, by offloading operations to the system kernel whenever possible.

Since most modern kernels are multi-threaded, they can handle multiple operations executing in the background. When one of these operations completes, the kernel tells Node.js so that the appropriate callback can be added to the poll queue and eventually executed. We'll explain this in further detail later in this topic.

Event Loop Explained

When Node.js starts, it initializes the event loop and processes the provided input script (or drops into the REPL, which is not covered in this document). The script may make async API calls, schedule timers, or call process.nextTick(). Node.js then begins processing the event loop.

The following diagram shows a simplified overview of the event loop's order of operations.

   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

Each box will be referred to as a "phase" of the event loop.

Each phase has a FIFO queue of callbacks to execute. While each phase is special in its own way, generally, when the event loop enters a given phase, it performs any operations specific to that phase, then executes callbacks in that phase's queue until the queue has been exhausted or the maximum number of callbacks has executed. At that point, the event loop moves to the next phase, and so on.

Since any of these operations may schedule more operations, and new events processed in the poll phase are queued by the kernel, poll events can be queued while polling events are being processed. As a result, long-running callbacks can allow the poll phase to run much longer than a timer's threshold. See the timers and poll sections for more details.

There is a slight discrepancy between the Windows and the Unix/Linux implementations, but that is not important for this demonstration. The most important parts are here. There are actually seven or eight steps, but the ones we care about, the ones that Node.js actually uses, are those above.

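Phases Overview

  • timers: this phase executes callbacks scheduled by setTimeout() and setInterval().
  • pending callbacks: executes I/O callbacks deferred to the next loop iteration.
  • idle, prepare: only used internally.
  • poll: retrieves new I/O events; executes I/O related callbacks (almost all, with the exception of close callbacks, the ones scheduled by timers, and setImmediate()); Node.js will block here when appropriate.
  • check: setImmediate() callbacks are invoked here.
  • close callbacks: some close callbacks, e.g. socket.on('close', ...).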

Between each run of the event loop, Node.js checks if it is waiting for any asynchronous I/O or timers and shuts down cleanly if there are none.

Starting with libuv 1.45.0 (Node.js 20), the event loop behavior changed to run timers only after the poll phase, instead of both before and after as in earlier versions. This change can affect the timing of setImmediate() callbacks and how they interact with timers in certain scenarios.

Phases in Detail

timers

A timer specifies the threshold after which a provided callback may be executed, rather than the exact time a person wants it to be executed. Timer callbacks will run as early as they can be scheduled after the specified amount of time has passed; however, operating system scheduling or the running of other callbacks may delay them.

Technically, the poll phase controls when timers are executed.

For example, say you schedule a timeout to execute after a 100 ms threshold, then your script starts asynchronously reading a file, which takes 95 ms:

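const fs = require('node:fs');

function someAsyncOperation(callback) {
  // Assume this takes 95ms to complete
  fs.readFile('/path/to/file', callback);
}

const timeoutScheduled = Date.now();

setTimeout(() => {
  const delay = Date.now() - timeoutScheduled;

  console.log(`${delay}ms have passed since I was scheduled`);
}, 100);

// do someAsyncOperation which takes 95 ms to complete
someAsyncOperation(() => {
  const startCallback = Date.now();

  // do something that will take 10ms...
  while (Date.now() - startCallback < 10) {
    // do nothing
  }
});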

When the event loop enters the poll phase, it has an empty queue (fs.readFile() has not completed), so it will wait for the number of ms remaining until the soonest timer's threshold is reached. While it is waiting, 95 ms pass, fs.readFile() finishes reading the file, and its callback, which takes 10 ms to complete, is added to the poll queue and executed. When the callback finishes, there are no more callbacks in the queue, so the event loop will see that the threshold of the soonest timer has been reached, then wrap back to the timers phase to execute the timer's callback. In this example, the total delay between the timer being scheduled and its callback being executed will be 105 ms.
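The same effect can be reproduced without a file on disk. The sketch below is a minimal, self-contained variant under stated assumptions: measureTimerDelay is a hypothetical helper (not part of Node.js), and a synchronous busy-wait stands in for the long-running callback. It illustrates that the timer cannot fire until the blocking work has finished, so the measured delay is at least the busy time rather than the requested threshold.

```javascript
// Hypothetical helper for illustration only: schedule a timer, then block
// the single JavaScript thread for `busyMs` milliseconds.
function measureTimerDelay(thresholdMs, busyMs, report) {
  const scheduledAt = Date.now();

  setTimeout(() => {
    // Runs only after the synchronous work below has finished.
    report(Date.now() - scheduledAt);
  }, thresholdMs);

  // Busy-wait standing in for a long-running callback.
  const busyStart = Date.now();
  while (Date.now() - busyStart < busyMs) {
    // do nothing
  }
}

measureTimerDelay(10, 50, (delay) => {
  console.log(`timer asked for 10ms, fired after ${delay}ms`);
});
```

The exact figure printed depends on the machine, but it should not be lower than the 50 ms spent blocking.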

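To prevent the poll phase from starving the event loop, libuv (the C library that implements the Node.js event loop and all of the asynchronous behaviors of the platform) also has a hard maximum (system dependent) before it stops polling for more events.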

pending callbacks

This phase executes callbacks for some system operations, such as certain types of TCP errors. For example, if a TCP socket receives ECONNREFUSED when attempting to connect, some *nix systems want to wait to report the error. This will be queued to execute in the pending callbacks phase.

poll

The poll phase has two main functions:

  1. Calculating how long it should block and poll for I/O, then
  2. Processing events in the poll queue.

When the event loop enters the poll phase and there are no timers scheduled, one of two things will happen:

  • If the poll queue is not empty, the event loop will iterate through its queue of callbacks, executing them synchronously until either the queue has been exhausted or the system-dependent hard limit is reached.
  • If the poll queue is empty, one of two more things will happen:
    • If scripts have been scheduled by setImmediate(), the event loop will end the poll phase and continue to the check phase to execute those scheduled scripts.
    • If scripts have not been scheduled by setImmediate(), the event loop will wait for callbacks to be added to the queue, then execute them immediately.

Once the poll queue is empty, the event loop will check for timers whose time thresholds have been reached. If one or more timers are ready, the event loop will wrap back to the timers phase to execute those timers' callbacks.

check

This phase allows the event loop to execute callbacks immediately after the poll phase has completed. If the poll phase becomes idle and scripts have been queued with setImmediate(), the event loop may continue to the check phase rather than waiting.

setImmediate() is actually a special timer that runs in a separate phase of the event loop. It uses a libuv API that schedules callbacks to execute after the poll phase has completed.

Generally, as the code is executed, the event loop will eventually hit the poll phase, where it will wait for an incoming connection, request, etc. However, if a callback has been scheduled with setImmediate() and the poll phase becomes idle, it will end and continue to the check phase rather than waiting for poll events.
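As a rough illustration of how the check phase drains its queue, the sketch below schedules two immediates and a third one from inside the first. immediateOrder is a hypothetical helper name used only here. The assumption being illustrated is that callbacks queued with setImmediate() run in FIFO order, and that one scheduled while the check phase is already running waits for the next pass through the loop.

```javascript
// Hypothetical helper for illustration only.
function immediateOrder(report) {
  const order = [];

  setImmediate(() => {
    order.push('first');
    // Queued while the check phase is already running, so it runs on the
    // next iteration of the event loop, after 'second'.
    setImmediate(() => {
      order.push('nested');
      report(order);
    });
  });

  setImmediate(() => {
    order.push('second');
  });
}

immediateOrder((order) => console.log(order.join(' -> ')));
// prints: first -> second -> nested
```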

close callbacks

If a socket or handle is closed abruptly (e.g. socket.destroy()), the 'close' event will be emitted in this phase. Otherwise, it will be emitted via process.nextTick().

setImmediate() vs setTimeout()

setImmediate() and setTimeout() are similar, but behave in different ways depending on when they are called.

  • setImmediate() is designed to execute a script once the current poll phase completes.
  • setTimeout() schedules a script to be run after a minimum threshold in ms has elapsed.

The order in which the timers are executed will vary depending on the context in which they are called. If both are called from within the main module, then timing will be bound by the performance of the process (which can be impacted by other applications running on the machine).

For example, if we run the following script, which is not within an I/O cycle (i.e. the main module), the order in which the two timers are executed is non-deterministic, as it is bound by the performance of the process:

// timeout_vs_immediate.js
setTimeout(() => {
  console.log('timeout');
}, 0);

setImmediate(() => {
  console.log('immediate');
});

However, if you move the two calls within an I/O cycle, the immediate callback is always executed first:

// timeout_vs_immediate.js
const fs = require('node:fs');

fs.readFile(__filename, () => {
  setTimeout(() => {
    console.log('timeout');
  }, 0);
  setImmediate(() => {
    console.log('immediate');
  });
});

The main advantage of using setImmediate() over setTimeout() is that setImmediate() will always be executed before any timers if scheduled within an I/O cycle, independently of how many timers are present.

process.nextTick()

Understanding process.nextTick()

You may have noticed that process.nextTick() was not displayed in the diagram, even though it's a part of the asynchronous API. This is because process.nextTick() is not technically part of the event loop. Instead, the nextTickQueue will be processed after the current operation is completed, regardless of the current phase of the event loop. Here, an operation is defined as a transition from the underlying C/C++ handler, and handling the JavaScript that needs to be executed.

Looking back at our diagram, any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() will be resolved before the event loop continues. This can create some bad situations because it allows you to "starve" your I/O by making recursive process.nextTick() calls, which prevents the event loop from reaching the poll phase.
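The starvation risk can be seen in miniature with a bounded chain of process.nextTick() calls. In this sketch, nextTickChain is a hypothetical helper and the depth of 1000 is an arbitrary choice; the chain is deliberately finite so the program terminates. Every link in the chain runs before the event loop is allowed to reach the check phase, so the immediate, although registered first, only runs once the chain has ended.

```javascript
// Hypothetical helper for illustration only.
function nextTickChain(depth, report) {
  const order = [];
  let remaining = depth;

  // Registered first, but it has to wait for the whole nextTick chain.
  setImmediate(() => {
    order.push('immediate');
    report(order);
  });

  function tick() {
    if (remaining === 0) {
      order.push('nextTick chain done');
      return;
    }
    remaining -= 1;
    // Each callback queues another one; the nextTick queue is drained
    // completely before the event loop continues.
    process.nextTick(tick);
  }

  process.nextTick(tick);
}

nextTickChain(1000, (order) => console.log(order.join(' -> ')));
// prints: nextTick chain done -> immediate
```

Removing the depth limit would keep the immediate, and any pending I/O, waiting indefinitely.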

Why would that be allowed?

Why would something like this be included in Node.js? Part of it is a design philosophy where an API should always be asynchronous, even where it doesn't have to be. Take this code snippet for example:

function apiCall(arg, callback) {
  if (typeof arg !== 'string') {
    return process.nextTick(
      callback,
      new TypeError('argument should be string')
    );
  }
}

The snippet does an argument check and, if it's not correct, passes the error to the callback. The API was updated to allow passing arguments to process.nextTick(): any arguments passed after the callback are propagated as the arguments to the callback, so you don't have to nest functions.

What we're doing is passing an error back to the user, but only after we have allowed the rest of the user's code to execute. By using process.nextTick(), we guarantee that apiCall() always runs its callback after the rest of the user's code and before the event loop is allowed to proceed. To achieve this, the JS call stack is allowed to unwind, then the provided callback is executed immediately. This allows a person to make recursive calls to process.nextTick() without reaching a RangeError: Maximum call stack size exceeded from V8.
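A short usage sketch may help here. It reuses the apiCall() shape from the snippet above; the success branch (upper-casing the string) and the demo() wrapper are assumptions added only so that the example is complete. It shows both points from the text: arguments listed after the callback are forwarded to it, and the callback runs only after the caller's remaining synchronous code.

```javascript
function apiCall(arg, callback) {
  if (typeof arg !== 'string') {
    // Arguments after the callback are forwarded to it.
    return process.nextTick(callback, new TypeError('argument should be string'));
  }
  // Hypothetical success path, not part of the original snippet.
  process.nextTick(callback, null, arg.toUpperCase());
}

// Hypothetical wrapper that records the order of events.
function demo(report) {
  const order = [];

  apiCall(42, (err) => {
    order.push(`callback: ${err.message}`);
    report(order);
  });

  // Still runs before the callback, even though apiCall() was called first.
  order.push('rest of caller code');
}

demo((order) => console.log(order.join(' -> ')));
// prints: rest of caller code -> callback: argument should be string
```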

This philosophy can lead to some potentially problematic situations. Take this snippet for example:

let bar = null;

// this has an asynchronous signature, but calls callback synchronously
function someAsyncApiCall(callback) {
  callback();
}

// the callback is called before `someAsyncApiCall` completes.
someAsyncApiCall(() => {
  // since someAsyncApiCall hasn't completed, bar hasn't been assigned any value
  console.log('bar', bar); // null
});

bar = 1;

The user defines someAsyncApiCall() to have an asynchronous signature, but it actually operates synchronously. When it is called, the callback provided to someAsyncApiCall() is called in the same phase of the event loop because someAsyncApiCall() doesn't actually do anything asynchronously. As a result, the callback tries to reference bar even though it may not have that variable in scope yet, because the script has not been able to run to completion.

By placing the callback in a process.nextTick(), the script still has the ability to run to completion, allowing all the variables, functions, etc., to be initialized prior to the callback being called. It also has the advantage of not allowing the event loop to continue. It may be useful for the user to be alerted to an error before the event loop is allowed to continue. Here is the previous example using process.nextTick():

let bar = null;

function someAsyncApiCall(callback) {
  process.nextTick(callback);
}

someAsyncApiCall(() => {
  console.log('bar', bar); // 1
});

bar = 1;

Here's another real world example:

const net = require('node:net');

const server = net.createServer(() => {}).listen(8080);

server.on('listening', () => {});

When only a port is passed, the port is bound immediately. So, the 'listening' callback could be called immediately. The problem is that the .on('listening') callback will not have been set by that time.

To get around this, the 'listening' event is queued in a nextTick() to allow the script to run to completion. This allows the user to set any event handlers they want.
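The following sketch exercises that behavior end to end. listenThenSubscribe is a hypothetical helper, and port 0 (which asks the operating system for any free port) is an assumption made here to avoid clashing with whatever may already be using 8080. The handler is attached after listen() has been called and is still invoked.

```javascript
const net = require('node:net');

// Hypothetical helper for illustration only.
function listenThenSubscribe(report) {
  // Port 0 lets the OS pick a free port (an assumption for this sketch).
  const server = net.createServer(() => {}).listen(0);

  // Attached after listen() has already been called, yet still in time,
  // because the 'listening' event is deferred until the script has run.
  server.on('listening', () => {
    const { port } = server.address();
    server.close(); // release the port so the process can exit
    report(port);
  });
}

listenThenSubscribe((port) => {
  console.log(`'listening' fired, bound to port ${port}`);
});
```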

process.nextTick() vs setImmediate()

We have two calls that are similar as far as users are concerned, but their names are confusing.

  • process.nextTick() fires immediately on the same phase
  • setImmediate() fires on the following iteration or "tick" of the event loop

In essence, the names should be swapped. process.nextTick() fires more immediately than setImmediate(), but this is an artifact of the past which is unlikely to change. Making this switch would break a large percentage of the packages on npm. Every day more new modules are being added, which means every day we wait, more potential breakages occur. While they are confusing, the names themselves won't change.

We recommend developers use setImmediate() in all cases because it's easier to reason about.
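The ordering described in this section can be checked with a few lines. tickVersusImmediate is a hypothetical helper name used only for this sketch. The setImmediate() callback is registered first, yet the process.nextTick() callback runs before it, right after the current synchronous code finishes.

```javascript
// Hypothetical helper for illustration only.
function tickVersusImmediate(report) {
  const order = [];

  // Registered first, but runs on a later pass through the event loop.
  setImmediate(() => {
    order.push('setImmediate');
    report(order);
  });

  // Registered second, but runs as soon as the current operation completes.
  process.nextTick(() => order.push('process.nextTick'));

  order.push('synchronous code');
}

tickVersusImmediate((order) => console.log(order.join(' -> ')));
// prints: synchronous code -> process.nextTick -> setImmediate
```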

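Why use process.nextTick()?

There are two main reasons:

  1. Allow users to handle errors, clean up any resources that are no longer needed, or perhaps try the request again before the event loop continues.
  2. At times it's necessary to allow a callback to run after the call stack has unwound but before the event loop continues.

One example is to match the user's expectations. Simple example: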

const net = require('node:net');

const server = net.createServer();
server.on('connection', conn => {});

server.listen(8080);
server.on('listening', () => {});

Say that listen() is run at the beginning of the event loop, but the listening callback is placed in a setImmediate(). Unless a hostname is passed, binding to the port will happen immediately. For the event loop to proceed, it must hit the poll phase, which means there is a non-zero chance that a connection could have been received, allowing the connection event to be fired before the listening event.

Another example is extending an EventEmitter and emitting an event from within the constructor:

const EventEmitter = require('node:events');

class MyEmitter extends EventEmitter {
  constructor() {
    super();
    this.emit('event');
  }
}

const myEmitter = new MyEmitter();
myEmitter.on('event', () => {
  console.log('an event occurred!');
});

You can't emit an event from the constructor immediately because the script will not have processed to the point where the user assigns a callback to that event. So, within the constructor itself, you can use process.nextTick() to set a callback to emit the event after the constructor has finished, which provides the expected results:

const EventEmitter = require('node:events');

class MyEmitter extends EventEmitter {
  constructor() {
    super();

    // use nextTick to emit the event once a handler is assigned
    process.nextTick(() => {
      this.emit('event');
    });
  }
}

const myEmitter = new MyEmitter();
myEmitter.on('event', () => {
  console.log('an event occurred!');
});