Asynchronous flow control

The material in this post is heavily inspired by Mixu's Node.js Book.

At its core, JavaScript is designed to be non-blocking on the "main" thread, which is where views are rendered. You can imagine the importance of this in the browser. When the main thread becomes blocked, it results in the infamous "freezing" that end users dread, and no other events can be dispatched, which can result in lost data acquisition, for example.

This creates some unique constraints that a functional style of programming is well suited to address. This is where callbacks come into the picture.

However, callbacks can become challenging to handle in more complicated procedures. This often results in "callback hell", where multiple nested functions with callbacks make the code harder to read, debug, and organize.

async1(function (input, result1) {
  async2(function (result2) {
    async3(function (result3) {
      async4(function (result4) {
        async5(function (output) {
          // do something with output
        });
      });
    });
  });
});

Of course, in real life there would most likely be additional lines of code to handle result1, result2, etc., so code affected by this issue usually looks much messier than the example above.

This is where functions come in handy. More complex operations are made up of many functions:

  1. initiator style / input
  2. middleware
  3. terminator

The "initiator style / input" is the first function in the sequence. This function accepts the original input, if any, for the operation. The operation is an executable series of functions, and the original input will primarily be:

  1. variables in a global environment
  2. direct invocation with or without arguments
  3. values obtained by file system or network requests

Network requests can be incoming requests initiated by a foreign network, by another application on the same network, or by the app itself on the same or a foreign network.

A middleware function passes its result on to the next function in the sequence, and a terminator function invokes the callback. The following illustrates the flow for network or file system requests. Here the latency is 0 because all these values are available in memory.

function final(someInput, callback) {
  callback(`${someInput} and terminated by executing callback `);
}

function middleware(someInput, callback) {
  return final(`${someInput} touched by middleware `, callback);
}

function initiate() {
  const someInput = 'hello this is a function ';
  middleware(someInput, function (result) {
    console.log(result);
    // requires callback to `return` result
  });
}

initiate();

State management

Functions may or may not be state dependent. State dependency arises when the input or another variable of a function relies on an outside function.

There are two primary strategies for state management:

  1. passing in variables directly to a function, and
  2. acquiring a variable value from a cache, session, file, database, network, or other outside source.

Note that I did not mention global variables. Managing state with global variables is often a sloppy anti-pattern that makes it difficult or impossible to guarantee state. Global variables in complex programs should be avoided when possible.
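The two strategies can be sketched side by side. This is a minimal illustration, not a prescribed implementation: `greet`, `loadSession`, `greetFromSession`, and the in-memory `sessionStore` are hypothetical names, and `setTimeout` merely stands in for the latency of a real cache or database lookup.

```javascript
// Strategy 1: state is passed directly as an argument.
function greet(name, callback) {
  callback(null, `Hello, ${name}`);
}

// Strategy 2: state is acquired from an outside source.
// `sessionStore` is a hypothetical in-memory stand-in for a cache,
// session, file, or database.
const sessionStore = new Map([['session-1', { name: 'Lisa' }]]);

function loadSession(sessionId, callback) {
  // setTimeout simulates the latency of a real lookup.
  setTimeout(function () {
    const session = sessionStore.get(sessionId);
    if (!session) return callback(new Error(`unknown session: ${sessionId}`));
    callback(null, session);
  }, 0);
}

// Combines both: the name is fetched first, then passed in directly.
function greetFromSession(sessionId, callback) {
  loadSession(sessionId, function (err, session) {
    if (err) return callback(err);
    greet(session.name, callback);
  });
}

greetFromSession('session-1', function (err, message) {
  if (err) throw err;
  console.log(message);
});
```

Neither function reads or writes a global variable, so each call's result depends only on its arguments and on the outside source it queries.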

Control flow

If an object is available in memory, iteration is possible, and there will not be a change to control flow:

function getSong() {
  let _song = '';
  let i = 100;
  for (i; i > 0; i -= 1) {
    _song += `${i} beers on the wall, you take one down and pass it around, ${
      i - 1
    } bottles of beer on the wall\n`;
    if (i === 1) {
      _song += "Hey let's get some more beer";
    }
  }

  return _song;
}

function singSong(_song) {
  if (!_song) throw new Error("song is '' empty, FEED ME A SONG!");
  console.log(_song);
}

const song = getSong();
// this will work
singSong(song);

However, if the data exists outside of memory, the iteration will no longer work:

function getSong() {
  let _song = '';
  let i = 100;
  for (i; i > 0; i -= 1) {
    /* eslint-disable no-loop-func */
    setTimeout(function () {
      _song += `${i} beers on the wall, you take one down and pass it around, ${
        i - 1
      } bottles of beer on the wall\n`;
      if (i === 1) {
        _song += "Hey let's get some more beer";
      }
    }, 0);
    /* eslint-enable no-loop-func */
  }

  return _song;
}

function singSong(_song) {
  if (!_song) throw new Error("song is '' empty, FEED ME A SONG!");
  console.log(_song);
}

const song = getSong();
// this will not work
singSong(song);
// Uncaught Error: song is '' empty, FEED ME A SONG!

Why did this happen? setTimeout does not run the function immediately. It hands the callback to the runtime's timer queue and returns right away, so the for loop finishes and getSong returns before any callback has run. Only after the currently executing code has completed does the event loop pick up the timers, even with a delay of 0 milliseconds, and execute them. The only problem is that song ('') was returned long before that.
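The ordering can be observed directly. The sketch below (the `order` array is only an illustrative name) shows that a callback passed to setTimeout, even with a delay of 0, runs only after the synchronous code that scheduled it has finished.

```javascript
const order = [];

setTimeout(function () {
  // Runs later, once the synchronous code below has finished.
  order.push('timer callback');
  console.log(order); // both entries are present by now
}, 0);

order.push('synchronous code');
console.log(order); // only 'synchronous code' has been recorded so far
```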

The same situation arises in dealing with file systems and network requests. The main thread simply cannot be blocked for an indeterminate period of time. Therefore, we use callbacks to schedule the execution of code in time in a controlled manner.
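One way to repair the song example with a callback is sketched below. It is an illustration rather than the only possible fix: instead of returning the song, `getSong` accepts a callback and invokes it once every timer has fired. The `lines` array and `pending` counter are names introduced here for the sketch.

```javascript
function getSong(callback) {
  const total = 100;
  const lines = [];
  let pending = total;

  // Declaring `i` in the loop header gives each callback its own copy.
  for (let i = total; i > 0; i -= 1) {
    setTimeout(function () {
      // Store by position so the result does not depend on timer order.
      lines[total - i] =
        `${i} beers on the wall, you take one down and pass it around, ` +
        `${i - 1} bottles of beer on the wall`;
      pending -= 1;
      // The last callback to finish hands the complete song to the caller.
      if (pending === 0) {
        lines.push("Hey let's get some more beer");
        callback(lines.join('\n'));
      }
    }, 0);
  }
}

function singSong(song) {
  if (!song) throw new Error("song is '' empty, FEED ME A SONG!");
  console.log(song);
}

// this will work: singSong runs only after the song is complete
getSong(singSong);
```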

You will be able to perform almost all of your operations with the following 3 patterns:

  1. In series: functions will be executed in a strict sequential order. This pattern is most similar to for loops.

// operations defined elsewhere and ready to execute
const operations = [
  { func: function1, args: args1 },
  { func: function2, args: args2 },
  { func: function3, args: args3 },
];

function executeFunctionWithArgs(operation, callback) {
  // executes function
  const { args, func } = operation;
  func(args, callback);
}

function serialProcedure(operation) {
  if (!operation) process.exit(0); // finished
  executeFunctionWithArgs(operation, function (result) {
    // continue AFTER callback
    serialProcedure(operations.shift());
  });
}

serialProcedure(operations.shift());
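A self-contained variant of the series pattern is sketched below, assuming error-first callbacks. `double` is a hypothetical asynchronous operation and `series` a hypothetical helper. Unlike the version above, it walks the list by index instead of consuming it with shift, and it reports completion through a callback instead of exiting the process.

```javascript
// Hypothetical asynchronous operation: doubles its input after a short delay.
function double(value, callback) {
  setTimeout(function () {
    callback(null, value * 2);
  }, 0);
}

// Runs each { func, args } entry only after the previous one has called back.
function series(operations, done) {
  const results = [];
  let index = 0;

  function next() {
    if (index === operations.length) return done(null, results);
    const { func, args } = operations[index];
    index += 1;
    func(args, function (err, result) {
      if (err) return done(err); // stop at the first failure
      results.push(result);
      next(); // continue AFTER the callback
    });
  }

  next();
}

series(
  [
    { func: double, args: 1 },
    { func: double, args: 2 },
    { func: double, args: 3 },
  ],
  function (err, results) {
    if (err) throw err;
    console.log(results); // [ 2, 4, 6 ]
  }
);
```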
  2. Full parallel: when ordering is not an issue, such as emailing a list of 1,000,000 email recipients.

let count = 0;
let success = 0;
const failed = [];
const recipients = [
  { name: 'Bart', email: 'bart@tld' },
  { name: 'Marge', email: 'marge@tld' },
  { name: 'Homer', email: 'homer@tld' },
  { name: 'Lisa', email: 'lisa@tld' },
  { name: 'Maggie', email: 'maggie@tld' },
];

function dispatch(recipient, callback) {
  // `sendMail` is a hypothetical SMTP client
  sendMail(
    {
      subject: 'Dinner tonight',
      message: 'We have lots of cabbage on the plate. You coming?',
      smtp: recipient.email,
    },
    callback
  );
}

function final(result) {
  console.log(`Result: ${result.count} attempts \
      & ${result.success} succeeded emails`);
  if (result.failed.length)
    console.log(`Failed to send to: \
        \n${result.failed.join('\n')}\n`);
}

recipients.forEach(function (recipient) {
  dispatch(recipient, function (err) {
    if (!err) {
      success += 1;
    } else {
      failed.push(recipient.name);
    }
    count += 1;

    if (count === recipients.length) {
      final({
        count,
        success,
        failed,
      });
    }
  });
});
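The same pattern can be run end to end with a stub in place of the SMTP client. In the sketch below, `sendMail` is a hypothetical stand-in that fails for addresses without an "@", and `sendToAll` is a hypothetical helper. The counters live inside the helper rather than at module level, and an empty recipient list is handled explicitly.

```javascript
// Hypothetical stand-in for an SMTP client: fails for addresses without "@".
function sendMail(message, callback) {
  setTimeout(function () {
    if (!message.to.includes('@')) {
      return callback(new Error(`invalid address: ${message.to}`));
    }
    callback(null);
  }, 0);
}

// Starts every send immediately and reports once all have called back.
function sendToAll(recipients, done) {
  const summary = { count: 0, success: 0, failed: [] };
  if (recipients.length === 0) return done(summary);

  recipients.forEach(function (recipient) {
    const message = { to: recipient.email, subject: 'Dinner tonight' };
    sendMail(message, function (err) {
      if (err) summary.failed.push(recipient.name);
      else summary.success += 1;
      summary.count += 1;
      // Whichever callback arrives last reports the summary.
      if (summary.count === recipients.length) done(summary);
    });
  });
}

sendToAll(
  [
    { name: 'Bart', email: 'bart@tld' },
    { name: 'Moe', email: 'no-address' },
  ],
  function (summary) {
    console.log(summary);
  }
);
```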
  3. Limited parallel: parallel with a limit, such as successfully emailing 1,000,000 recipients from a list of 10 million users.

let successCount = 0;

function final() {
  console.log(`dispatched ${successCount} emails`);
  console.log('finished');
}

function dispatch(recipient, callback) {
  // `sendMail` is a hypothetical SMTP client
  sendMail(
    {
      subject: 'Dinner tonight',
      message: 'We have lots of cabbage on the plate. You coming?',
      smtp: recipient.email,
    },
    callback
  );
}

function sendOneMillionEmailsOnly() {
  getListOfTenMillionGreatEmails(function (err, bigList) {
    if (err) throw err;

    function serial(recipient) {
      if (!recipient || successCount >= 1000000) return final();
      dispatch(recipient, function (_err) {
        if (!_err) successCount += 1;
        serial(bigList.pop());
      });
    }

    serial(bigList.pop());
  });
}

sendOneMillionEmailsOnly();
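The example above caps the number of successful sends but dispatches one email at a time. A limit can also apply to how many operations are in flight at once. The sketch below assumes a hypothetical `runLimited` helper and a hypothetical `slowSquare` worker that calls back asynchronously. It is one possible shape for such a helper, not an established API.

```javascript
// Runs `worker(item, callback)` for every item, with at most `limit` in flight.
function runLimited(items, limit, worker, done) {
  const results = new Array(items.length);
  let nextIndex = 0;
  let running = 0;
  let finished = 0;

  if (items.length === 0) return done(results);

  function launch() {
    // Start new work until the limit is reached or the list is exhausted.
    while (running < limit && nextIndex < items.length) {
      const index = nextIndex;
      nextIndex += 1;
      running += 1;
      worker(items[index], function (err, result) {
        results[index] = err ? { error: err } : { value: result };
        running -= 1;
        finished += 1;
        if (finished === items.length) return done(results);
        launch(); // a slot was freed, so try to start more work
      });
    }
  }

  launch();
}

// Hypothetical worker that records how many calls overlap.
let active = 0;
let maxActive = 0;
function slowSquare(n, callback) {
  active += 1;
  maxActive = Math.max(maxActive, active);
  setTimeout(function () {
    active -= 1;
    callback(null, n * n);
  }, 5);
}

runLimited([1, 2, 3, 4, 5], 2, slowSquare, function (results) {
  console.log(results.map((r) => r.value)); // [ 1, 4, 9, 16, 25 ]
  console.log(`max concurrent: ${maxActive}`); // max concurrent: 2
});
```

Results are stored by index, so the output order matches the input order even though operations may complete in a different order.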

Each has its own use cases, benefits, and issues that you can experiment with and read about in more detail. Most importantly, remember to modularize your operations and use callbacks! If you feel any doubt, treat everything as if it were middleware!