Avoid tearing your hair out on variable values in deeply nested JavaScript/Node.js callback chains

JavaScript, and hence Node.js, supports a useful model for variable scoping and callback functions, where the available variables build up as the code nests. It's very convenient because your code doesn't have to pass values through function parameters to reach an in-line callback function. But it's possible for a variable to not have the expected value by the time the callback function executes. When that happens, it's tempting to start tearing your hair out in frustration, screaming that the variable's value is incorrect. The culprit is often that the code executes asynchronously even though it reads like linear code.
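As a minimal sketch of that scoping model, consider the snippet below. The names lookupUser and greet are hypothetical, made up for illustration, and setImmediate stands in for real asynchronous work such as a database query.

```javascript
// Hypothetical helper: simulates an asynchronous lookup using setImmediate.
var lookupUser = function(id, done) {
    setImmediate(function() {
        done(undefined, { id: id, name: 'user' + id });
    });
};

var greet = function(id, greeting, done) {
    lookupUser(id, function(err, user) {
        if (err) return done(err);
        // Both `greeting` (a parameter of the enclosing function) and
        // `user` (a parameter of this callback) are in scope here,
        // without `greeting` having been passed through lookupUser.
        done(undefined, greeting + ', ' + user.name);
    });
};

greet(7, 'Hello', function(err, message) {
    console.log(message); // logs "Hello, user7"
});
```

The callback never receives greeting as an argument; it simply sees it because it was defined inside greet.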

Is there a way to make variables available to a nested callback without needlessly passing them as parameters through every function along the way?

To begin our journey, compare these implementations of the Fibonacci algorithm (from my book Node Web Development - see the sidebar for links):

var fibonacci = function(n) {
    // Handle n === 0 as fibonacciAsync does; without it, fibonacci(0) never terminates.
    if (n === 0)
        return 0;
    else if (n === 1 || n === 2)
        return 1;
    else
        return fibonacci(n-1) + fibonacci(n-2);
}

var fibonacciAsync = function(n, done) {
    if (n === 0)
        done(undefined, 0);
    else if (n === 1 || n === 2)
        done(undefined, 1);
    else {
        setImmediate(function() {
            fibonacciAsync(n-1, function(err, val1) {
                if (err) done(err);
                else setImmediate(function() {
                    fibonacciAsync(n-2, function(err, val2) {
                        if (err) done(err);
                        else done(undefined, val1+val2);
                    });
                });
            });
        });
    }
}

The first is a dead simple transliteration of the typical recursive Fibonacci function into JavaScript. The problem with this function is that, in Node.js, it is blocking: control never returns to the event loop while it runs, so a Node.js server stops responding to events. The second one isn't any faster. Both take an incredibly long time to calculate the Fibonacci number for values over 20 or 30, but at least fibonacciAsync uses the Node.js event loop to dispatch work, so the server can continue responding to events.
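One way to see the event loop staying available is to run a ticker alongside the computation. The sketch below repeats fibonacciAsync from above so that it runs on its own. The countTicksDuring function is a hypothetical helper, and its self-rescheduling tick function merely stands in for "other events" a server might handle.

```javascript
var fibonacciAsync = function(n, done) {
    if (n === 0)
        done(undefined, 0);
    else if (n === 1 || n === 2)
        done(undefined, 1);
    else {
        setImmediate(function() {
            fibonacciAsync(n-1, function(err, val1) {
                if (err) done(err);
                else setImmediate(function() {
                    fibonacciAsync(n-2, function(err, val2) {
                        if (err) done(err);
                        else done(undefined, val1+val2);
                    });
                });
            });
        });
    }
};

// Hypothetical helper: counts how often an unrelated callback gets to run
// while fibonacciAsync(n) is still working.
var countTicksDuring = function(n, done) {
    var ticks = 0;
    var finished = false;
    var tick = function() {
        if (finished) return;
        ticks++;
        setImmediate(tick); // reschedule until the computation is done
    };
    setImmediate(tick);
    fibonacciAsync(n, function(err, value) {
        finished = true;
        done(err, value, ticks);
    });
};

countTicksDuring(15, function(err, value, ticks) {
    console.log(value, ticks > 0); // logs "610 true"
});
```

Replacing fibonacciAsync with the blocking fibonacci in a similar experiment would leave the ticker with no chance to run until the whole result had been computed.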

The important point shown by fibonacciAsync is that the variables n, val1 and val2, along with done, are all available to the code in the innermost callback function. The order of code execution in fibonacciAsync is actually very complex. Each call to fibonacciAsync(n) invokes fibonacciAsync(n-1) and then, once that result arrives, fibonacciAsync(n-2); each of those calls does the same in turn, building up a large number of call chains.

This works very well, and we quickly get accustomed to building code with nested in-line callbacks that have access to variables in parent scopes. But earlier I suggested it's possible to get really confused.

var processList = function(list, done) {
    for (var i = 0; i < list.length; i++) {
        processStep1(list[i].item1, function(err, result1) {
            processStep2(result1, list[i].item2, function(err, result2) {
                finalStep(list[i], result2, function(err, resultFinal) {
                    done(err, resultFinal);
                });
            });
        });
    }
}

The idea here is that you have an array of items requiring multiple steps of asynchronous processing, with a final step after which the "done" callback takes you back to the caller. How many times have you written this in Node.js? But what's the flaw here?

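The flaw is that processStep2 never receives list[i].item2. Assuming processStep1 really is asynchronous, list[i] is undefined by the time that expression is evaluated, so reading its item2 property throws a TypeError. Why?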

The problem is that the code is using a synchronous control structure to drive asynchronous code execution. The for loop finishes almost immediately, because its task is simply to queue up asynchronous invocations of processStep1. A variable declared with var is scoped to the enclosing function rather than to the loop body, so every callback shares the same 'i'. By the time any of the inner code is executed, the value of 'i' is equal to list.length, and list[list.length] is undefined.
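The effect is easy to reproduce in isolation. In this minimal sketch, collectIndexes is a hypothetical function and setImmediate stands in for any asynchronous step; it records the value of i that each callback actually sees.

```javascript
// Hypothetical demonstration: record the loop index seen by each callback.
var collectIndexes = function(list, done) {
    var seen = [];
    for (var i = 0; i < list.length; i++) {
        setImmediate(function() {
            // Runs only after the for loop has finished,
            // so the shared `i` already equals list.length.
            seen.push(i);
            if (seen.length === list.length) done(seen);
        });
    }
};

collectIndexes(['a', 'b', 'c'], function(seen) {
    console.log(seen); // logs [ 3, 3, 3 ]
});
```

None of the callbacks sees 0, 1 or 2, because there is only one i and the loop has already pushed it past the end of the array.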

It's better to recode this like so:

var processList = function(list, done) {
    list.forEach(function(item) {
        processStep1(item.item1, function(err, result1) {
            processStep2(result1, item.item2, function(err, result2) {
                finalStep(item, result2, function(err, resultFinal) {
                    done(err, resultFinal);
                });
            });
        });
    });
}

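The forEach call is itself still synchronous, but each invocation of its callback gets its own item variable. The nested callbacks therefore see the item they were started with, no matter when they eventually run.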
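To try the forEach version, the three processing steps need some implementation. The stubs below are purely hypothetical stand-ins (the real steps are not shown in this article), and this sketch also adds error checks that the listing above leaves out.

```javascript
// Hypothetical stand-ins for the real asynchronous steps.
var processStep1 = function(value, done) {
    setImmediate(function() { done(undefined, value * 2); });
};
var processStep2 = function(result1, value, done) {
    setImmediate(function() { done(undefined, result1 + value); });
};
var finalStep = function(item, result2, done) {
    setImmediate(function() { done(undefined, item.name + '=' + result2); });
};

var processList = function(list, done) {
    list.forEach(function(item) {
        // `item` belongs to this invocation only, so the nested
        // callbacks below always refer to the right list entry.
        processStep1(item.item1, function(err, result1) {
            if (err) return done(err);
            processStep2(result1, item.item2, function(err, result2) {
                if (err) return done(err);
                finalStep(item, result2, done);
            });
        });
    });
};

var list = [
    { name: 'a', item1: 1, item2: 10 },
    { name: 'b', item1: 2, item2: 20 }
];
processList(list, function(err, resultFinal) {
    console.log(resultFinal); // logs "a=12", then "b=24"
});
```

Note that done is called once per list item, not once for the whole list. If the caller expects a single notification, the results would need to be collected and done called after the last one arrives.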