Java, Twitter, and asynchronous event driven architecture
Twitter famously launched on the then-popular Ruby on Rails web framework. It then ran into scalability problems, which it made light of with the Fail Whale. Word has been that they started using Scala a while back, and it turns out they have been intensively studying ways to scale the service to handle the traffic volume they face. A recent article on InfoQ went over some of the things they did, and surprisingly they did not use any Node.js software.
Their choice may seem surprising today, when Node.js is getting so much excited attention, but we should recall that they began making these decisions before Node.js was available, and even today Node still has a pre-1.0 version number. In any case, let's ponder what the InfoQ article says.
They changed the search engine storage from MySQL to Lucene, and replaced a Ruby on Rails search UI "with a Java server they called Blender." (Blender is "a Thrift and HTTP service built on Netty, a highly-scalable New I/O (NIO) client server library written in Java that enables the development of a variety of protocol servers")
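To make the "New I/O (NIO)" part concrete, here is a minimal sketch of the kind of selector-driven loop that libraries like Netty build on. It uses only the JDK's java.nio classes, not Netty's API, and it is not Blender's code; the class name NioEchoSketch and the echo behaviour are assumptions made up for illustration. One thread waits on a Selector and services whichever connections are ready, instead of dedicating a thread to each client.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration: a single-threaded, non-blocking echo server.
class NioEchoSketch {

    /** Starts the event loop on a daemon thread and returns the bound port. */
    static int start() throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0 = any free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        Thread loop = new Thread(() -> {
            ByteBuffer buf = ByteBuffer.allocate(1024);
            try {
                while (true) {
                    selector.select(); // block until at least one channel is ready
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isAcceptable()) {
                            // New connection: register it for read events.
                            SocketChannel client = server.accept();
                            if (client != null) {
                                client.configureBlocking(false);
                                client.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            SocketChannel client = (SocketChannel) key.channel();
                            buf.clear();
                            int n;
                            try {
                                n = client.read(buf);
                            } catch (IOException e) {
                                n = -1; // treat a broken connection as closed
                            }
                            if (n < 0) {
                                key.cancel();
                                client.close();
                            } else {
                                buf.flip();
                                // Simplification: a real server would register
                                // OP_WRITE instead of looping on a full socket.
                                while (buf.hasRemaining()) {
                                    client.write(buf);
                                }
                            }
                        }
                    }
                }
            } catch (IOException e) {
                // Sketch only: a real server would log and recover here.
            }
        });
        loop.setDaemon(true);
        loop.start();
        return port;
    }

    public static void main(String[] args) throws IOException {
        int port = start();
        try (Socket s = new Socket("127.0.0.1", port)) {
            s.getOutputStream().write("ping".getBytes(StandardCharsets.US_ASCII));
            byte[] reply = s.getInputStream().readNBytes(4);
            System.out.println(new String(reply, StandardCharsets.US_ASCII));
        }
    }
}
```

This is the same readiness-based, single-loop model that Node.js exposes through JavaScript callbacks, which is part of why the JVM was a workable choice for this kind of server.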
They wrote Gizzard, an open source framework "for creating distributed datastores," which they use to partition MySQL. They're using "HDFS in Hadoop extensively for off-line computation" and so on.
They developed Finagle as "a library for building asynchronous RPC servers and clients in Java, Scala, or any JVM language. It is written in Scala, but also supports a highly Java-idiomatic API."
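For a feel of what "asynchronous RPC" means in code, here is a rough sketch using the JDK's CompletableFuture. It deliberately does not use Finagle (which has its own Future type and API); the class AsyncRpcSketch and the methods lookupUser, countTweets and summary are hypothetical stand-ins, with the "remote" calls simulated by tasks on a thread pool. The point is only that the caller composes the results of calls without blocking on each one.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of composing asynchronous calls on the JVM.
class AsyncRpcSketch {

    // Stand-in for a remote backend: work runs on daemon pool threads.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    /** Simulated remote call: resolves a user id to a user name. */
    static CompletableFuture<String> lookupUser(int userId) {
        return CompletableFuture.supplyAsync(() -> "user-" + userId, POOL);
    }

    /** Simulated remote call: a made-up tweet count derived from the name. */
    static CompletableFuture<Integer> countTweets(String userName) {
        return CompletableFuture.supplyAsync(() -> userName.length(), POOL);
    }

    /** Chains the two calls; nothing blocks until the caller asks for the value. */
    static CompletableFuture<String> summary(int userId) {
        return lookupUser(userId).thenCompose(name ->
                countTweets(name).thenApply(n -> name + " has " + n + " tweets"));
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the program.
        System.out.println(summary(42).join()); // prints "user-42 has 7 tweets"
    }
}
```

The chained style plays the same role as nested callbacks in Node.js: each step says what to do when the previous result arrives.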
They're happy with their system performance. They are "one of the largest websites in the world, but run on a very small hardware footprint compared to other big dynamic sites" and "Keeping the hardware footprint small has advantages in terms of cost, but also avoids some of the secondary scaleability concerns, such as the performance of the TCP stack, that can impact sites with larger hardware demands." So performance wasn't their prime motivating factor. Something else was:
"The primary driver is honestly encapsulation, so we can iterate faster as a company. Having a single, monolithic application codebase is not amenable to quick movement on a per-team basis. So when we decide to encapsulate something, then because of our performance concerns, its better to rewrite it in the JVM for most systems, than to write a new Ruby system."
Okay, that was a lot of cool information about their decisions. But let's weigh it against Node.js as a potential tool for the issues they describe.
Languages: They're clearly a multi-language shop, and Node.js is a single language solution.
Performance: Performance is more complex than just asynchronous coding.
Asynchronous: The issue of event driven asynchronous architecture is only a small part of the overall system.
In other words, there are a lot more issues at play than asynchronous architecture, which was the focus of the design of Node.js.
Node.js is an exciting system. But is it the be-all and end-all of web application development?
You might be interested in some earlier articles about Node.js: