Following the recent controversy in the Node.js community, the ousting of a prominent Node.js core contributor, and Isaac Schlueter stepping down from leading the project to pursue his own startup, I was wondering how long it would take for a new fork of Node.js to appear.
As it turned out, not too long. In fact, while all of this was happening, a few guys were working on a fork of Node.js called NodeJX. Although the name is cringeworthy, with its added “X” as if that instantly makes something more exciting, I was still interested in what these guys had to offer. They even managed to get one of their blog posts into Node Weekly, comparing NodeJX performance to Node.js and Vert.x. One of the core features NodeJX advertises is multithreading. The first thing that came to my mind when reading this was the existing Node module “cluster”, which basically uses all the cores in a machine to serve requests. In fact, they had compared their multithreaded performance to Node.js running on cluster, and NodeJX had won.
But I am still not convinced we need threads. Hear me out before you call me a heretic.
A distributed system
One of the remnants of the past decade of backend engineering is the idea of vertical scaling: powerful server machines plus the occasional load balancer. Basically, the more powerful the machine, the less your IO would block and the more requests you could serve. This in turn led to technologies that focused heavily on using all the cores of these powerful computers. Multithreading became an essential part of server-side programming.
With the emergence of cloud computing, developers slowly realised that instead of having a couple of very powerful computers serving requests, you could have many smaller units of computing doing the same thing. You would then copy the program onto these smaller machines, and some gateway technology would spread the workload amongst them. This is what we call horizontal scaling. A prime example of this is Heroku and its “dynos”.
I’m all for horizontal scaling, which offers a lot more reliability because the system is distributed. And when you need to scale, rather than the complicated job of setting up another computing beast or adding resources to the current ones, you simply create more of these smaller units of computation, which is mostly done with the click of a button.
This sort of scaling reached a new level with the emergence of Node.js. The non-blocking IO of Node.js meant that your application would not become unresponsive while it was waiting on IO for another request. Something that previously required process forking and the dangerous world of multithreading could now be easily achieved with the event-based IO model of Node.js. But there is a catch. While IO in Node.js is event based and non-blocking, your app logic will still block, because it runs in the same single thread as the IO callbacks. This means that if you decide to do some complicated server-side operation while serving these requests, you will block the main process. So we should go back to multithreading and multiple cores, right? Wrong.
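To make the catch concrete, here is a minimal sketch of a CPU-bound loop holding up the event loop. The function names (`blockFor`, `measureTimerDelay`) are made up for illustration; the busy loop stands in for whatever heavy app logic you might run while serving a request.

```javascript
// Spin the CPU for roughly `ms` milliseconds.
// Stand-in for heavy, synchronous application logic.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy wait: nothing else can run on this thread meanwhile
  }
}

// Schedule a 0 ms timer, then block the thread.
// Resolves with how long the timer actually had to wait before firing.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    blockFor(blockMs); // the timer callback cannot run until this returns
  });
}

measureTimerDelay(200).then((delay) => {
  console.log('a 0 ms timer fired after about ' + delay + ' ms');
});
```

The timer was due immediately, yet it only fires once the loop gives the thread back. Any pending request callback would be stuck in the same way.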
What I’ve learned through the Node projects I’ve worked on, some with very complicated logic, is to separate the computational bit of your application from the IO bit. This means having two separate applications: the IO handler stays entirely free of heavy computation, and the heavy bits move to worker applications. These applications can communicate via some sort of messaging system, of which there are many around. This also frees you from sticking to Node.js for every part of your app logic. If your worker application would be faster written in C, you can easily write it in C while it communicates with your Node.js IO handler through a messaging system like RabbitMQ. It also lets you scale these parts of your application independently. You can add as many workers as you need while keeping the IO handler unchanged.
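A rough sketch of that separation might look like the following. It makes several assumptions: the worker is a second Node process spawned from an inline script so that everything fits in one file, Node’s built-in IPC channel stands in for a real broker like RabbitMQ, and the names (`WORKER_SOURCE`, `startWorker`, `submit`, `stop`) are hypothetical.

```javascript
const { spawn } = require('child_process');

// Hypothetical worker program, inlined as a string to keep this to one file.
// In a real setup it would be its own application, possibly not Node at all.
const WORKER_SOURCE = `
  process.on('message', (job) => {
    let sum = 0;
    for (let i = 1; i <= job.n; i++) sum += i; // stand-in for heavy computation
    process.send({ id: job.id, result: sum });
  });
`;

// Start one worker process and return a tiny client for it.
// The IPC channel here plays the role of the messaging system.
function startWorker() {
  const child = spawn(process.execPath, ['-e', WORKER_SOURCE], {
    stdio: ['ignore', 'inherit', 'inherit', 'ipc'],
  });
  const pending = new Map(); // job id -> resolve callback
  let nextId = 0;

  child.on('message', (msg) => {
    const resolve = pending.get(msg.id);
    if (resolve) {
      pending.delete(msg.id);
      resolve(msg.result);
    }
  });

  return {
    // Send a job and get a promise for its result; the IO process never computes.
    submit(n) {
      return new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        child.send({ id, n });
      });
    },
    stop() {
      child.kill();
    },
  };
}

const worker = startWorker();
worker.submit(1e6).then((result) => {
  console.log('sum computed by the worker:', result);
  worker.stop();
});
```

The IO side only sends a message and waits for the reply, so its event loop stays free for other requests. With a real broker, the same shape lets you start more workers without touching the IO handler.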
This doesn’t mean that there will never be a need to make use of multiple CPU cores in Node.js. Web development is not the only use for Node. With the introduction of boards like Tessel, it might have a big say in embedded computing too. Who knows what people will come up with. What I’m arguing here is that we probably don’t need a forked Node to do this for us.
Fragmentation of a great platform
Surely having NodeJX around is not harmful, you’re saying. Well, yes and no. Yes, it is harmless in that I’m sure it has some great ideas which the Node.js community can learn from. No, because it might cause fragmentation in such a great platform.
Just have a look at the amount of work that goes into NPM modules every day. An enormous amount of engineering and innovation happens in these modules. On top of this, there is a great trend of tested, proven modules on NPM: modules you can just plug in and be certain they will work as promised. This is a great feature of the Node platform and we should do all we can to preserve it.
All in all, while I am disappointed with the recent news about Node, I still think this is the greatest and most exciting platform to be involved in. I hope the current issues in the community will be ironed out, as they are the only thing that can hamper the amazing growth in demand for Node.