We can extract more value from Microservices if we use languages born in the Cloud era
Today there may be better tools than Java for developing applications on top of modern architectures
Microservices and Containers are very popular today, even in traditional corporate IT shops. Often, though, they are implemented using languages, such as Java, that were born in the early ’90s and designed for a world of monolithic applications. Do you remember the big old Application Servers?
Sometimes this choice is driven by solid reasons: the need to integrate legacy systems via libraries present only in the Java ecosystem, or the convenience of using skills already in house. Often, however, it derives from an insufficiently careful evaluation of the alternatives available today and of the benefits, economic and performance-related, they can bring.
The last decade has seen the rise of new programming platforms, all aimed at supporting as best as possible the “modern networked computing” that is the paradigm at the basis of Microservices. Ignoring such solutions can lead us to produce sub-optimal systems, both in terms of costs and in terms of quality of service. In other words: to throw away money while giving a mediocre service.
Ignoring the programming platforms developed in the last ten years can lead to sub-optimal results and high running costs when adopting Microservices
Moreover, with the advent of Containers, developers can “write in whatever language they want and run everywhere”, making the original Java proposition, “write once, run everywhere”, much less relevant.
This post focuses on two such “relatively recent” technologies: Node and Go.
But why Node and Go? Because they are among the first technologies born in the “networked” Cloud era and have proved they can be used to build complex and scalable services (for instance, Uber uses Go in a core part of its platform, while PayPal and Netflix make heavy use of Node).
Go and Node also have thriving ecosystems (see, for instance, the Stack Overflow 2020 Developer Survey). We can therefore consider them technologies mature enough for the Enterprise as well.
November 2009, almost 15 years after the release of Java
In just two days, two announcements: two technologies that were going to heavily influence the way we develop modern applications today. Clearly, there was a great need for new ways to respond to the challenges that were emerging.
Microservices provide scalability as long as they are used to run many small tasks efficiently and concurrently
By 2009 the exponential growth in the demand for digital services was already a fact. But such exponential growth of load could not be addressed by a similarly exponential increase in infrastructure: it would not have been economically sustainable, and this is even more true today.
In response to this challenge a new architectural style emerged: horizontally scalable systems based on multi-core commodity (e.g. x86) processors, in other words the foundation on which Microservices run.
With Microservices you optimize the concurrent processing of many small tasks, which is not what languages like Java were designed for
Such architectures have proved to scale at sustainable costs as long as they are able to process many small tasks at “the same time”. This means being able to serve many “concurrent” requests, optimizing the response time for clients as well as making the best use of the computing resources (CPU and memory).
Traditional languages like Java, born in the era of monolithic Application Servers and monolithic processes, were not designed to manage many small tasks “at the same time” (multi-threading is certainly not one of Java’s strengths) and can show limitations when used with such architectural models.
Both Node and Go came in to address this mismatch, even if from very different directions.
What do Node and Go share? A natural support for concurrency
Being born at the same time may be seen as a coincidence, especially for Node and Go, which are very different platforms in many respects.
But maybe it is not a coincidence if we look at what they have in common: both have been designed to support the “concurrent” processing of many requests in a natural way. And this is why they can be seen as two different answers to the same challenge posed by modern “networked” computing: the need for strong concurrency support.
Why is concurrency important in distributed architectures?
Let’s look at a typical program, one that performs many I/O (Input/Output) operations: it interacts with databases, REST services and maybe storage.
What does such a program normally do? Most of the time it sits idle, waiting for some I/O operation to be performed somewhere else. It waits because I/O operations are, for reasons of physics, orders of magnitude slower than the operations performed by the CPU. And, while waiting, it wastes CPU cycles. This type of processing is called “I/O bound”, since the amount of work that can actually be performed in a unit of time is limited by the capacity of I/O to respond fast, not by the speed of our cores or by memory limits.
But if we waste CPU cycles we are not using our infrastructure efficiently. We increase costs, we increase response times, we do not serve our clients well. It is like renting a house with ten rooms but eventually ending up using only one: we throw away money and do not benefit from the quality we are buying.
With transactional programs, not being able to exploit concurrency well is like renting a house with ten rooms to live in just one.
This is what happens unless we do something, unless we make sure that more than one request can be managed by our CPU “at the same time”. And this is exactly what concurrency is all about.
This is not a new problem. Traditionally, in the Java world, this was the task of the Application Servers. But Application Servers are not a good fit for horizontally scalable architectures, and this is where the likes of Node and Go can come to the rescue.
Concurrency and parallelism
Concurrency and parallelism are similar but distinct concepts. Here I use concurrency, since this is what really matters in this context. For a detailed explanation of the differences between the two you can look at this thorough explanation by Rob Pike.
Node and concurrency
Node is single threaded, so everything runs in one single thread (well… almost everything, but that’s not relevant in our context). So how can it support concurrency? The secret is that Node is also non-blocking on I/O operations.
In Node, when you run an I/O operation, the program does not stop waiting for the I/O response. Rather, it provides the system with a function, the so-called “callback”, which represents what has to be done when the I/O returns, and then immediately moves on to the next operation. When the I/O completes, the callback is executed, resuming the logical processing of your program.
So, in a request/response scenario, we trigger an I/O operation and free up the Node thread, so that another request can immediately be taken care of by the same Node instance.
Consider, for example, a first request Req1 that runs some initial logic and then starts an I/O operation (I/O Operation 1.1), specifying the function that will have to be called when the I/O completes (the function cb11). At that point the processing of Req1 halts and Node can start processing another request, e.g. Req2. When I/O Operation 1.1 completes, Node is ready to resume the processing of Req1 and will invoke cb11. cb11 will itself start another I/O operation (I/O Operation 1.2), passing cb12 as the callback function to be invoked when the second I/O operation completes. And so on, until the processing of Req1 ends and the response Resp1 is sent back to the client.
In this way, with a single thread, Node can serve many requests at the same time, i.e. concurrently. The non-blocking model is the key to concurrency in Node.
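To make the callback model concrete, here is a minimal sketch, where `setTimeout` stands in for a slow I/O call such as a database query (the names `fakeQuery` and `order` are purely illustrative):

```javascript
const order = [];

// Simulate a slow I/O call (e.g. a DB query) with setTimeout.
// The callback runs only when the fake "I/O" completes; meanwhile
// the single Node thread is free to serve other work.
function fakeQuery(sql, callback) {
  setTimeout(() => callback(null, `rows for: ${sql}`), 50);
}

order.push("request received");
fakeQuery("SELECT * FROM users", (err, rows) => {
  if (err) throw err;
  order.push("I/O done"); // runs later, when the fake I/O completes
});
order.push("thread free for next request"); // runs immediately, before the callback
```

Running this shows "thread free for next request" recorded before "I/O done": the thread never waits on the I/O call.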
Being single threaded, though, means that we cannot use more than one core (for multi-core scenarios it is possible to use Node clusters, but going down this path inevitably adds some complexity to the overall solution).
Another aspect to note is that the non-blocking model implies an asynchronous style of programming, which at the beginning may be hard to reason about and which, unless properly managed, can lead to complicated code: the so-called “callback hell”.
Go and concurrency
The Go approach to concurrency is based on goroutines: lightweight threads, managed by the Go runtime, that communicate with each other via channels. Programs can launch many goroutines and the Go runtime will take care of scheduling them on the CPU cores available to Go, according to its optimised algorithm. Goroutines are not Operating System threads: they require far fewer resources and can be spawned very fast and in very high numbers (there are several reports of Go running hundreds of thousands, even millions, of goroutines concurrently).
Go is also non-blocking, but this is handled behind the scenes by the runtime. For instance, if a goroutine fires a network I/O operation, its state is changed from “executing” to “waiting” and the Go runtime scheduler picks another goroutine for execution.
So, from a concurrency perspective, this is similar to what Node does, but with two main differences:
- it does not require any callback mechanism, and the code flows pretty much like normal sequential synchronous logic, which is usually easier to reason about
- it is multithreaded and can seamlessly leverage all CPU cores made available to the Go runtime
Suppose the Go runtime has two cores available (called Processors). All Processors are used to serve incoming requests, and each incoming request is served by a goroutine. For instance, Req1 is served by goroutine gr1 on Processor1. When gr1 issues an I/O operation, the Go runtime scheduler moves gr1 to the “waiting” state and starts processing another goroutine. When the I/O operation completes, gr1 is put in the “runnable” state and the Go scheduler will resume its execution as soon as possible. So, if we look at a single Processor, we have a picture similar to that of Node. The switching of goroutine states (from “running” to “waiting” to “runnable” to “running” again) is performed by the Go runtime under the hood, and the code is a simple flow of statements to be executed sequentially, which is different from the callback-based mechanism imposed by Node.
On top of all this, Go provides a simple and powerful mechanism of communication among goroutines, based on channels and mutexes, which allows smooth synchronisation and orchestration of different goroutines.
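A minimal sketch of channel-based communication (the `squares` generator is illustrative): one goroutine produces values, the main goroutine consumes them, and the channel itself provides the synchronisation.

```go
package main

import "fmt"

// squares computes squares in a separate goroutine and sends the results
// over a channel; the receiver simply ranges over the channel until it
// is closed. No explicit locks are needed.
func squares(nums []int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out) // closing tells the receiver there is no more data
		for _, n := range nums {
			out <- n * n // blocks until the receiver is ready: built-in synchronisation
		}
	}()
	return out
}

func main() {
	total := 0
	for sq := range squares([]int{1, 2, 3, 4}) {
		total += sq
	}
	fmt.Println(total) // 1 + 4 + 9 + 16 = 30
}
```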
There is more than concurrency though
Now that we have seen how concurrency is naturally supported by Node and Go, we can look at other reasons why it is worth starting to consider them as effective tools to be added to our toolbox.
Node and the Holy Grail of one language for Front End and Back End
JavaScript is the undisputed language of the Front End. But what if you also need to build the Back End? With Node you can leverage the same language, the same constructs and the same ideas (asynchronous programming) to build the Back End too. Even in the serverless space Node plays a central role, having been the first platform supported by all major Cloud providers in their FaaS offerings (Function as a Service, i.e. AWS Lambda, Google Cloud Functions and Azure Functions).
The ease of switching between Front End and Back End may be one of the reasons for the incredible success of Node, which has led to a vast ecosystem of packages (there is a Node package for practically everything) and a super vibrant community.
At the same time, not all types of Back End processing are efficiently supported by Node. For instance, CPU-intensive logic is not for Node, given its single threaded nature. You should therefore not fall into the trap of “one language fits all”.
Node, with the enormous number of packages available in its ecosystem, can also be seen as “the Far West of programming”: a place where the quality and safety of what you import has to be constantly checked (read this fictional story for a feeling of the risks). But this is probably true everywhere: the broader the ecosystem, the higher the attention level has to be.
Simplicity and Performance rediscovered with Go
Go is simple. There are 25 reserved words, and the language specs are 84 pages including extensive examples (if printed as PDF from the official site). Just for comparison, the 2000 edition of the Java specs was over 350 pages.
Simplicity makes it easy to learn. Simplicity helps in writing code that is easy to understand and maintain. Often in Go there is just one way to do what needs to be done: this is not meant to frustrate creativity, but rather to simplify the life of whoever has to read and understand the code.
Simplicity can also feel limiting. Concepts like generics or exceptions are simply not in the language, since the authors did not consider them necessary (even if generics seem to be appearing on the horizon). On the other hand, some really useful facilities, such as Garbage Collection, are part of the core design.
Go is also about runtime performance and the efficient use of resources. It is a strongly typed, compiled language that can be used to build programs which run fast and efficiently, specifically in scenarios where we can leverage the power of multi-core concurrency.
Go also produces small self-contained binaries. A Docker image containing a Go executable can be significantly smaller than one containing an equivalent Java program, because Java requires a JVM to run while Go executables are standalone (according to benchmarks, the size of an optimised Docker image of a “Hello World” application is 85 MB for Java 8 and 7.6 MB for Go: an order of magnitude of difference). The size of images matters: it speeds up build and pull times, reduces network bandwidth requirements and improves control over security.
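Such small images are typically obtained with a multi-stage build. A minimal Dockerfile sketch (image tags and paths are illustrative): the final image starts from `scratch` and contains only the compiled binary.

```dockerfile
# Build stage: compile a static binary (CGO disabled, no libc dependency).
FROM golang:1.15 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Final stage: start from an empty image and ship only the binary.
FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

Nothing comparable is possible with a plain Java program, which needs at least a JRE layer in the final image.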
The other “new kids” that are in town
It is not only Node and Go. Recent years have seen the arrival of other technologies that promise benefits over the traditional monolithic platforms.
Rust. An open source language backed by Mozilla, which presented it in 2010 (so it is one year younger than Node and Go). The main goal of Rust is to achieve C/C++-like performance with a much safer programming model, i.e. with a lower probability of stumbling into obnoxious runtime bugs. The new way of thinking Rust introduces, specifically around owning/borrowing memory, is often considered challenging in terms of learning curve. If performance is supercritical, though, then Rust is definitely an option to consider.
Kotlin. A language that runs on the JVM, developed by JetBrains and released in 2016 (in 2017 Google announced its support for Kotlin as an official language for Android). It is more concise than Java and embeds in its original design concepts like functional programming and coroutines, making it part of the modern languages league. It can be seen as a natural evolution of Java, with a low entry barrier for developers coming from that world.
GraalVM. A promising new approach to Java and other JVM-based languages. GraalVM Native Image allows Java code to be compiled to native executable binaries. This can produce smaller images and improve performance both at startup and during execution. The technology is still pretty young (released in 2019) and, at the moment, shows some limitations. Given Java’s popularity, it is likely to see significant improvements as it evolves to maturity.
The software world evolves fast. New solutions are always popping up. Some never get to the forefront, some enjoy a period of hype and then drop out of the mainstream. Choosing the right one is not easy and is always an exercise in balancing new capabilities against battle-tested stability and available expertise.
Node and Go have definitely proved to be viable technologies at enterprise level. They have the potential to bring significant benefits compared to traditional OO languages, specifically by improving the efficiency of containerised distributed applications. They are backed by great communities and have wide ecosystems.
While enterprises must continue to use and support traditional platforms such as Java, given the sheer size of their production code bases, it is highly advisable that they also start embracing relatively new tools, such as Node and Go, if they want to reap all the benefits of modern distributed computing.