By Axel Kloth, President & CEO at Scalable Systems Research Labs Inc.
The Cloud is one of the most used and least well-understood concepts in modern IT. Secrecy and a lack of clear definitions surround “The Cloud”, and there is plenty of controversy about what it is and what it is not.
Let’s start with what it is not.
It is not a magic space in which applications run that could not run in any other data center. It is not a magic space that allows performance to scale linearly with the number of processor cores deployed for all workloads. It is not a magic space that keeps you from losing data, and without your help, it does not protect your data either. It is not cheaper than your own data center. It usually does not cut down on administrative effort, and therefore cost, either.
So what is “The Cloud”?
In essence, “The Cloud” is a collection of one or more of someone else’s data centers. All servers in those data centers are interconnected and – usually through a Virtual Private Network – connected to the user of the Cloud services. “The Cloud” is, unfortunately, a diffuse and nebulous description of interconnected data centers. These data centers usually consist of industry-standard x86-64-based servers, with or without special coprocessors or accelerators, with or without very large physical memory, and with or without mass storage. Usually, “The Cloud” runs Linux in one or more flavors, and it may or may not be based on virtualized environments. What is common, though, is that all services are web-based, i.e. they use interfaces that ultimately make use of technologies invented for the WWW and for HTML. That means it is irrelevant where the services are executed and processed, in the same fashion that any web server can serve any web site: all that differs is the content, not the protocol, the parameters or the syntax. In general, the data centers that make up “The Cloud” run a stack commonly referred to as “LAMP”: Linux, Apache, MySQL and PHP/Perl. These days, virtualization is nearly always included, and so are many other technologies such as Hadoop (MapR) and Spark. For math, OpenCL is made available, and for AI, machine learning and deep learning, TensorFlow is often provided.
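To make the “web-based” point concrete, here is a minimal sketch in Python using only the standard library. The endpoint URL, payload fields and token are hypothetical placeholders, not any particular provider’s API – the point is simply that every such service is reached over plain HTTP(S).

```python
# Minimal sketch: a cloud service is ultimately just an HTTP(S) endpoint.
# The URL, payload fields and token below are hypothetical placeholders,
# not any particular provider's API.
import json
import urllib.request

payload = json.dumps({"task": "transcode", "input": "movie.mp4"}).encode("utf-8")

request = urllib.request.Request(
    "https://api.example-cloud.test/v1/jobs",   # hypothetical endpoint
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <token>",       # placeholder credential
    },
    method="POST",
)

# Whether this endpoint lives in an on-premise rack or in a provider's data
# center makes no difference to the client: the protocol, parameters and
# syntax are identical; only the content behind the URL changes.
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read().decode("utf-8")))
```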
The advantage of “The Cloud” is very clearly a financial one, not a technical one. Prior to “The Cloud”, everyone’s data center had to be set up to deal with peak computational, storage or I/O demands. That was a very expensive value proposition. Since the advent of “The Cloud”, in-house data centers can (and should) be sized to cope with average computational, storage or I/O demands – any demand exceeding that can now be bought cheaply in “The Cloud”. Hence the advent of the “Hybrid Cloud”: web-services-capable software runs in your internal cloud as well as in external capacity purchased on demand, and workloads can be shifted from internal resources (the private cloud) to external resources (the public cloud) and vice versa. The use of hypervisors and virtualization makes it simple to push identical software configurations across servers in the hybrid cloud, and by doing so to adjust workloads dynamically.
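The “burst to the public cloud” decision can be sketched in a few lines. The capacity figure and the two dispatch functions below are hypothetical stand-ins; in a real deployment this decision would live inside an orchestrator or the hypervisor management layer rather than in application code.

```python
# Sketch of the hybrid-cloud idea: size the private cloud for *average* demand
# and overflow the rest to the public cloud. The capacity figure and the two
# dispatch targets are hypothetical placeholders.
PRIVATE_CAPACITY_JOBS = 100   # what the in-house data center is sized for


def run_on_private_cloud(job: str) -> str:
    return f"private-cloud:{job}"      # stand-in for submitting to in-house servers


def run_on_public_cloud(job: str) -> str:
    return f"public-cloud:{job}"       # stand-in for renting capacity on demand


def dispatch(job: str, jobs_in_flight: int) -> str:
    """Keep average load in-house; anything above it bursts to the public cloud."""
    if jobs_in_flight < PRIVATE_CAPACITY_JOBS:
        return run_on_private_cloud(job)
    return run_on_public_cloud(job)


# Example: the first 100 concurrent jobs stay in-house, the rest are bought in.
print(dispatch("nightly-report", jobs_in_flight=42))    # -> private-cloud:nightly-report
print(dispatch("nightly-report", jobs_in_flight=250))   # -> public-cloud:nightly-report
```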
That is the true advantage and benefit of “The Cloud”: all services are web-based, with standardized software and API interfaces. As a result, workloads can be shifted, migrated and executed on any available resource, be it the internal on-premise Cloud or the external Cloud supplied by a Cloud services provider. More importantly, all applications are now written to use web services interfaces – the lingua franca of the Internet age. That means applications written today for cloud deployment are more robust, more portable and more scalable than applications from earlier times.
Other than the OS, the LAMP stack, the virtualization software, the hypervisor and agreed-upon APIs, there is very little software pre-installed on cloud-based servers. Hence, there are few administrative savings when migrating to the cloud, as all of the customer’s software must be installed and configured for cloud-enabled applications the same way it would be installed or built in in-house deployments.
As another side effect of better resource utilization, some of the more progressive Cloud providers have started to add special-purpose coprocessors to their Cloud offerings. These could be coprocessors for floating-point-heavy workloads, GPGPUs or other accelerators for AI and Machine Learning, processors tuned for Big Data applications, or FPGAs for more general-purpose acceleration. Other modifications can include DRAM configurations larger than the average server’s memory, or Flash on DIMMs for even higher main-memory densities. Since the cost of these more specialized servers is amortized across many users, they have become affordable even to an average user. In essence, cloud technologies have become commonplace, and they have helped drive down cost for any kind of computational workload users encounter – whether the cloud is a private one, a public one, or a hybrid cloud. The biggest influence “The Cloud” has had is the paradigm change in how any kind of compute is accessed these days: through web services interfaces.
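As a closing illustration of the accelerator point, here is a small sketch using TensorFlow, mentioned above as a package many providers offer. It assumes TensorFlow is installed on the rented instance; the code merely asks what hardware happens to be present and runs the same computation either way.

```python
# Sketch: the same code runs on a plain instance or a GPU-equipped one;
# it only asks what accelerators the rented server happens to offer.
# Assumes TensorFlow is installed, as many cloud providers offer it.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    print(f"{len(gpus)} GPU(s) available; the matrix multiply below will use one.")
else:
    print("No GPU on this instance; the same code falls back to the CPU.")

# The workload itself is written once; placement follows the hardware found.
a = tf.random.uniform((2048, 2048))
b = tf.random.uniform((2048, 2048))
c = tf.matmul(a, b)
print("Result shape:", c.shape)
```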