If Your Computer Is the Cloud, What Should Its Operating System Look Like?
By Asher Sterkin @BlackSwan Technologies
Warming Up
Unlike containers, which were still an incremental change (essentially smaller VMs with less isolation), serverless technology brought a truly disruptive, tectonic change that the software industry has yet to fully absorb. Here, we need to stop talking about faster horses and start talking about cars.
While cloud functions (aka AWS Lambda) constitute a crucial ingredient, they are just one piece of a bigger puzzle. To comprehend the full meaning of the serverless revolution, we have to look at serverless storage, messaging, API, orchestration, access control, and metering services glued together by cloud functions. The cumulative effect of using them all is much stronger than the sum of the individual parts. Using the cars vs. horses analogy, it’s not about the engine, chassis, tires, steering wheel, and infotainment taken separately, but rather about assembling them into one coherent unit.
Approaching such a new technology marvel with the legacy “faster horses” mindset will not do. Disruptive technologies are first and foremost disruptive psychologically. To utilize the full potential of disruption, one has to think anew and act anew.
The problem starts with the fact that all existing mainstream development tools and programming languages were conceived 50 years ago, together with the Internet and Unix, when overcoming the scarcity of computing resources was still the main challenge. Patching new features on top of that shaky foundation made them fatter, but not more powerful. As such, they are all completely inadequate for the new serverless cloud environment. To make real breakthrough progress, a RADICAL RETHINKING of all familiar concepts is required:
What is a Computer?
What is an Operating System?
What is Middleware?
What is an Application?
What is Programming?
What is Testing?
What is Source Code Management?
What is Productivity?
What is Quality?
In this article, we will briefly analyze the first two of these topics one by one.
What is a serverless cloud computer?
Over the last decade, the “Data Center as a Computer” concept has gradually acquired widespread recognition. When talking about cloud computing, by computer we mean a warehouse-sized building filled with tens of thousands of individual boxes, performing various specialized functions and connected by a super-fast local network. If we take the traditional computer model of a CPU, ALU, RAM, and peripherals connected via a bus, we could say that dedicated computers now perform the functions of individual chips, while the super-fast LAN plays the role of the bus.
From the serverless computing perspective, however, direct application of this metaphor has limited value. Not only are individual data centers no longer visible, but the whole concept of Availability Zones also disappears. The minimal unit we can reason about is One Region of One Account of One Vendor, as illustrated below:
Fig. 1: Serverless Cloud Computer
This is the serverless cloud computer we have at our disposal. Whether it is truly a supercomputer or just a very powerful one is sometimes the subject of heated debate. One way or another, the serverless cloud computer provides us, for a fraction of the cost, with capacity so enormous that we could not even have imagined it in our wildest dreams 10 years ago.
Could or should we still apply the chips/peripherals/bus analogy to this serverless cloud computer, and what would be the benefit? Yes, we can, and it is still a useful metaphor. It is useful especially because it helps us realize that all we have now is a kind of machine-code-level programming environment, and that we must raise the level of abstraction in order to put it to productive use at scale.
What is a serverless cloud computer ALU?
ALU stands for Arithmetic Logic Unit, the device which in traditional computers performs all basic arithmetic and Boolean logic calculations. Do we have something similar in our serverless cloud computer? Well, in a sense, we do. As with any analogy, it’s good to know where to stop, but we could treat cloud functions (e.g. AWS Lambda) as a sort of ALU device with certain constraints. As of now, any AWS serverless cloud computer in the Ireland Region has 3,000 such logical units, each with 3GB of local cache (some people would still call it RAM), 500MB of non-volatile memory (aka local disk, split into two halves), a 15-minute hard context switch, and approximately 1.5 hours of warmed cache lifespan. These logical devices run “micro-code” written in a variety of mainstream programming languages: Python, JavaScript, etc. Whether one is going to use this capacity in full or only partially is another question. The serverless cloud computer pricing model is such that you pay only for what you use.
Unlike traditional ALUs, there are multiple ways to activate the micro-code of such serverless cloud logical devices, and to control whether they perform a pure calculation or also have side effects. The ALU is just an analogy, but within limits it appears to be a useful one.
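To make the analogy tangible, the “micro-code” of such an ALU is just an ordinary handler function. Below is a minimal sketch of a Python AWS Lambda handler performing a pure calculation; the event field names are illustrative assumptions rather than any fixed AWS contract, and the handler can be exercised locally without deploying anything:

```python
import json

def handler(event, context):
    """Minimal Lambda 'micro-code': a pure calculation with no side effects.

    Expects an event like {"a": 2, "b": 3}; these field names are
    illustrative assumptions, not part of the Lambda API itself.
    """
    result = event["a"] + event["b"]
    # Lambda serializes whatever the handler returns back to the caller.
    return {"statusCode": 200, "body": json.dumps({"sum": result})}

# Local invocation for experimentation (the context object is unused here).
response = handler({"a": 2, "b": 3}, None)
print(response["body"])  # → {"sum": 5}
```

The same function, deployed behind different triggers, can be activated synchronously, asynchronously, or from an event stream, which is exactly the “multiple ways to activate micro-code” mentioned above.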
What is a serverless cloud computer CPU?
If we have 3K serverless cloud ALUs, do we also have CPUs to control them, and do we really need such devices? The answer is that such devices do exist, but these serverless cloud CPUs, while useful in a very wide range of scenarios, are purely optional. Cloud orchestration services, such as AWS Step Functions, can play this role, with internal Parallel flows playing a role similar to individual cores. An AWS Ireland Region “Cloud CPU” can be occupied for up to 1 year with a maximum of 25K events. How many such serverless cloud CPUs could we have? Well, we can get 1,300 immediately, and then add 300 more every second. As with cloud functions, we pay only for what we use.
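The “cores” analogy can be sketched concretely as an Amazon States Language (ASL) definition with a Parallel state whose branches execute concurrently. The state names and Lambda ARNs below are illustrative placeholders, and the document is built as a plain Python dict so it can be inspected locally:

```python
import json

# A minimal ASL (Amazon States Language) state machine. The Parallel
# state's branches run concurrently, analogous to the cores of our
# metaphorical "cloud CPU". All names/ARNs below are placeholders.
state_machine = {
    "StartAt": "FanOut",
    "States": {
        "FanOut": {
            "Type": "Parallel",
            "Branches": [
                {
                    "StartAt": "TaskA",
                    "States": {
                        "TaskA": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:task-a",
                            "End": True,
                        }
                    },
                },
                {
                    "StartAt": "TaskB",
                    "States": {
                        "TaskB": {
                            "Type": "Task",
                            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:task-b",
                            "End": True,
                        }
                    },
                },
            ],
            "End": True,
        }
    },
}

# This JSON document is what would be handed to Step Functions.
print(json.dumps(state_machine, indent=2))
```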
What is a serverless cloud computer memory?
OK, we have ALUs (with some cache and NVM) and we have optional CPUs to orchestrate them; do we have an analogy for RAM and disk storage? Yes, we do, but we might opt to stop speaking about the artificial separation between volatile and non-volatile memory. Modern CPUs make this separation meaningless anyhow; it’s better just to talk about memory. Serverless cloud computers have different types of memory, each with its own volume/latency ratio and access patterns. For example, AWS S3 provides Key/Value or Heap memory services with virtually unlimited volume and relatively high latency, while DynamoDB provides semantically similar Key/Value and Heap memory services with medium volume and latency. On the other hand, AWS Athena provides high-volume, high-latency tabular (SQL) memory services, while AWS Serverless Aurora provides the same tabular (SQL) memory services with medium volume and latency.
It is interesting to note that some serverless cloud “memory” services, such as DynamoDB, are directly accessible from Step Functions (aka serverless cloud CPUs), while others are accessible only through cloud functions (serverless cloud ALUs). As of now, Step Functions have a 32K limit on internal cache memory, and as such are suitable only for direct programming of control flows rather than voluminous data flows. Whether such a limit is a showstopper or a pragmatic trade-off is a subject for a separate discussion.
A full analysis of available services, which would also include Serverless Cassandra, Cloud Directory and Timestream, is beyond the scope of this memo.
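The claim that these tiers are “semantically similar” can be made concrete: code written against a common Key/Value interface is indifferent to which memory tier backs it. The sketch below uses an in-memory stand-in instead of real boto3 S3/DynamoDB clients so it runs anywhere; the interface and class names are assumptions for illustration:

```python
from abc import ABC, abstractmethod

class KeyValueMemory(ABC):
    """A common Key/Value 'memory' interface. S3 and DynamoDB differ in
    volume and latency, not in these basic put/get semantics."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryStore(KeyValueMemory):
    """Local stand-in used here instead of a real S3 or DynamoDB client,
    so the sketch runs without AWS credentials."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]

# Application code sees only the interface; swapping the backing tier
# changes the volume/latency profile, not the program.
store: KeyValueMemory = InMemoryStore()
store.put("user/42", b'{"name": "Ada"}')
print(store.get("user/42"))
```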
What are serverless cloud computer peripherals?
OK, we have serverless cloud computer ALUs, CPUs, and Memory (yes, yes, all metaphorical, of course); do we have something similar to the peripherals of traditional computers? Yes, we do have something similar to ports, which connect our serverless computer to the external world. As with traditional ports, each one supports different protocols and has different price/performance characteristics. For example, AWS API Gateway supports the REST and WebSockets protocols, AWS AppSync supports GraphQL, and AWS ALB supports plain HTTP(S).
Full analysis of available services, which would include CloudFront CDN, IoT Gateway, Kinesis and AMQP, is beyond the scope of this memo.
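A cloud function sitting behind one of these “ports” must speak the port’s protocol. For the API Gateway REST port with proxy integration, that means accepting the HTTP request as an event dict and returning a response dict with `statusCode`, `headers`, and `body`; the route and payload below are assumptions for illustration, and the event is simulated so the sketch runs locally:

```python
import json

def handler(event, context):
    """A handler wired to an API Gateway REST 'port' via proxy integration.

    API Gateway delivers the HTTP request as `event` and expects back a
    dict with statusCode/headers/body. The greeting payload is illustrative.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"greeting": f"hello, {name}"}),
    }

# Simulated API Gateway event for local experimentation.
event = {"queryStringParameters": {"name": "serverless"}}
print(handler(event, None)["body"])  # → {"greeting": "hello, serverless"}
```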
What is a serverless cloud computer Bus?
So, we have metaphorical ALUs, CPUs, Memory, and Ports for our serverless computer. Do we have something similar to a bus, and do we need one? The answer is yes: we have several types, and we will sometimes need some of them. For example, AWS SQS provides a high-speed, medium-volume Push service, AWS SNS provides a high-speed, medium-volume Pub/Sub notification service, and AWS Kinesis provides a high-speed, high-volume Push service.
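The semantic difference between the Push (SQS-style) and Pub/Sub (SNS-style) bus types can be sketched with local stand-ins; these minimal classes mimic the delivery semantics only, not the real service APIs:

```python
from collections import deque

class PushQueue:
    """SQS-style point-to-point bus: each message is consumed exactly once."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty, like an empty poll.
        return self._messages.popleft() if self._messages else None

class PubSubTopic:
    """SNS-style fan-out bus: every subscriber receives every message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)

queue = PushQueue()
queue.send("order-1")
print(queue.receive())  # → order-1 (a second receive would return None)

received = []
topic = PubSubTopic()
topic.subscribe(received.append)
topic.subscribe(lambda m: received.append(m.upper()))
topic.publish("event-1")
print(received)  # → ['event-1', 'EVENT-1']
```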
What else does the serverless cloud computer have?
Well, unlike traditional computers, quite a few more batteries are included: a data flow unit (aka AWS Glue), a machine learning unit (aka a SageMaker endpoint), access control (aka AWS IAM), telemetry (aka AWS CloudWatch), packaging (aka AWS CloudFormation), user management (aka AWS Cognito), encryption (AWS KMS), a component repository (AWS Serverless Application Repository), and a slew of fully managed AI services such as AWS Comprehend, Rekognition, Textract, etc.
Complete specification of the Serverless Cloud Computer “hardware” is illustrated below:
Fig. 2: Serverless Cloud Computer “Hardware”
Serverless Cloud Operating System
Following the good tradition established by E.W. Dijkstra, we will treat the serverless cloud computer’s metaphorical “hardware” specification outlined above as the bottom layer of a “necklace strung from pearls”: higher-level, domain-specific virtual machines stacked on top of lower-level infrastructure virtual machines, as illustrated below:
Asher Sterkin is SVP Engineering at BlackSwan Technologies and General Manager at BST LABS. He is a software technologist and architect with 40 years of experience in software engineering, and Chief Developer of the Cloud AI Operating System (CAIOS) project. Before joining BlackSwan Technologies, he worked as a Distinguished Engineer at the Office of the CTO, Cisco Engineering, and as VP Technologies at NDS. He specialises in the technological, process, and managerial aspects of state-of-the-art software engineering, and is the author of several patents in software system design. For the last couple of years he has focused on building the next generation of serverless cloud computing systems, services, and infrastructure.