Benefits of High Availability and Scalability in the Cloud - AZ-900 Certification Course

John Savill's Technical Training

15m 24s · 2,264 words · ~12 min read
YouTube auto captions
Transcript source

This transcript was extracted from YouTube's auto-generated caption track.

[0:00]In this lesson, we're going to look at describing the benefits of high availability and scalability in the cloud. Now, when we think about cloud computing, what exactly is it? There's a joke, there's no such thing as the cloud, it's just someone else's PC, and there's a lot of truth there. We often think about capacity. Capacity is something that enables services to run, they have some management platform, some user experience, we can consume the service. Now, on premises, if I think about my local data center, in my on-premises world, my capacity is provided by, well, I have various types of server.

[0:54]I'll have network switches providing network throughput, I'll have storage area networks, other types of network attached storage, but these are all providing capacity for the workloads I want to run on premises. And then on top of that capacity, well, maybe I'm running virtual machines, maybe I'm using containers, and so I'm creating pods that run a certain container image. But I can think of it as: on that capacity, there are different types of service that I can then install my business applications on and actually use. On premises, I'm bound by the physical hardware I have, and obviously, I have to buy it in advance, and I'm always paying for it.

If we now think about, well, how does that work in the cloud? In the cloud, that capacity is huge in scale, and it's housed in data centers all around the world. So if we now pivot to thinking about the cloud, I'm going to draw this massive idea, and this is Azure. All around the world, there are these groups of data centers, and we group them based on a certain amount of proximity to each other. So I could say, well, okay, this is a region, you'll hear that term a lot, that might be region one. This is another region, so that's region two.

And so on top of all of this capacity, I can then run the various different services. And in the cloud, there are more types of service available to you. Yes, there's obviously things like, hey, I can go and spin up a virtual machine, I can use container environments like Kubernetes, but I might also have database offerings that are managed for me. There might be artificial intelligence and machine learning services that I can go and consume, and many, many others. So we get many different types of services, hundreds of them, available that I can go and consume.
And I have a lot of flexibility in where I can go and consume that. If we jump over, and we look at Microsoft's map, all of these blue dots are existing regions. And again, if I click on one of those regions, it would have many data centers. And we can see information about, well, when did it open, different compliance offerings that are available, plus many more. And so I, as a customer, have great flexibility in terms of, well, I might want my services in lots of places for resiliency from any kind of failure. But also if my customers are distributed all around the world, well, it would be great to offer my services close to the customers so they get a fantastic experience. There's not a long delay in having to go a really long distance over the network.

And so what having all of these different types of services available in so many different places gives me is this fantastic agility. Because I'm only paying for the service I consume, I don't get locked into: I've bought this server, I have to use this server. I can change where I want to host my services, I could host them in more places, I could host them in fewer. I can switch sizes of VM, number of VMs, I could switch from VMs to containers to app services, I can switch database sizes. There's no penalty for me, because I just pay for what I'm consuming, typically on a per second basis.

It's also fantastic because it means if I do make a mistake, I've done a sizing exercise, I've picked a certain service, and then I work out that's not the right thing, I can very easily change. Now, if I switch from VMs to containers or app service, there may be some application development effort. But that's completely different from being locked into a particular set of infrastructure I may have purchased.

Another key concept when we architect any solution, and that includes the cloud, is high availability.
High availability is about ensuring that if certain types of disruptive events occur, our service continues to function. Now, every single Azure resource has its own specific service level agreement or SLA. This is a financially backed guarantee around what you can expect in terms of its availability to be communicated with, to be used. And if it breaks that SLA, you get a certain financial credit back. We cover SLAs later in this course. It's important to understand the SLA your solution needs so you can then architect it accordingly.

Now, very often to meet my high availability, it means I need multiple instances of my resource over different blast radiuses. And a blast radius you can think of as, well, a server can fail, so I want to make sure my instances are on different servers. An entire server rack could have a power supply or a network switch failure, so I want to make sure I'm in different racks. But data centers can have failures, maybe cooling, maybe their power, so I might want to distribute over multiple data centers. And so we think about, what is the blast radius I want high availability for, and then by being able to distribute my instances, I'm able to survive larger potential blast radius type problems. So, different servers, different racks, likely different data centers.

So in this case, from a high availability perspective, if I think of this as a specific region, we have this concept of availability zones, which again, we'll cover later, but they're isolated sets of data centers with independent power, cooling, and networking. So for my high availability, at minimum, I'd want to make sure I've got at least two instances of my service in completely different sets of data centers, which means I'm also getting resiliency from rack level or node level failures. Because, hey, I've got that nice distance between them.

Now, additionally, you do have to think about disaster recovery.
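Before getting to disaster recovery, the blast-radius reasoning above can be made concrete with a small sketch. This is only an illustration, not an Azure API: the `Placement` type, the `survivable_failures` function, and the zone, rack, and server labels are all hypothetical names invented for this example. It checks, for each level (server, rack, zone), whether a set of instances would still have something running if any single unit at that level failed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where one instance runs. All labels are hypothetical, for illustration."""
    zone: str    # an isolated set of data centers
    rack: str    # rack label, unique only within its zone
    server: str  # server label, unique only within its rack


def survivable_failures(placements: list[Placement]) -> dict[str, bool]:
    """For each blast radius, report whether losing any ONE unit at that
    level still leaves at least one instance running.

    That holds exactly when the instances span two or more distinct units
    at that level.
    """
    # Build fully qualified identities so that "rack-a" in zone 1 and
    # "rack-a" in zone 2 count as different racks.
    servers = {(p.zone, p.rack, p.server) for p in placements}
    racks = {(p.zone, p.rack) for p in placements}
    zones = {p.zone for p in placements}
    return {
        "server": len(servers) >= 2,
        "rack": len(racks) >= 2,
        "zone": len(zones) >= 2,
    }


if __name__ == "__main__":
    same_zone = [
        Placement(zone="1", rack="rack-a", server="s1"),
        Placement(zone="1", rack="rack-b", server="s1"),
    ]
    two_zones = [
        Placement(zone="1", rack="rack-a", server="s1"),
        Placement(zone="2", rack="rack-a", server="s1"),
    ]
    print(survivable_failures(same_zone))
    print(survivable_failures(two_zones))
```

Spreading two instances over two zones automatically covers the rack and server levels as well, which is the point made above about getting rack level and node level resiliency from the larger separation.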
So typically with high availability, I think within a certain location: because the instances are close together, I can do different types of replication without any risk of data loss. But then if there was this horrible region level outage, well then we might think about disaster recovery. And so disaster recovery would mean I have the ability to run my workload in another region. So you think of this as DR. And what I would want here is a very big distance. I might think hundreds of miles. Because I want to make sure if there was some natural disaster here, it doesn't impact this region as well.

Now, the form that disaster recovery takes will vary greatly. It depends on how long I have for my service to be able to be up and running again. Maybe I create the new resource and restore a backup. Maybe I'm constantly replicating data and it's kind of ready to start up. Or maybe I really don't have a lot of time at all, and so it's constantly synchronizing, it's running, and I would just switch something over. In modern architectures, I may even be running active in lots of regions, and then there's some endpoint that enables the client to go to whichever one is currently active or is closest to them.

And so that agility I spoke about, having that flexibility and only paying for what I'm consuming when I need it, well, that also gives me something called elasticity for my service's capacity, because over time, the amount of work that my service needs to perform is going to vary depending on what's coming in. And so what I want is this ability to scale my various workloads. So let's now think about this idea that I want to provide scalability. And as part of that, I get this great elasticity. My service can grow and shrink, based on the work it needs to actually do, to service the number of incoming requests.

And many different types of workload have some element of seasonality.
And what I mean by that is if we think about time, and the work I need to do, very often, it's not a flat line. Very often, there's some peak of work to do, maybe a really quiet time, maybe another peak, maybe a more average amount of time. And that could vary, this could be daily, weekly, monthly, yearly, maybe even every few years. Consider a tax application, it's really busy for one month out of the year. A pizza site is busy for a few hours a night, it's even busier on a Friday and a Saturday night. The Olympics, hey, I'm hosting the Olympics, I'm busy every four years. And so there's this variance in the amount of work my service needs to do.

And what I don't want to do is have my service always sized to handle the peak. That's what we have to do on premises. We have to buy all the servers for the busy times and often they're very idle. What I want instead, for my service in the cloud, because I only pay for it while it's running, on a per second basis, is to change the amount of service I have at any moment in time to match the amount of work it actually needs to do. And that could be me manually doing that scaling, or maybe I want to automatically do that. For example, I could say, hey, look, for the instances I'm running, if the CPU is over 70%, we'll add some more instances. If the CPU is less than 30%, let's remove some. So we have this great flexibility.

Now, how can I do this scaling? One option would be to scale vertically. So I could do a vertical scale. And what this means is I have my instance, and if I'm scaling vertically, well, I'm making it bigger. I'm adding more CPUs, I'm adding more memory. In reality, that would mean I have to stop the instance. I would stop the workload, I would resize it and start it again. That's because very few operating systems actually support adding, much less removing, CPUs or memory while running, and only a few applications would handle it, which means I'd have downtime.
Also, I don't want just one instance, remember, that's not good for my high availability. I'd rather have more, smaller instances. And so, while I can do that, we give this a little bit of a frowny face, we try not to do that.

The much better option is to scale horizontally. So I can do horizontal scale, which means obviously, I'm scaling this way. And here, we just have a certain number of instances, and if we're getting busier, we'll add some new ones. If we're getting quiet, we could remove some. And this is the type of scaling we want to do in the cloud. We give that a happy face. And this is how the cloud is designed because now I've got no downtime. I'm just adding and removing instances. Again, there'd be some kind of balancing solution that the client would talk to, to then balance between the ones that do exist at a moment in time. And we would typically never scale to less than two, because remember, for our high availability, I always want at least two.

Now, horizontal scaling does require your application to support more than one instance of it, but again, in modern architectures, that's becoming very, very common. And so when you think of the cloud, we want to pay, on a per second basis, for the amount of work we're actually doing. And horizontal scale lets us do that, and we could automatically say, hey, look, CPU's busier than a certain amount or a queue depth is above a certain number, let's add some. Hey, we're not doing very much work, let's remove some instances. And that could be automatic, it could be scheduled, it could be manual.

So those are the key concepts we wanted to cover in this lesson. We have great agility, we can have many different types of service, we can change their sizes, we can change the service we want to use. We have this concept of high availability by having multiple instances spread over different blast radiuses. And then we have this elasticity, the ability to scale.
Yes, vertically by making them bigger or smaller. But more commonly, we're going to scale horizontally by adding and removing based on the amount of work that varies over time.
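
The horizontal scaling rule described in the lesson (add instances when CPU is over 70%, remove some when it is under 30%, and never drop below two) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any Azure autoscale feature is implemented: `ScaleRule` and `desired_instances` are hypothetical names, and the upper bound of 10 instances and the step of one instance at a time are assumptions that do not come from the lesson.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleRule:
    """Thresholds from the lesson; max_instances and step are assumptions."""
    scale_out_cpu: float = 70.0  # add instances above this average CPU %
    scale_in_cpu: float = 30.0   # remove instances below this average CPU %
    min_instances: int = 2       # never fewer than two, for high availability
    max_instances: int = 10      # hypothetical upper bound to cap cost
    step: int = 1                # hypothetical: change one instance at a time


def desired_instances(current: int, avg_cpu: float,
                      rule: ScaleRule = ScaleRule()) -> int:
    """Return the instance count the rule asks for, clamped to [min, max]."""
    if avg_cpu > rule.scale_out_cpu:
        target = current + rule.step   # busy: scale out
    elif avg_cpu < rule.scale_in_cpu:
        target = current - rule.step   # quiet: scale in
    else:
        target = current               # within the band: leave it alone
    return max(rule.min_instances, min(rule.max_instances, target))


if __name__ == "__main__":
    count = 2
    # Simulated average CPU readings over time: a peak, then a quiet period.
    for cpu in [85.0, 90.0, 50.0, 20.0, 10.0, 10.0]:
        count = desired_instances(count, cpu)
        print(f"avg CPU {cpu:5.1f}% -> {count} instances")
```

In this run the count climbs from 2 to 4 during the peak, holds at 4 while CPU sits between the thresholds, and then steps back down to 2, where the minimum stops it from going lower. A queue depth or a schedule could drive the same function in place of CPU.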
