Senior DevOps Engineer Mock Interview: Prepare Like a Pro!

Tech with Ajit

17m 26s | 2,873 words | ~15 min read
Auto-Generated

[0:00]Hi Pratik, good afternoon. Thanks for joining.

Good afternoon. Thank you for giving me this opportunity.

So maybe we can get started with a quick round of introductions, and then we can dive deeper into the questions.

Sure, I'll start with the intro. I started my journey as a cloud engineer about five and a half years back. I began in development, did that for a couple of months, and then slowly moved to the deployment side. I started to like creating infrastructure, automating it, building CI/CD pipelines, and automating things in Python and shell wherever possible, because making things easier was one of the things I loved about the work. Then I moved to a different organization where I worked as a solutions engineer, understanding customer problems, suggesting a good architecture, and showing customers how they could use the product in a better way. After that, I moved to another organization where I worked as a senior DevOps engineer. My entire responsibility was to automate things and make processes easier: how we could roll out updates frequently without any interruption while ensuring reliability and high availability. In my current organization I'm working as a consultant, and I've worked on different engagements across teams. It's mostly about DevOps best practices and automating things.

Thank you, Pratik, for your brief introduction. To start with, consider that you want to develop an e-commerce application that will be accessible throughout the world, okay?
So it is an international application. The requirement is that it should be highly scalable, robust, and highly available, okay? You are free to use all the AWS services you like. The requirement is simple: it should be highly scalable, it should be reliable, and it should be highly available.

Okay, sure. I'll start with the networking part. We will be using a VPC with subnets. I think there should be at least six subnets: two for the web layer, two for the app layer, and two for the database layer, if the database resides in a VPC, like RDS or DocumentDB. In those private subnets, we will have our application servers in the app subnets, and the database subnets will hold our databases: RDS, which could be MySQL, Aurora, or PostgreSQL, or a DocumentDB cluster. We will also have a NAT Gateway as well as an Internet Gateway for our VPC, and a Transit Gateway if we want to access data from our own data center, which resides in the organization or somewhere else locally. Then we will use load balancers to distribute the traffic evenly across our instances. We will also implement auto scaling so that we can cover the load and increase the number of servers whenever high traffic comes in. We will use AWS RDS for structured (relational) databases; it supports Oracle, PostgreSQL, MySQL, MSSQL, et cetera. We will also use DynamoDB if we want a NoSQL database. We can also use DocumentDB.
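
The six-subnet layout described above (web, app, and database tiers, each spread over two availability zones) can be sketched with Python's standard `ipaddress` module. This is only an illustration: the CIDR range, the availability zone names, and the `plan_subnets` helper are assumptions, since the interview names none of them.

```python
import ipaddress

# Hypothetical values: the interview gives neither a CIDR range nor AZ names.
VPC_CIDR = "10.0.0.0/16"
AZS = ["us-east-1a", "us-east-1b"]
TIERS = ["web", "app", "db"]


def plan_subnets(vpc_cidr, tiers, azs, new_prefix=24):
    """Assign one subnet per (tier, AZ) pair from consecutive blocks of the VPC range."""
    network = ipaddress.ip_network(vpc_cidr)
    blocks = network.subnets(new_prefix=new_prefix)  # generator of non-overlapping blocks
    plan = []
    for tier in tiers:
        for az in azs:
            plan.append({"tier": tier, "az": az, "cidr": str(next(blocks))})
    return plan


if __name__ == "__main__":
    for subnet in plan_subnets(VPC_CIDR, TIERS, AZS):
        print(f"{subnet['tier']:>3}  {subnet['az']}  {subnet['cidr']}")
```

Placing each tier in two availability zones is what lets a load balancer and a Multi-AZ database keep serving if one zone fails.
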
So networking is covered and the database is covered. For storage, we will use S3. We can use S3 for storing any object data, and we can also use it for hosting our static content, with CloudFront on top of it as a CDN to deliver that content faster and with low latency. For DNS management, there is Route 53. So network is done, storage is done, database is done, and then compute. For compute, we can use EKS clusters to host the backend, or ECS as well. It depends on how we are architecting our application; if it is a monolith, we can also go with EC2 instances. And AWS Lambda is one of the things we can go with if we are going serverless. I think this is a very high-level architecture that we can create for an e-commerce application.

Okay, perfect. You mentioned monolith and microservices in your answer, right? You mentioned EKS, ECS, and an EC2-instance-based monolith. Under what circumstances or requirements would you choose either monolith or microservices, and which is more beneficial than the other?

It depends. If we are just starting with our product, we may not have the expertise, or the budget, I'd say, for enough people to learn Kubernetes and manage those clusters. With EKS you have to manage the clusters, you have to manage the EKS versions, you have to learn how to implement all those things, and it could be a bit difficult. But if you have a startup and you just want to get started without all those hurdles, then you can start with a monolithic application.
Also, if your architecture does not need to be that complex, then it's better to go with a monolith. But if you want flexibility and a lot of configuration management, then it's better to go with a container-based system like EKS or ECS.

Okay, so what is more complicated according to you: Kubernetes-based systems or a monolith? If you have a large team, let's say 10 or 12 developers or DevOps engineers working on your product, which one will you choose? Will you choose monolith or microservices, or maybe EKS?

Again, it will mostly depend on the architecture of the application. Let's say it has 70 to 80 services, so I need to manage routes for all of those, and also the backends of all those applications. It's easier for me as a DevOps engineer to manage all of this via a Kubernetes-based system, because I don't want to manage all those routes in a monolithic system where I have to create the routes myself. Say I use Nginx: I need to create all those routes in Nginx, and then traffic will pass through an ALB and eventually to an EC2 instance or something. But with EKS, I can have a Service created for this. We can easily automate it. It is manageable in a monolith as well, but a Kubernetes-based architecture makes sure that our application runs reliably, because whenever it goes down, EKS will automatically keep it running. It will alert us. A monolith setup can do that too, but EKS makes it easier for us because it is managed.

Okay. Okay.
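
The point about Kubernetes automatically keeping the application running can be pictured as a desired-state reconciliation loop. The following is a toy model in plain Python, not Kubernetes or EKS code; the `reconcile` function, the replica names, and the data shapes are all made up for illustration.

```python
def reconcile(desired_replicas, running):
    """Return the actions needed to bring the system back to the desired replica count.

    `running` maps a replica name to True (healthy) or False (crashed).
    """
    actions = []
    healthy = [name for name, ok in running.items() if ok]

    # Crashed replicas are removed first.
    for name, ok in running.items():
        if not ok:
            actions.append(("remove", name))

    # Start replacements if there are fewer healthy replicas than desired.
    for i in range(desired_replicas - len(healthy)):
        actions.append(("start", f"replacement-{i}"))

    # Remove surplus healthy replicas if there are more than desired.
    for name in healthy[desired_replicas:]:
        actions.append(("remove", name))

    return actions


if __name__ == "__main__":
    state = {"web-1": True, "web-2": False, "web-3": True}
    for action in reconcile(3, state):
        print(action)
```

In a real cluster this comparison of declared and observed state is done continuously by the platform, which is the work that would otherwise need custom scripts and alerts on plain EC2 instances.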

[7:33]And in the case of monitoring, or maybe observability, how would you monitor this entire ecosystem, or the entire application? What kind of services will you use? Will you use all AWS services, or maybe some external help as well?

When it comes to monitoring, we have very good tools available outside of AWS. I'll go with Datadog and Grafana for visibility, because Datadog also provides all the infra-level logging: what requests we are getting, what the latency is, whether there is any error or warning, so all the traces related to the application infra. AWS X-Ray is one of the services within AWS that I'll go with for application-level insights, if there are any application bottlenecks within an API or something. If we want a very simple architecture, then it is AWS CloudWatch, because it also has a lot of log insights within it; every service has a log stream created if we configure it correctly.

Okay. So if CloudWatch and X-Ray have all the required features, then why exactly would you require Datadog in the first place?

It totally depends on our requirements. CloudWatch does provide a lot of features, but when it comes to configuration, let's say I want this many applications in a group, this many projects in a group, or I want metrics or parameters outside of what CloudWatch supports, then I can go for those custom metrics with an open source tool rather than the AWS-managed one, because there are certain limitations when it comes to configuring CloudWatch.

[9:17]We can't go beyond those limitations, so for that purpose we can use some open source tools as well.

Okay. So, in case of a disaster: as I mentioned in the beginning, it is a multi-region, or rather international, application. Just in case any region goes down, or any availability zone goes down in your system, how would you react, how would you build systems, or how would you automate things so that your application is not impacted at all?

For disaster recovery, we have to make sure that our core database, wherever the data resides, is replicated not just across AZs but across regions as well. Although we will have a Multi-AZ configuration for our production deployment, we need to make sure that we have a copy of it across regions too. So we need to configure a cross-region replica of the RDS snapshots, which should be synchronous, of course, so we don't lose any data. For storage, whatever we have in our S3 bucket, we need to set up cross-region replication for that bucket as well. If required, we can also use a global load balancer, which will make sure that if a region goes down and is not serving, we can use the data that is available in the next available region. And for DynamoDB, global tables are something we can make use of for those kinds of disasters. So these are some of the high-level things that I'll implement when designing an architecture.

Okay. So we spoke about observability, and we spoke about fault tolerance and DR.
So, if we want to secure this particular application, as well as the cloud infrastructure, what best practices or measures will you take so that your application becomes secure?

When it comes to the security of the application, the first threat an e-commerce application faces is a DDoS attack. So we need to implement AWS Shield, which is a managed AWS service. Whenever there is traffic that is not expected and is not legitimate traffic, I'll say, it blocks it to a certain level so that it does not impact the running application. We can also implement AWS WAF, the web application firewall, which makes sure that application-related threats are managed properly; whenever there is a compromise to security, we'll make sure that we are securing the application using rules in that WAF. And when it comes to accessing services within AWS, we can use IAM roles for least-privilege access. So if EKS wants to access DynamoDB or RDS, it uses only the data it needs and no more than that.

Okay, great. So, moving on to the cost optimization section. The customer is quite cost sensitive, okay, and we are using a lot of AWS services, right? AWS is not cheap, so it is going to incur a lot of cost. With respect to your observability, security, and DR, what cost optimization measures will you take so that cost doesn't affect the customer's decision on whether to choose a particular service or not?
So basically, any five key points that you will implement in your systems so that we are also making sure there are no humongous costs.

Okay. Cost optimization as in the entire application, or only within the storage part?

It can be generic to AWS: it can be with EC2 instances, with S3 buckets, or with RDS instances. Anything that comes to your mind is okay for me.

Sure. I'll start with the compute part. When it comes to instances, we need to understand what our application actually requires: whether it will be memory intensive, compute intensive, or GPU intensive. Based on that, we need to select the right instance type. We also need to make sure that we have reserved instance plans as well, because if we are sure that we will be using this many instances for this many months or years, then it is better to go with reserved instances, which give us a discount of up to around 70 to 80% and save us a lot of money. We can also use spot instances, which are cheaper, even though we are not sure when those instances can be taken back. It is better to go with spot instances for any non-prod use cases and for stateless applications where we don't really need the instances to run continuously.
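
The trade-off between on-demand, reserved, and spot pricing comes down to simple arithmetic. The sketch below uses made-up hourly rates purely for illustration; they are not real AWS prices, and the helper names are hypothetical. Actual rates vary by instance type, region, term, and payment option.

```python
HOURS_PER_MONTH = 730  # common approximation for an average month


def monthly_cost(hourly_rate, instance_count, hours=HOURS_PER_MONTH):
    """Cost of running `instance_count` instances for `hours` at `hourly_rate`."""
    return hourly_rate * instance_count * hours


def savings_percent(on_demand_rate, discounted_rate):
    """Percentage saved by paying `discounted_rate` instead of `on_demand_rate`."""
    return (on_demand_rate - discounted_rate) / on_demand_rate * 100


if __name__ == "__main__":
    # Hypothetical hourly rates in USD, NOT real AWS prices.
    rates = {"on-demand": 0.10, "reserved": 0.06, "spot": 0.03}
    for plan, rate in rates.items():
        cost = monthly_cost(rate, instance_count=10)
        saved = savings_percent(rates["on-demand"], rate)
        print(f"{plan:>9}: ${cost:,.2f}/month ({saved:.0f}% below on-demand)")
```

A reserved commitment only pays off if the instances really do run for most of the term, which is why it suits steady baseline load while spot capacity suits interruptible work.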

[15:48]And when it comes to data transfer cost within AWS, we need to make sure we are using VPC endpoints for private connectivity to avoid data transfer charges, because a NAT Gateway costs a lot. So we need to make sure that when we are transferring data within availability zones or within regions, these costs are taken care of. For RDS, we can use RDS reserved instances, same as for EC2. And for CI/CD, when we use CodeBuild or CodePipeline, we can also optimize those builds, or tweak those configurations, to build faster, by having multi-stage builds and using an Alpine image or something. So these are some of the things that I'll look into.

Okay, and what would be your deployment tool? You forgot to mention the deployment tool. If it is an EKS-based application, how would you deploy your Docker image onto the EKS cluster?

[17:17]If it is an EKS image, I'll go with Argo CD, which is a perfect fit for EKS clusters.
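
As a rough sketch of what deploying with Argo CD involves, the snippet below builds an Argo CD `Application` manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. It assumes Argo CD is installed in the `argocd` namespace of the same cluster; the application name, repository URL, path, and target namespace are hypothetical placeholders, not details from the interview.

```python
import json


def argocd_application(name, repo_url, path, namespace, revision="HEAD"):
    """Build an Argo CD Application manifest that syncs `path` in a Git repo to a namespace."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},  # assumes the default install namespace
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "targetRevision": revision, "path": path},
            "destination": {
                "server": "https://kubernetes.default.svc",  # the cluster Argo CD itself runs in
                "namespace": namespace,
            },
            # Automated sync: apply changes from Git, prune removed resources, undo manual drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


if __name__ == "__main__":
    manifest = argocd_application(
        name="shop-backend",                                     # hypothetical
        repo_url="https://example.com/org/shop-manifests.git",  # hypothetical
        path="k8s/backend",                                      # hypothetical
        namespace="shop",                                        # hypothetical
    )
    print(json.dumps(manifest, indent=2))
```

With this model the pipeline only builds and pushes the Docker image and updates the image tag in the Git repository; Argo CD then pulls that change into the cluster.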
