TCP: How It Actually Works (System Design for Beginners – Episode 4)

Akhil Sharma

10m 0s | 1,380 words | ~7 min read
Auto-Generated

[0:00]So let's talk about TCP. You'll remember that in the second video we talked about IP, the Internet Protocol. It's the set of rules that defines how two devices can communicate with each other over the internet. TCP, the Transmission Control Protocol, is another such protocol, and it runs on top of IP. It's actually the foundation layer for many protocols like HTTP, FTP, and SMTP. Now, in some of these, most notably HTTP with HTTP/3, we are replacing TCP with QUIC, which is built on something called UDP. We'll be learning about UDP and QUIC later in the series, but for now let's just focus on TCP. So TCP is a fundamental protocol, and it does everything from deciding how the data is divided into packets to how those packets are sent, received, and then reassembled. So this is what TCP is supposed to do, right? And it also establishes the rule set based on which two devices establish a connection between them.
So how does this happen? Let's walk through it. The client, the machine that wants to request information from the server, sends something called the SYN flag, which is the synchronize flag. The server then sends back SYN plus ACK. Right? They exchange SYN flags so that they can sync with each other on something called the ISN, or initial sequence number. The acknowledgment is the server saying, yes, I'm alive, I've received your packet with the SYN flag, and I'm ready to start sharing information with you. But the server does not yet know that the client has actually received this and is also ready to start accepting information. This is why the client sends an ACK flag back. So essentially there are two flags in play, SYN and ACK. Both the client and the server need to send a SYN, and both need to send an ACK. You can also notice that the SYN comes before the ACK.
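The exchange described above can be sketched as a small simulation. This is only an illustration of how the sequence and acknowledgment numbers line up, not real networking code: the `Segment` class, the `three_way_handshake` function, and the ISN values 1000 and 5000 are all made up for the example. Real stacks pick hard-to-predict ISNs.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around


@dataclass(frozen=True)
class Segment:
    """Hypothetical, stripped-down view of a TCP segment header."""
    flags: tuple                 # e.g. ("SYN",) or ("SYN", "ACK")
    seq: int                     # sender's sequence number
    ack: Optional[int] = None    # next sequence number the sender expects


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged, in order."""
    # 1. Client -> server: SYN carrying the client's ISN.
    syn = Segment(("SYN",), seq=client_isn)
    # 2. Server -> client: its own SYN plus an ACK of the client's SYN.
    #    A SYN consumes one sequence number, hence the +1.
    syn_ack = Segment(("SYN", "ACK"), seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    # 3. Client -> server: ACK of the server's SYN.
    ack = Segment(("ACK",), seq=syn_ack.ack,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each side acknowledges the other's ISN plus one, which is how both ends confirm they agree on where the byte stream starts.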
So, SYN is like an initial ping, saying, hey, are you alive? Are you there? Right, it's just checking. The server could reply with just an ACK and send its own SYN separately, but to make things more efficient it sends the SYN and the ACK together, so that both sides are able to sync and the client also gets an acknowledgment saying, okay, I'm alive, I'm here, and I can send you information. Now the client needs to send its ACK back, and after this the flow of information between them can start. So it has to go through SYN, SYN-ACK, ACK. In cybersecurity and networking circles it's very popularly called SYN, SYN-ACK, ACK, which is basically the TCP three-way handshake. It's very easy to remember how it works: SYN, SYN-ACK, ACK. So if you need to remember it, this is how you remember it.
I want to take a quick minute to share something very important with you. As you move up in your engineering career, DSA and coding become less and less important, and system design becomes more and more important. To the extent that if you're going for a director, VP, or CTO position, the only thing they're checking for is your system design skills, because they assume that since you've spent so much time in the industry, you're very good with coding and DSA. Now, there are many resources on the internet for basic system design, but there isn't much for advanced system design, or for using system design skills to actually work with complex systems. This is why I've built a three-month advanced system design cohort. If you're someone who has some experience in the industry and you resonate with this problem statement, you might want to fill out the form that's in the comments and in the description of this video. 20 people have already joined this cohort. We'll check to see if you're a good fit once you fill out the form, and we'll set up a time with you.
Now let's get back to the video. Now the information flow between these two devices has started, right?
And every time the client receives some information, some packets, from the server, the client sends back an ACK flag again. So every time it receives a few packets, it sends back an ACK. This is what ensures reliability, which is what TCP is famous for, because at any given point in time the server knows whether or not the client has received the packets. But it's not practical to send back an acknowledgment after every single packet. That would be highly inefficient. So what happens is that when the client realizes it has received the packets numbered, let's say, from 1,000 all the way to 2,000, it sends back something called a combined, or cumulative, ACK to tell the server, yes, I've received all of these packets, and now you're supposed to send me the next ones.
Now, I mentioned that TCP is known for reliability, and the order of the packets is also quite important. When a huge file is sent in small packets from the server to the client, it's important that the packets are received in the right order. This can create a problem if one of the packets is lost. Let's say the first four packets are received, and the fifth packet is also received, but the sixth packet is lost. Now the problem is that we keep waiting for the sixth packet to be retransmitted. Packets seven and eight can't be delivered to the application until the sixth has arrived. The fifth and sixth packets need to be available, and only then are the seventh and eighth accepted, because the order is important. This can lead to inefficiency, because packets seven and eight are actually ready and are just sitting in a buffer.
TCP is also not very good at scale. Each device talking to the server is another TCP connection. So, it's an individual TCP connection.
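The combined ACK and the waiting-for-packet-six problem described above can be sketched with a toy receiver. This is a simplified illustration, assuming whole-packet numbering starting at 1 (real TCP numbers bytes, not packets); the `InOrderReceiver` class and its method names are hypothetical.

```python
class InOrderReceiver:
    """Toy receiver: buffers out-of-order packets, delivers strictly in order."""

    def __init__(self, first_seq: int = 1):
        self.next_expected = first_seq  # lowest packet number not yet delivered
        self.buffer = {}                # out-of-order packets waiting for a gap to fill
        self.delivered = []             # what the application has actually seen

    def receive(self, seq: int, payload: str) -> int:
        """Accept one packet and return the cumulative ACK (next packet wanted)."""
        if seq >= self.next_expected:   # ignore duplicates of delivered packets
            self.buffer[seq] = payload
        # Deliver every packet that is now contiguous with what came before.
        while self.next_expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.next_expected))
            self.next_expected += 1
        return self.next_expected


receiver = InOrderReceiver()
for n in [1, 2, 3, 4, 5, 7, 8]:         # packet 6 is lost in transit
    ack = receiver.receive(n, f"packet-{n}")
print("ACK while 6 is missing:", ack, "| still buffered:", sorted(receiver.buffer))

ack = receiver.receive(6, "packet-6")   # the retransmission arrives
print("ACK after retransmission:", ack, "| delivered:", len(receiver.delivered))
```

One ACK number covers everything before it, which is why a single acknowledgment can stand in for many packets. The same rule is what holds packets seven and eight in the buffer until six shows up.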

[7:05]Now, this is not a problem at small scale, but at large scale it can be an issue, because each connection requires kernel memory on the server. Chat messaging over TCP can also become challenging, because with chats the size of the packets is very small. And every time we send some packets, the other side also needs to send back an acknowledgment. So because the packets are small, we're sending a lot more packets and getting a lot more acknowledgments, and all of these acknowledgment packets add extra load on the network.
Now, you remember that every time a client wants to connect to the server, it sends a SYN message to start the handshake. This can be abused by hackers. They can send thousands and thousands of SYN messages to make the server feel like a lot of clients are trying to connect to it. The server holds state for each of these half-open connections while it waits for a final ACK that never comes. This is called a SYN flood, and it can actually lead to a DDoS, or distributed denial-of-service, attack.
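Why a pile of unanswered SYN messages hurts can be sketched with a toy model of the server's queue of half-open connections. This is an illustration under simplifying assumptions, not how any particular kernel implements it: the `ListeningServer` class, the backlog size of 3, and the client names are all hypothetical, and real systems add timeouts and defences such as SYN cookies.

```python
class ListeningServer:
    """Toy model of a server with a bounded queue of half-open connections."""

    def __init__(self, backlog: int):
        self.backlog = backlog      # max connections waiting for the final ACK
        self.half_open = set()      # SYN received, SYN-ACK sent, no ACK yet
        self.established = set()    # handshake completed

    def on_syn(self, client: str) -> bool:
        """Handle a SYN; return False if it had to be dropped."""
        if len(self.half_open) >= self.backlog:
            return False            # queue is full, new SYNs are dropped
        self.half_open.add(client)
        return True

    def on_ack(self, client: str) -> bool:
        """Handle the final ACK of the handshake."""
        if client in self.half_open:
            self.half_open.remove(client)
            self.established.add(client)
            return True
        return False


server = ListeningServer(backlog=3)

# The attacker sends SYNs from made-up addresses and never sends the final ACK.
for i in range(3):
    server.on_syn(f"spoofed-{i}")

# A legitimate client now tries to connect and is turned away.
print("legitimate SYN accepted:", server.on_syn("real-client"))
```

The attacker never completes a handshake, so the slots are never freed in this model, and genuine clients cannot get in.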

[8:42]Now, you remember that the client and the server send SYN messages to each other to sync up their initial sequence numbers. But if a hacker is able to guess these numbers, the hacker can inject malicious packets into the connection. The other problem is that TCP does not have any authentication or encryption built in. It's not secure by default, and this means it's easy to perform man-in-the-middle attacks. So DDoS attacks are simple, and man-in-the-middle attacks are simple. There's also failure ambiguity with TCP: often the server won't know why the connection between the client and the server failed, and this makes it difficult to debug.
So, I hope you learned a lot about TCP. See you in the next one.
