But what is a neural network? | Deep learning chapter 1

3Blue1Brown

Transcript source

This transcript was extracted from YouTube's auto-generated caption track. The transcript below is server-rendered so it can be read, searched, cited, and shared without opening the original YouTube player.

[0:04]This is a three. It's sloppily written and rendered at an extremely low resolution of 28 by 28 pixels, but your brain has no trouble recognizing it as a three. And I want you to take a moment to appreciate how crazy it is that brains can do this so effortlessly. I mean, this, this and this are also recognizable as threes, even though the specific values of each pixel are very different from one image to the next. The particular light-sensitive cells in your eye that are firing when you see this three are very different from the ones firing when you see this three. But something in that crazy smart visual cortex of yours resolves these as representing the same idea, while at the same time recognizing other images as their own distinct ideas.

But if I told you, hey, sit down and write for me a program that takes in a grid of 28 by 28 pixels like this, and outputs a single number between 0 and 10 telling you what it thinks the digit is, well, the task goes from comically trivial to dauntingly difficult.

Unless you've been living under a rock, I think I hardly need to motivate the relevance and importance of machine learning and neural networks to the present and to the future. But what I want to do here is show you what a neural network actually is, assuming no background, and to help visualize what it's doing, not as a buzzword, but as a piece of math. My hope is just that you come away feeling like the structure itself is motivated, and that you feel like you know what it means when you read or hear about a neural network, quote unquote, "learning". This video is just going to be devoted to the structure component of that, and the following one is going to tackle learning.

What we're going to do is put together a neural network that can learn to recognize handwritten digits. This is a somewhat classic example for introducing the topic.
And I'm happy to stick with the status quo here, because at the end of the two videos, I want to point you to a couple good resources where you can learn more and where you can download the code that does this and play with it on your own computer.

There are many, many variants of neural networks, and in recent years there's been sort of a boom in research towards these variants. But in these two introductory videos, you and I are just going to look at the simplest plain vanilla form with no added frills. This is kind of a necessary prerequisite for understanding any of the more powerful modern variants, and trust me, it still has plenty of complexity for us to wrap our minds around. But even in this simplest form, it can learn to recognize handwritten digits, which is a pretty cool thing for a computer to be able to do. And at the same time, you'll see how it does fall short of a couple hopes that we might have for it.

As the name suggests, neural networks are inspired by the brain. But let's break that down. What are the neurons, and in what sense are they linked together? Right now, when I say neuron, all I want you to think about is a thing that holds a number. Specifically, a number between zero and one. It's really not more than that.

For example, the network starts with a bunch of neurons corresponding to each of the 28 times 28 pixels of the input image, which is 784 neurons in total. Each one of these holds a number that represents the grayscale value of the corresponding pixel, ranging from zero for black pixels up to one for white pixels. This number inside the neuron is called its activation. And the image you might have in mind here is that each neuron is lit up when its activation is a high number. So, all of these 784 neurons make up the first layer of our network. Now, jumping over to the last layer, this has 10 neurons, each representing one of the digits.
The activation in these neurons, again, some number that's between 0 and 1, represents how much the system thinks that a given image corresponds with a given digit. There's also a couple layers in between called the hidden layers, which for the time being should just be a giant question mark for how on Earth this process of recognizing digits is going to be handled. In this network, I chose two hidden layers, each one with 16 neurons, and admittedly that's kind of an arbitrary choice. To be honest, I chose two layers based on how I want to motivate the structure in just a moment, and 16, well, that was just a nice number to fit on the screen. In practice, there is a lot of room for experiment with the specific structure here.
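
As a rough illustration of the structure described so far, here is a minimal Python sketch, not the code referred to in the video. It records the four layer sizes (784, 16, 16, 10) and turns a 28 by 28 image into the 784 input activations. The names `LAYER_SIZES` and `image_to_activations` are hypothetical, and the 0 to 255 pixel range is an assumption about how the image is stored; the transcript only says activations run from zero (black) to one (white).

```python
# Layer sizes from the transcript: 784 input neurons, two hidden
# layers of 16 neurons each, and 10 output neurons (one per digit).
LAYER_SIZES = [28 * 28, 16, 16, 10]


def image_to_activations(image):
    """Flatten a 28x28 grid of grayscale pixels into 784 activations.

    Assumption: pixels are 8-bit values from 0 (black) to 255 (white),
    so dividing by 255 maps them into the range 0.0 to 1.0.
    """
    if len(image) != 28 or any(len(row) != 28 for row in image):
        raise ValueError("expected a 28x28 grid of pixels")
    return [pixel / 255 for row in image for pixel in row]


# Hypothetical example image: all black except one white pixel.
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255

activations = image_to_activations(image)
print(len(activations))  # 784, one activation per input neuron
print(max(activations))  # 1.0, the single white pixel
```

The hidden and output layers are only sizes at this point; how their activations are computed from the previous layer is the part the transcript picks up next.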

[0:52]The way the network operates, activations in one layer determine the activations of the next layer. And of course, the heart of the network as an information processing mechanism comes down to exactly how those activations from one layer bring about activations in the next layer. It's meant to be loosely analogous to how in biological networks of neurons, some groups of neurons firing cause certain others to fire.

I'm here with Lisha Lee, who did her PhD work on the theoretical side of deep learning, and who currently works at a venture capital firm called Amplify Partners, who kindly provided some of the funding for this video. So, Lisha, one thing I think we should quickly bring up is the sigmoid function. As I understand it, early networks use this to squish the relevant weighted sum into that interval between zero and one, you know, kind of motivated by this biological analogy of neurons either being inactive or active?

Exactly.

But relatively few modern networks actually use sigmoid anymore, it's kind of old school, right?

Yeah, or rather ReLU seems to be much easier to train.

And ReLU stands for rectified linear unit?

Yes, it's this kind of function where you're just taking a max of zero and a, where a is given by what you were explaining in the video, and what this was sort of motivated from, I think, was partially a biological analogy with how neurons would either be activated or not. And so if it passes a certain threshold, it would be the identity function, but if it did not, then it would just not be activated, so it would be zero. So it's kind of a simplification. Using sigmoids didn't help training, or it was very difficult to train at some point, and people just tried ReLU and it happened to work very well for these incredibly deep neural networks.
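
The two functions mentioned in this exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, `a` stands for the weighted sum Lisha refers to, and the sigmoid uses the standard logistic formula 1 / (1 + e^(-a)), which the transcript describes but does not write out.

```python
import math


def sigmoid(a):
    """Squish any real number into the interval between 0 and 1.

    Uses the standard logistic function, split into two branches so
    that math.exp never overflows for large negative inputs.
    """
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def relu(a):
    """Rectified linear unit: max of zero and a."""
    return max(0.0, a)


print(sigmoid(0.0))  # 0.5, the midpoint of the interval
print(relu(-2.0))    # 0.0, below the threshold so not activated
print(relu(3.0))     # 3.0, above the threshold so the identity
```

Note that sigmoid keeps every output between zero and one, while ReLU passes positive inputs through unchanged and so has no upper bound.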
