

[0:00]Hello, everybody. My name is Matt Holst. I'm a product manager on Google's Identity team. We're responsible for two-factor authentication, account recovery, things like that, pretty much anything that has to do with making sure that the right user can sign into their account. And today, I wanted to talk a little bit about a common pitfall that we see developers run into, which is around how to identify a user. And specifically, I wanted to talk about some of the ways that you might identify a user that are good and some of the ways that you might try to identify a user that are bad. And if you want to follow along, I've posted the slides at that URL. But don't worry about writing it down now, I'll put it up again at the end. So, just to give you a little bit of a roadmap for the next 20 minutes or so, I wanted to start by taking a brief look at how most of the web identifies users today. Then I want to talk a little bit about what it means to be a user of your service. And after that, I'm going to spend the bulk of the talk on some common anti-patterns that we see developers fall into when they're trying to figure out if it's the right person signing into their service. And then finally, I'll close with some best practices that you can use to improve the security and the usability of your own services. So, what are the common ways that users are identified on the web today? Well, by far, the most common is username and password. I'm sure we're all familiar with that, some variation of email address and password, things like that. It's been around for a while. It's generally well-understood. It's relatively easy to implement. But as many of you probably know, it does have a number of downsides. One of the biggest is that users often reuse their passwords across multiple sites. 
So if an attacker is able to compromise the password database from a site that you don't even control, they can then turn around and try to use those same username and password combinations to sign into your site. And if your users are using the same password on your site, as they are on a compromised site, then your users are at risk. Another common way to identify users is using something called a federated login. This is often something like sign in with Google or sign in with Facebook, sign in with Twitter, things like that. And this is good because it gives users the convenience of not having to create a separate account and password just for your site. It also helps a little bit with the password reuse problem because it offloads the password security to Google or to Facebook, to Twitter, to these big companies that have generally pretty good security around their authentication systems. But one of the downsides here is that you're creating a little bit of a dependency on those third-party services. If Google goes down, users may not be able to sign into your site. And so that's something to consider when you're looking at federated logins. And then finally, the last thing I wanted to talk about in terms of how most of the web identifies users is cookies. This is also something I'm sure we're all familiar with. When you sign in using a username and password or a federated login, what often happens is that the service will set a cookie in your browser. And that allows you to then access the site without having to sign in again for every page load. And it does this by setting a random string of characters, a random identifier in your browser that the site can use to recognize you in the future. So, that's what's out there today. Now, I want to talk a little bit about what it means to be a user of your service. Because it's important to keep in mind that users often have different relationships with your service over time. 
So, when a user first comes to your site, they're probably an anonymous user. They may not have an account, they may not have signed in. But you still want to provide some kind of consistent experience for them. You want to be able to tell if it's the same person coming back to your site for the second or third time. Then they may decide to create an account, maybe using a username and password, maybe using a federated login. And at that point, they become what we call an unverified user. This is a user who has an account, but they haven't yet proven to you that they own the account. They haven't yet proven to you that they own the email address that they used to sign up or the phone number that they used to sign up. And this is an important distinction to make, because what we see sometimes is that developers will equate having an account with being verified. And that can be a really dangerous thing to do. And then finally, there are verified users. These are users who have proven to you that they own the credentials that they've signed up with, that they own the email address, that they own the phone number. And this is the highest level of confidence that you can have in a user. And you should treat these users accordingly. So, why am I talking about this? Well, it's because there are a number of anti-patterns that we see developers fall into. And these are things that you should avoid if you want to keep your users secure and your service usable. The first one I wanted to talk about is identifying users by something that's publicly available. So, this could be something like a user's IP address. This could be something like a user's email address if it's publicly available on their website. This could be something like their public PGP key if they've published that. And the problem here is that if you're identifying a user by something that's publicly available, then anyone can impersonate that user. 
So, if I know your IP address, I can try to make it look like I'm coming from your IP address and then sign into your account. This is something that's often used in botnets and things like that. So, you want to avoid identifying users by anything that's publicly available. The next anti-pattern I wanted to talk about is identifying users by something that's mutable. And what I mean by mutable is something that can change over time. So, this could be something like a user's display name. This could be something like their email address, if they can change it in their profile. This could be something like their phone number, if they can change it in their profile. And the problem here is that if you're identifying a user by something that's mutable, then when that thing changes, you lose the ability to identify that user. So, if a user changes their email address and you're identifying them by their email address, then you've lost the ability to link their old activity to their new activity.

[7:49:40]And that can be a really bad user experience. It can also be a security problem if you're using that mutable identifier to grant access to things. So, you want to avoid identifying users by anything that's mutable. The next anti-pattern I wanted to talk about is identifying users by something that's guessable. And what I mean by guessable is something that an attacker could reasonably guess. So, this could be something like a user's first name, last name, and birth date. This could be something like their social security number. This could be something like their employee ID. And the problem here is that if you're identifying a user by something that's guessable, then an attacker can simply guess that identifier and then try to sign into that user's account. This is often used in phishing attacks and things like that. So, you want to avoid identifying users by anything that's guessable. The next anti-pattern I wanted to talk about is identifying users by something that's shared. And what I mean by shared is something that multiple users could have. So, this could be something like a user's IP address if they're behind a NAT. This could be something like their browser user agent string if they're using a common browser. This could be something like their operating system if they're using a common operating system. And the problem here is that if you're identifying a user by something that's shared, then you can't distinguish between different users. So, if I'm trying to identify a user by their IP address and they're behind a NAT, then everyone behind that NAT is going to look like the same user to me. And that can be a really bad user experience if you're trying to personalize content or things like that. It can also be a security problem if you're using that shared identifier to grant access to things. So, you want to avoid identifying users by anything that's shared. 
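To make the mutable-identifier pitfall described above concrete, here is a minimal Python sketch. The `UserStore` class and its method names are hypothetical and do not come from the talk. It assumes accounts and activity are keyed by an ID the server generates at sign-up, with the email address stored as an attribute that can change.

```python
import uuid


class UserStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self):
        self.email_by_id = {}  # immutable user ID -> current email
        self.id_by_email = {}  # lookup index, updated when the email changes
        self.activity = {}     # immutable user ID -> list of events

    def sign_up(self, email):
        user_id = str(uuid.uuid4())  # generated by the server, never changes
        self.email_by_id[user_id] = email
        self.id_by_email[email] = user_id
        self.activity[user_id] = []
        return user_id

    def change_email(self, user_id, new_email):
        # Only the attribute and the lookup index change; the key does not.
        old_email = self.email_by_id[user_id]
        del self.id_by_email[old_email]
        self.email_by_id[user_id] = new_email
        self.id_by_email[new_email] = user_id

    def record(self, user_id, event):
        self.activity[user_id].append(event)


store = UserStore()
uid = store.sign_up("old@example.com")
store.record(uid, "uploaded photo")
store.change_email(uid, "new@example.com")
store.record(uid, "posted comment")
print(store.activity[uid])  # both events stay linked to the same user
```

Had the activity been keyed by email address instead, the events recorded before and after the change would sit under two different keys.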
And the final anti-pattern I wanted to talk about is identifying users by something that's client-controlled. And what I mean by client-controlled is something that the user can easily manipulate. So, this could be something like a cookie that's not signed. This could be something like a URL parameter that's not validated. This could be something like a hidden form field that's not validated. And the problem here is that if you're identifying a user by something that's client-controlled, then an attacker can simply change that value and then impersonate that user. This is often used in cross-site scripting attacks and things like that. So, you want to avoid identifying users by anything that's client-controlled. So, we've talked about a lot of bad ways to identify users. Now I want to talk about some good ways to identify users. And these are things that you should be doing if you want to keep your users secure and your service usable. The first one is identifying users by a strong, immutable, and unique identifier that's generated by your server. So, this could be something like a UUID or a GUID that you generate when the user first signs up. This could be something like a random string of characters that you generate when the user first signs up. And the key here is that it's generated by your server, it's immutable, and it's unique. It's not something that the user can change. It's not something that an attacker can guess. It's not something that multiple users can have. And this is the strongest way to identify a user. The next best practice I wanted to talk about is using strong, cryptographically secure cookies. So, when you set a cookie in the user's browser, you want to make sure that it's a cryptographically secure cookie. And what I mean by that is that it's signed with a secret key that only your server knows. And this prevents an attacker from being able to forge a cookie and then impersonate a user. 
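As a rough illustration of a signed cookie carrying a server-generated identifier, the following Python sketch uses only the standard library. The function names `sign_cookie` and `verify_cookie`, the `user_id.signature` layout, and the in-process `SECRET_KEY` are assumptions made for this example, not something the talk specifies. A real deployment would typically load the key from secure configuration and add an expiry.

```python
import hashlib
import hmac
import secrets
import uuid

# Assumption: in practice the key comes from secure configuration,
# not from a value regenerated on every process start.
SECRET_KEY = secrets.token_bytes(32)


def _signature(user_id: str) -> str:
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()


def sign_cookie(user_id: str) -> str:
    """Return a cookie value of the form '<user_id>.<hmac>'."""
    return f"{user_id}.{_signature(user_id)}"


def verify_cookie(cookie: str):
    """Return the user ID if the signature matches, otherwise None."""
    user_id, sep, mac = cookie.rpartition(".")
    if not sep:
        return None
    expected = _signature(user_id)
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return user_id
    return None


user_id = str(uuid.uuid4())  # server-generated, hard to guess
cookie = sign_cookie(user_id)
forged = "someone-else." + cookie.rpartition(".")[2]  # reuse the old signature

print(verify_cookie(cookie) == user_id)
print(verify_cookie(forged))
```

Because the signature depends on both the user ID and a key that only the server holds, changing the ID in the cookie without knowing the key makes verification fail.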
You also want to make sure that your cookies are set with the Secure flag, the HttpOnly flag, and the SameSite flag. And these are all things that help to protect your cookies from being stolen or manipulated by an attacker. The next best practice I wanted to talk about is using multi-factor authentication. And what I mean by multi-factor authentication is requiring more than one factor to authenticate a user. So, this could be something like a username and password, plus a one-time code sent to their phone. This could be something like a username and password, plus a physical security key. And the reason why this is so important is because it significantly increases the security of your users' accounts. Even if an attacker is able to steal a user's password, they still won't be able to sign into their account unless they also have access to their phone or their security key. And this is something that we highly recommend for all services. The next best practice I wanted to talk about is using account recovery flows that are secure and usable. So, when a user forgets their password or loses access to their account, you want to make sure that you have a way for them to regain access to their account that's both secure and usable. And what I mean by secure is that it's not guessable, it's not shared, it's not client-controlled, it's not mutable. And what I mean by usable is that it's easy for the user to understand and to follow. And this is a really important balance to strike, because if your account recovery flow is too secure, it might be unusable for your users. But if it's too usable, it might be insecure. So, you want to find that sweet spot in the middle. And then finally, the last best practice I wanted to talk about is logging and monitoring. So, you want to make sure that you're logging all of your authentication events. 
You want to be logging when users sign in, when users sign out, when users try to reset their password, when users change their email address, things like that. And you want to be monitoring these logs for any suspicious activity. So, if you see a user trying to sign in from a new country every five minutes, that's probably something you want to look into. If you see a user trying to reset their password 100 times in a row, that's probably something you want to look into. And this is something that can help you to detect and respond to attacks more quickly. So, just to recap, we talked about how most of the web identifies users today. We talked about what it means to be a user of your service. We talked about some common anti-patterns that you should avoid. And then finally, we talked about some best practices that you can use to improve the security and the usability of your own services. And if you want to get in touch, my email address is there, and I'll be happy to take any questions that you have. Thank you.
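The monitoring advice in the talk can be sketched in a few lines of Python. The event format (dictionaries with `user_id` and `type` keys), the function name `flag_suspicious_resets`, and the threshold of 5 are all hypothetical choices made for this example. A production system would more likely query a log store and take time windows into account.

```python
from collections import Counter

RESET_THRESHOLD = 5  # hypothetical cutoff, tune for your own service


def flag_suspicious_resets(events, threshold=RESET_THRESHOLD):
    """Return user IDs with at least `threshold` password-reset attempts."""
    counts = Counter(
        e["user_id"] for e in events if e["type"] == "password_reset"
    )
    return sorted(uid for uid, n in counts.items() if n >= threshold)


# Assumed sample log: one user resets 100 times, another resets once.
events = (
    [{"user_id": "u1", "type": "password_reset"}] * 100
    + [{"user_id": "u2", "type": "sign_in"}]
    + [{"user_id": "u2", "type": "password_reset"}]
)

print(flag_suspicious_resets(events))
```

With the sample events above, only the account with 100 reset attempts is flagged for review.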
