Thumbnail for For Your Eyes Only: Roles, Privileges, and Security in PostgreSQL | Citus Con 2023 by Microsoft Developer

For Your Eyes Only: Roles, Privileges, and Security in PostgreSQL | Citus Con 2023

Microsoft Developer

29m 17s4,889 words~25 min read
YouTube auto captions
Transcript source

YouTube auto captions

This transcript was extracted from YouTube's auto-generated caption track. The transcript below is server-rendered so it can be read, searched, cited, and shared without opening the original YouTube player.

Timestamped outline
Pull quotes
[0:06]Hi everyone, my name is Ryan Booz, I'm a developer advocate with Redgate Software, and I'm so excited to be with you today for SitusCon 2023.
[0:06]The second one we've had, and uh it's great to see the lineup and so many excellent talks uh this year.
[0:06]I'm going to talk to you today about PostgreSQL uh privileges, roles and security, and how you can uh take better advantage of them within your Postgres installation.
[0:06]So again, I'm Ryan Booz, I'm a PostgreSQL and DevOps advocate with Redgate Software.
Use this transcript
Related transcript hubs

[0:06]Hi everyone, my name is Ryan Booz, I'm a developer advocate with Redgate Software, and I'm so excited to be with you today for SitusCon 2023. The second one we've had, and uh it's great to see the lineup and so many excellent talks uh this year. I'm going to talk to you today about PostgreSQL uh privileges, roles and security, and how you can uh take better advantage of them within your Postgres installation. So again, I'm Ryan Booz, I'm a PostgreSQL and DevOps advocate with Redgate Software. Uh, there's my contact information on Twitter and LinkedIn and and please feel free to visit my blog where I am doing more with Postgres content specifically and trying to involve the community in other blogging events. This presentation and any others that I give, I do my best to get into my GitHub repos quickly as I can. The slides usually in a PDF form and any scripts that I show, if I do SQL or something like that, should be with the presentations and I do my best to keep them updated as we go. So we have a very short period of time this uh today, we have about 25 minutes and so there's a lot to get through and this is going to be a whirlwind tour of the basics around PostgreSQL roles and privileges and how ownership really impacts the working of your database. And so let's go ahead and get started. Two disclaimers before we get going. One is that obviously this talk won't be able to cover everything. We only have now about 23 and a half minutes, so we're going to move quickly. My goal is that something here might spark uh, you know, an idea of a trouble or a situation you've had in how you might use some of this information to go dig further and understand what you need to do to overcome that problem. And then the second is that everything I'm talking about here should be applicable in one form or another to Postgres 11 plus, those are the currently supported versions of Postgres. Now there are some things that are only mentioned for maybe say Postgres 14 and above or Postgres 15. I'll do my best to note that as we move along. So let's go ahead and dive into the wonderful and yet somewhat complex world of Postgres uh roles and privileges. So here are the building blocks that I want to make sure we are all on the same page with. Now sometimes it can be challenging, uh particularly in this very hosted cloud-hosted connected world, to think about what's really happening at the server level. So with Postgres, we talk about these things called clusters. That's just a terminology that is a little bit confusing. The cluster is like the running instance of the Postgres back end on the computer, whether it's a VM or local computer or Docker image. You can have multiple clusters, those are the running backends on one computer, they just all have to be on a different port. So you could host Postgres 12 on a port and Postgres 13 on a different port. They're all hosted within one computer, but they are serving and they are completely independent of one another. So just recognize that the term cluster does not actually mean multiple Postgres instances working together. Now, when it comes to authentication and how then those things work together within the cluster itself, the actual running instance of Postgres, we have these two other concepts that are really uh integrated together within talking about roles and privileges. So one is is roles, we have users and groups and so forth, they exist at the cluster or the instance level, and also databases, and then all the objects contain therein. But they aren't actually separate. They really are very co-mingled together. You can't have a user without a database and you actually can't have a database or objects in that database without users and roles. The reason is, databases need to be owned by somebody or something and those roles can't log in unless there's the database that's owned. And so we'll see how that works together. So they are both existing at the server level, but they are also very much integrated in how they work together. Now, when it comes to authentication, we really have this concept of, you know, how do you first connect to the Postgres server? And that is controlled by this pg_hba.conf file that stands for host-based authentication. And you can think of it like a firewall rule set for PostgreSQL. It specifies through configuration which hosts and roles can connect to what databases using what authentication method? Now we're not going to spend much time on it here because the reality is this this is what the HBA Conf file a template might look like. But the reality is that most of us, unless you are actually a DBA managing Postgres instances on VMs and servers for other people, probably don't have to think about the HBA Conf file that much anymore. Most of the hosting providers that that you spin up instances for are taking care of this at some level for you, or they provide a console maybe to help you manage this. It is worth understanding and knowing about. You cannot connect to Postgres at all or any database if this isn't set up correctly, so I've given you a link there to go read more about it and understand. The two things I would mention is this just as a word of warning. The first is, avoid using trust as an authentication method in in really every case. The only case it might be viable is a very local 100% developer running instance of Postgres. You just don't want to worry about passwords and authentication, but trust can be very dangerous and in today's uh, you know, Postgres community we would say there's almost no reason you should ever use trust. So, really think about it if if that happens to be how you've set something up. And then the second is, if you're using passwords, there are a number of good talks over the last couple years about this change, you should move to scram-sha-256. It's a newer authentication hassing, password hashing within the database and you should work towards moving there. So that's as much as we're going to talk about as far as connecting to the database server itself. We know that we have a server, a host, we have a running instance of Postgres that we can authenticate to as long as the uh HBA file set up correctly. But now we have to actually work towards the uh users and roles themselves. And so roles are um, really the central part of what hap is happening in Postgres with the authentication. Uh, so roles own database objects, tables, functions, et cetera. They have cluster-level privileges, which we call attributes, we'll look at in a second. Roles can be granted uh, you know, privileges within the database, you know, to do various things on objects like tables and stored procedures and functions and and so forth. And if they're given the correct permissions or privileges, they can also grant privileges to other roles through something called inheritance, which we'll look at later. Now, you'll notice that I've been saying roles and not users and groups as much as possible. In Postgres terminology, all authenticated kind of principles are known as roles. The concept or the actual terminology of user and group within the database code was removed back in Postgres 8, I think it was 8.2 or so. By convention, we in the Postgres community, we refer to user roles as roles that are allowed to log in. And we refer to group roles as roles that are not allowed to log in. And then we can use them through inheritance to begin to, you know, share privileges in relation to, you know, ability to do things in the database. And really, I mentioned this for this reason. There are still aliases uh in Postgres code for create user and create group. You can actually run these commands to create a user which is allowed to log in and a group uh, which by default is not allowed to log in. But those are simply aliases to the create role command and that really is what you should be using now and specifically say whether or not the role can log in or not. Now the role attributes are these predefined settings that really exist kind of at the server level, right, the cluster, the instance running instance level. Uh they're um, they really uh, if you look in the catalog, the PG catalog PG roles table, you'll see that they co uh coordinate with the columns in that table. And so it's kind of a boolean, does the role have this attribute or not? Are they a super user, are they allowed to create databases or it's false, they aren't allowed to do those things. These are the current role attributes in Postgres 15. Now I I don't actually think these have changed in in a number of versions, so these probably are the same in Postgres 9 and 10 as well. The ones that we'll be touching on over the next 10 minutes or so are the ones that are underlined, which is what most users uh really can be coordinated at in some level. Bypass row level security, connection limits and replication, really become a a more specific attribute that you have to deal with if you're managing a uh a larger database environment. The only thing I'll mention is that if you simply create a role and don't provide any attribute specific settings, every role by default, Unless otherwise set, uh when you create a new user with no attributes, they will have uh the only the ability to inherit privileges from other roles and they will have unlimited connections. And so uh, it doesn't necessarily mean one thing or the other. It's just nice to know that when you set up a user by default, that is all that they are allowed to do. Now there are special roles within Postgres. Now we've probably all heard of the super user role and I think of super user, I think about some kind of super hero and usually the super hero that's thoughtful and and uh, you know, the neighborhood friendly Spider-Man, right? Knows exactly how to take care of everything that's happening within the database and has this privilege for a good reason. The reality is, as Postgres has grown and developed, we've come to understand that Postgres super user is much more like the ability to wipe anything out that they want. We need a super user to actually manage the database at some level, but there's a lot of power inherent at that role. And so, when you create a database because it's required, you will always get a super user within a Postgres cluster. The name of that user is the name of the running, whoever owns the running Postgres process is what will be created as the user. So typically it's Postgres. And then that user has to have a database to connect to and that's usually the Postgres database. So it can be changed, it's kind of a little bit, it's a little bit difficult and not often recommended to do so, but just recognize that that's how it works. Now the thing that's really important to know though is the super user bypasses all privilege checks except for login. And that's just kind of part of the Postgres code base. If uh a super user tries to delete a table, they'll be able to delete that table, drop that table because there's no privilege checks done on a super user role. And so really is something you need to be uh sensitive to. You need to really treat the super user like you would a root user say on Linux or something to that effect. Most cloud providers for that reason will not provide your super user access. So if you spin up a new instance of a running process in Azure or Amazon, you will not get a super user role most of the time, unless it's a contained running VM or instance in that case. Instead you want something that's very super user-like, a role that has just enough privilege but knows how to be responsible with that privilege and is looking out for your good. And so we in the Postgres space, we would say that the right level of control might differ depending on what you need to do. Uh the recommendation, if you are managing most of your own services, would be to have a role that can create other roles and can create databases. Again, databases need to be owned, they need to be managed, and so someone that can manage roles and databases is advantageous. The only limitation may be that if you don't have a super user, there are some things that might not work like maybe installing some extensions that require a super user privilege. Understand that as you move ahead with management. And then there's one last role that we need to talk about and that is the public role. Again, a role that's often misunderstood, particularly for new users of Postgres. All roles are granted implicit membership to public. Public is kind of like a code-based role. It's not something you can delete, it's not something you can change. Everybody always has membership in public. You can manage the privileges that public is allowed to have on a database. And so that is a way that you can manage what happens. Now by default, if you do nothing, the public role, which everyone is a member of, has the ability to connect to a database. It has usage, temporary and execute. Now again, connect simply means they're allowed to connect, usage means if they have the privilege to do something like select from a database then they have the usage ability on the database to then do that operation. So these permissions don't actually give a user any privilege to do anything except connect, but it's nice because otherwise you would have to specifically grant those privileges to every role as you create them, which becomes really cumbersome. That's one of the reasons the public role exists. Now Postgres 14 and below, it also had the create permission on the public schema, and we've come to find that's a little bit of a security problem if you don't understand or don't remove the right privileges. And so with Postgres 15 and above, there's no longer the create privilege on the public schema for the public role. And that's why the security best practice, which is actually been out there for many years if you do any kind of search on best practices with Postgres and security, you'll find that the recommendation is that when you spin up a new database, you remove all privileges on the public schema from the public role. And maybe remove all database privileges from the public role. That would mean that you will then have to go back and grant connect and use the privileges to all of the roles that you want to be able to do that, but this allows you to have full control over the database. Now we've talked about privileges a few times now. So we have a data we have a server, we have a running instance of Postgres, which we understand has roles. Those roles have some kind of privileges, what are those privileges? They are the access rights to the database and the database objects. Think select and update and usage and delete, things of that nature. Now they can be granted or revoked to a role by an authority that has something like create role privilege. So they're allowed to manage the role for a user, either a database level and so forth. Now, any time you explicitly grant or revoke a privilege, that is for only for the objects that are in existence at that moment. So if I were to say grant select on all tables to all privilege, to all uh roles, it would only grant access at that moment for any objects currently in existence. These are the current privileges in Postgres 15, and the ones that I underlined again are the, as of Postgres 15, are the default privileges given to public, which then everyone inherits from by default. Uh so let's talk about a couple of things where it comes to privileges. It's fairly easy to to apply privileges at a database, schema, or an object level. We say grant the kind of privilege we want to grant, on what kind of object, a database, a schema, maybe a table, to whom uh, and and you know, what role is it that you're actually applying it to? And the the confusing part can be at times, specifically, you know, these little objects in between. So what does create mean in context of a database? What does create mean in context of a schema? Well, the way that we apply that, um and again just to, I forgot to have the slide here. One more reminder that anytime you run a grant statement manually, it only applies to the objects currently in force. To kind of figure out how various privileges like create impact different objects like schemas or databases, you can tie these two documentation web pages together. To the grant and revoke web page will show you the syntax for granting and revoking on various object types. And then when you look at the privilege that you can grant or revoke, you can go to the DDL priv page and that'll actually show you the or give you the description of what that privilege means in context of that kind of object.

[16:29]All right, we're almost there. Now that we have these roles, we've granted them privileges, how do we actually manage privileges across a multi-tenant database, multi-user database, maybe in my dev environment where I have many developers accessing a common database? And that's where role inheritance and privilege inheritance comes into play. So remember I said that you can grant membership into a role, just like you can grant privileges. This is really the the preferred way for managing roles within a Postgres database. You create groups that cannot log in and you provide uh the privileges to those roles for the various object types like tables and schemas and uh functions. And then you grant membership into those roles to the user roles that can log in.

[17:46]Now graphically, that might look something like this. This is probably not surprising, but if I had maybe a schema called app, and I had just granted access uh at various levels to these group roles, I can then apply membership and they overlap. Basically, the the roles inherit all of the privileges of all the roles that they are a member of, so they can do uh what they need to to various objects. And that really brings us to the last major piece of roles and privileges within Postgres, and that is object ownership, probably the one last culminating piece that really can be frustrating to a lot of people. In Postgres, your running instance, remember that roles are an instance level concept. You might have many databases and those roles have privilege to do something in the database. But if I backup that database and move it to another server that doesn't have those roles, I would be stuck if Postgres can't do something else. And the way that Postgres works is that whenever you do something like a restore or you create an object, whoever's creating the object at that moment will become the owner of that object. And the owner of the object really has special privileges. It's kind of like a super user of that specific object. And so, in Postgres, we really treat privileges, really with this concept of the principle of least privilege. Meaning, when I create an object, only the owner has access at that moment, or a super user. And then the owner has to decide what kind of privilege do I want to grant to other roles in the system. And so it's very methodical about how uh other roles get access. Now, uh, one of the problems can be then, you might imagine if I have many different kinds of roles, connecting to a database and creating many different kinds of objects, this gets really cumbersome to manage access. Instead, what we want to do, if you notice at the bottom, is we want to really think about creating a role, maybe a typically a group role that then has uh is the creator of all my objects. I can I can log in as someone else, I can switch to that role, create my objects. Now, I only have to manage privileges from one role, which is a lot more manageable over the life of a database. And that then uses what we call default privileges. So remember I said that as a user, I create an object. Let's say a table. And now I have to actually manually give privileges to other roles, that's really cumbersome uh and and just hard to manage in total. Instead, I can tell Postgres ahead of time, I want to set a default privilege in this case of select to the public role anytime I create a table. And because everyone else, all the other roles are part of the public role, they will then have select on the table. And then I don't have to do it every time I create an object and then grant privilege, create an object and then grant privilege. So default privilege really are a key component in providing access to other objects within your database. And so, in just a moment, I'm going to show you a very brief demo of what that looks like. And you just to recap, you really have four options here. One is you could explicitly grant permission every time you create an object. Absolutely, it will work. You'll get the privilege access you want, but it will be a little bit cumbersome. Second, as the owner of the objects, all of maybe many developers, they could all set their own default privileges so that when they create objects in the database, the other roles get access as well. I would say that the way that you really want to work particularly in an application environment is to have that application role, like a group role, that you can set and um then that role is where you set your default privileges. And so things like migration scripts, you would run those scripts as your app role and then you would only have to manage default privileges from one specific role type within your database, a lot easier to manage. Now, there are these new things called predefined roles in Postgres. They've been around for a couple of versions now. Uh in Postgres 14, we added a pg read all data and because that is a cluster level setting, you can apply this role to any role in the database and it as long as they can connect to a database, they can select from every table. It might not be exactly what you want, but it's something to think about.

[22:42]So in two minutes or less, I'm going to briefly show you a demo of what that might look like. So I've created exactly the uh situation that I had earlier. I have a developer no login role, therefore it is a group role by convention. I've created some select privileges and the ability to create objects on the public schema. I created two developer roles, and those developer roles, I then granted membership to the developer group role. And so now I'm going to pretend that I've logged in with the the user role dev one. In Postgres, anytime I set role, you can think of that as connecting to a separate session as that user and I get that role's privileges.

[23:44]So I'm going to connect as dev one and I'm going to go ahead and create a table. And I can see that that table's owned by dev one, I can see it down here, table owner dev one. Now, I'm going to pretend to log in as dev two and you saw above that dev one and two have the exact same privileges. They had nothing else given, they were simply granted uh membership into that developer group role that had select and create privileges. So I'm going to go ahead and alter this table because I would assume that I could do that, but because dev two is not owner of that table, I cannot alter that table. And that's one thing that can really trip you up. And so I'm going to go ahead and delete that table, drop that table again. And now this time I'm going to connect again, we're pretending to connect as the developer role. Because I am a member of developer, I can set my role to the group role, even though they're not allowed to log in, I can create that table again. And now you'll see that it is owned by the developer role. And as dev two, I can now alter that table. I can even select from it.

[25:22]Because I am a member and that role owns it, I have the ability to do things with it.

[25:35]Now, the other thing to recognize is that this can impact things like other users that you need to provide some level of access. So remember I said that we could alter the default privileges. So let's see what that looks like briefly. I'm going to create a new report user role, and maybe this is a role that just has to create some read-only select data. So I'll create that role. I'm going to go ahead and switch over to that role. And now, as we have learned, I can't select from that table. I'm not an owner, and no one has granted me privileges. Now I could do that manually. So I'm going to go ahead and grant select to that user. And that will probably get me the ability to select from that table. So as report user, I can now select from that table, but as we know, that's a little bit cumbersome. So what we said as the option number three was, here's what I'm going to do. I'm going to go ahead and go back to the developer role. I'm going to go ahead and create a new table. And you'll see that I still can select because I I still haven't been given those privileges. So what we're going to do instead is we're going to recreate that table after we set some uh default privileges. Now remember, I have to set these default privileges as the role that will create and own the object. So as the developer, I'm simply going to create a default privilege that says grant select on the tables that I create to the RPT user. Then I'm going to create that table, and let's see if that worked. I have not explicitly granted access to anything, and now that report user can actually look into those tables. So, uh you'll see that actually managing roles can be uh fairly straightforward with impose with a little bit of forthought. The last very brief thing to talk about is predefined roles. And so I did mention them briefly. Uh it's something that's been we've been working on in the last couple of versions of Postgres. Uh things like read all data. These are roles that are applied at the cluster level and they apply to the role across the cluster depending on what that privilege is. Now there are a lot of them, and this link will take you to the page in documentation that talks about what they are, and you'll notice in Postgres 15 and 16, there are more more roles coming to fruition, really to help management of your Postgres database without having to be a super user. That's really the reason that this is getting a lot of attention in uh roles and privileges within Postgres. I hope that is helpful and really gives you a a first step into what that might look like to to more efficiently manage roles and privileges within Postgres, and and hope this has been helpful today. Please, uh take a moment to check out the other sessions at SitusCon and and download this session if you uh want to, and always feel free to reach out. It's been a pleasure. Take care.

Need another transcript?

Paste any YouTube URL to get a clean transcript in seconds.

Get a Transcript