[0:00]Hello, this is a demonstration of how to work with large data sets in Power BI and how to make sure that the experience for your users is very smooth. And one of the things that we've found is that if you use an aggregation in your model, you can do some really interesting things. For example, you can take a look at a data set like this. It's a fairly large data set. I have just under 13 million rows in this particular data set that I'm working with. And when I connect up to this particular data set, I can use the aggregation feature to bring in the larger data set and perform some interesting queries against it. So let me show you how this works. I'm going to jump over to Power BI Desktop, and I have connected up to that particular data set. And one of the things that I've done is I've brought in the data and I've created an aggregation table. Now, the way that the aggregation table works is if I go to the transform data section, I have two tables in here. One is my fact table. And again, this is about 13 million rows of data, and it is a direct query connection. And then I have my fact table that has an aggregation. Now, this is important. My aggregation is in import mode, and the aggregation is based off of the grouping that I do in here. So if I was to scroll over to the right here, you'll see that I'm doing a group by operation, and I'm basically grouping by the order year and the order month, and the country and the product name. And then I'm just taking the sum of the sales for that particular grouping. And this particular data set, because I've grouped it, is much, much smaller. So it's down to about 25,000 rows in this particular data set now. What happens is once I have done this grouping and I have saved this as an imported table, I can then go ahead and manage the aggregations. And the way that I do that is I right click on the table that I want to aggregate, and I can say Manage Aggregations. And you'll notice that the manage aggregations is going to allow me to specify the granularity of this particular aggregation. So for example, for my order year, I'm saying that's grouped by order year. For my order month, that's grouped by order month. For my country, that's grouped by country. For my product name, that's grouped by product name. And then for my sales, I'm doing a sum operation. Now this is really important because what happens is when I create a visualization, for example, like this, where I'm doing the sales by country and by year. Power BI is smart enough to know that it needs to go to the aggregated table for that particular query because I'm querying at a level of granularity that the aggregation can handle. So in other words, if I'm grouping by year and country, my aggregation has year and country. So it knows to go to that smaller data set that's in memory that's in import mode. So when I go and create a report page that leverages this, you'll notice how quickly this report page renders. So for example, I'm going to go to a blank page here, and I'm going to add in my sales data. And I'm going to go ahead and add in my year and my country, and let me move my sales to the values and my country to the legend. And you'll notice that this renders very quickly. In fact, if I was to go and monitor this with something like a performance analyzer, you would see that the query that's being sent by Power BI is going to the in-memory aggregation. It's not going to the direct query data set, and that makes this very performant. So this is an example of how you can use aggregations to work with large data sets and still have a very performant experience for your users.
Watch on YouTube
Share
MORE TRANSCRIPTS



