It has been a minute since I have posted. I could tell you that I have been too busy to post, but alas, that wouldn’t be the truth. The truth is that I struggle with maintaining a regular schedule for posting. I admire the bloggers/posters who can stay on a consistent schedule. But hey, I am not going to let me inconsistency stop me for jumping back in. So I will continue my series on basic Power BI concepts, but I wanted to kick off this round with an Aha moment that I had yesterday. (Some housekeeping: I am using the Northwind dataset, which is a free, practice dataset created by Microsoft.)
I have been practicing my DAX skills in DAX Studio; specifically I was practicing using SUMMARIZECOLUMNS. The best way for me to learn anything is to perform steps iteratively. So I start with a very simple DAX expression and then build it out to be more and more complex. That’s way I can see what works and where it stops working. As I said, I was working on SUMMARIZECOLUMNS. Here are the expressions I used:
“Sales Amount”, [TotalOrderValue])
Commentary: here I am including the ProductID column from the ‘Orders Details’ table, and I am creating another column called “Sales Amount”, which will contain the values that result from the measure [TotalOrderValue].
Easy peasy so far right? From there I progressively built it out. (I will spare you the interim steps.) I finally ended up with this example:
“Sales Amount”, [TotalOrderValue],
“Average Order Value”, [AverageOrderValue])
Here’s the critical thing to notice. I am using columns from multiple tables. Specifically:
[ProductID] from the Orders Details table
[ProductName] from the Product table
[CategoryName] from the Categories table
It didn’t surprise me that I could pull columns from multiple tables, because I knew that these tables are connected by one-to-many relationships. Here’s a picture of the data model.
Because I am a trainer, I think about why this is working. Because the direction of the relationships, intuitively I would have thought that it wouldn’t work. But it does, inarguably. Why? This is where I had my Aha moment.
I remembered a concept called “expanded tables”. I read about this concept, and like a lot of concepts I read about, I didn’t fully understand it. But it is the concept of expanded tables that makes my example with SUMMARIZECOLUMNS work. When two tables are connected with a one-to-many relationship, the downstream table contains all of the columns of the upstream table. In this example, it means that the Orders Details table contains all of the columns from the Product table, and the Product table contains all of the columns from the Categories table. This is a very good reason to make sure that you strip out any unnecessary columns when you are transforming the data. It is also something you need to be aware of when you use a filter function. Try to limit the filtering to a specific column (or set of columns) and not an entire table, because behind the scenes the table you are filtering probably contains many more columns than you are seeing.
For a great video on this concept, please go to SQLBI.com and watch Marco’s whiteboard video. Here’s the link: