When you connect to a relational database like SQL Server in Power BI/Power Query/Excel Get & Transform you have two choices about how to get the data you need:
- You can choose a table from the database and then either use the Query Editor UI or write some M to get the data you need from that table. For example, you might choose a table that has one row for every product that your company sells and then, using the UI, filter that down to only the products that are red.
- You can enter a SQL query that gets the data you need.
Something that you might not realise is that if you choose the second option and then subsequently use the UI to apply even more filtering or transformation, then those subsequent steps will not be able to make use of query folding.
As an example of the first option, imagine you connect to the DimProduct table in the SQL Server Adventure Works DW database.
The following M query is generated by the Query Editor when you filter the table to only return the red products and remove all columns except EnglishProductName. That’s very easy to do so I won’t describe it, but here’s the M:
    let
        Source = Sql.Databases("localhost"),
        #"Adventure Works DW" = Source{[Name="Adventure Works DW"]}[Data],
        dbo_DimProduct = #"Adventure Works DW"{[Schema="dbo",Item="DimProduct"]}[Data],
        #"Filtered Rows" = Table.SelectRows(dbo_DimProduct, each ([Color] = "Red")),
        #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"EnglishProductName"})
    in
        #"Removed Other Columns"
Using the View Native Query option, you can find out that the following SQL is generated to get this data:
    select [_].[EnglishProductName]
    from [dbo].[DimProduct] as [_]
    where [_].[Color] = 'Red'
It’s pretty clear that query folding is taking place for the filter on “red” and for the selection of the required column.
However, if you enter the following SQL query when you first connect to the database:
select * from dimproduct
And then, after that, filter the table and remove columns in exactly the same way, you get the following M query:
    let
        Source = Sql.Database("localhost", "Adventure Works DW", [Query="select * from dimproduct"]),
        #"Filtered Rows" = Table.SelectRows(Source, each ([Color] = "Red")),
        #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"EnglishProductName"})
    in
        #"Removed Other Columns"
If you now try to use the View Native Query option on either the Removed Other Columns or Filtered Rows steps you’ll find it’s greyed out, indicating query folding is not taking place for those steps.
The SQL query you entered is run exactly as written, and Power BI then applies the filter and selects the column itself on the result set that the query returns.
This can have big implications for performance: in this example every row and every column of DimProduct is returned from the database before the filter is applied. The lesson here is that if you’re going to write your own SQL query in the Query Editor, you should make sure it does all of the expensive filters and transformations you need, because anything else you do in the query will happen outside the database in Power BI or Excel.
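As a minimal sketch of that advice, assuming the same localhost server and Adventure Works DW database used above, the filter on Color and the selection of EnglishProductName can be moved into the SQL text itself, so that no further steps are needed in the Query Editor:

```
let
    // The WHERE clause and the column list are part of the SQL statement,
    // so the database does the filtering and column selection,
    // not Power BI or Excel.
    Source = Sql.Database(
        "localhost",
        "Adventure Works DW",
        [Query = "select EnglishProductName from dimproduct where Color = 'Red'"]
    )
in
    Source
```

With this version only the red products' names are returned by the database. Any steps you add after Source would still be evaluated outside the database, as described above.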