When you import data from a relational database like SQL Server in Power BI, you have the option of entering your own SQL query to use as a starting point:
Here’s the M code for a query that does this:
let
    Source = Sql.Database(
        "localhost",
        "AdventureWorksDW2017",
        [Query = "SELECT [DateKey]#(lf) ,[FullDateAlternateKey]#(lf) , [DayNumberOfWeek]#(lf) ,[EnglishDayNameOfWeek]#(lf) FROM [AdventureWorksDW2017].[dbo].[DimDate]"]
    )
in
    Source
If you’re confident writing SQL, this might seem like a good option. However, as I said in this blog post, it has the side effect of disabling query folding inside the Power Query query, so any other transformations you add will always be performed inside the Power Query engine, which may be less efficient than performing them in the data source.
There’s also another drawback: when you refresh your dataset in Power BI Desktop (although not in the Power BI Service) you’ll see that your SQL query is run twice. Here’s the evidence from SQL Server Profiler showing what happens when the query above is refreshed in Power BI Desktop:
If your query is slow, or if each query execution costs you money, then this is something you want to avoid.
Why is this happening? It turns out this is just another example of what I blogged about here: Power BI wants to know the schema of the table before the query actually runs, so it asks Power Query to return the top 0 rows. Unfortunately, in this case query folding can’t take place and the top 0 filter can’t be pushed back to the database, so the entire query gets run once to get the schema and once to get the data.
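To make the schema request more concrete, here’s a rough sketch of what it amounts to in M. This is only an illustration: I’m assuming the request behaves like Table.FirstN with a count of 0 applied to the query’s output, and the step name SchemaProbe is made up for this example.

```
let
    Source = Sql.Database(
        "localhost",
        "AdventureWorksDW2017",
        [Query = "SELECT [DateKey]#(lf) ,[FullDateAlternateKey]#(lf) , [DayNumberOfWeek]#(lf) ,[EnglishDayNameOfWeek]#(lf) FROM [AdventureWorksDW2017].[dbo].[DimDate]"]
    ),
    // Hypothetical step: an approximation of the "top 0 rows" request.
    // The result has no rows but keeps the column names and types.
    // Because Source is a hand-written SQL query, this filter can't be
    // folded, so the full SQL statement is still sent to SQL Server.
    SchemaProbe = Table.FirstN(Source, 0)
in
    SchemaProbe
```

If the same step were applied to a table or view selected through the normal navigation steps, the filter could fold back to the database instead.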
The solution is also the same as in the blog post I just mentioned: use the Table.View M function to hard-code the schema returned by the query and implement query folding manually. Here’s the adapted version of the query:
let
    Source = Sql.Database(
        "localhost",
        "AdventureWorksDW2017",
        [Query = "SELECT [DateKey]#(lf) ,[FullDateAlternateKey]#(lf) , [DayNumberOfWeek]#(lf) ,[EnglishDayNameOfWeek]#(lf) FROM [AdventureWorksDW2017].[dbo].[DimDate]"]
    ),
    OverrideZeroRowFilter = Table.View(
        null,
        [
            // Hard-coded schema, so the SQL query doesn't have to run just to discover it
            GetType = () =>
                type table[
                    DateKey = Int32.Type,
                    FullDateAlternateKey = DateTime.Type,
                    DayNumberOfWeek = Byte.Type,
                    EnglishDayNameOfWeek = Text.Type
                ],
            // A normal request for data runs the SQL query as before
            GetRows = () => Source,
            // A request for the top 0 rows returns an empty table with the
            // hard-coded schema and never touches the database
            OnTake = (count as number) =>
                if count = 0 then
                    #table(GetType(), {})
                else
                    Table.FirstN(Source, count)
        ]
    )
in
    OverrideZeroRowFilter
Generally speaking, I think there’s a lot to be said for creating views (if possible) instead of embedding your own SQL into a Power BI dataset – it makes maintenance and tuning much easier, and of course if you can connect straight to the view without writing any SQL in Power BI, then query folding will work and Power BI Desktop will only query the view once when you refresh.
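As a minimal sketch of what connecting straight to a view looks like, assume the SELECT statement above has been saved in the database as a view. The name dbo.vDimDate is hypothetical (nothing with that name is mentioned earlier in this post), and the navigation step follows the pattern the Power Query Editor normally generates when you pick a table or view from the Navigator:

```
let
    // No Query option here, so no hand-written SQL is embedded in the dataset
    Source = Sql.Database("localhost", "AdventureWorksDW2017"),
    // dbo.vDimDate is a hypothetical view wrapping the SELECT statement used earlier
    DimDateView = Source{[Schema = "dbo", Item = "vDimDate"]}[Data]
in
    DimDateView
```

Because there is no custom SQL, any transformations added after this step remain candidates for query folding.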