A really useful new Power Query performance enhancement was added to Power BI Desktop in an update to the May release via the Microsoft Store a week or so ago (if you’re not installing Power BI Desktop through the Microsoft Store you’ll have to wait for the June release I’m afraid). You can read the documentation here:
https://docs.microsoft.com/en-us/power-bi/create-reports/desktop-evaluation-configuration
However if you have just read the docs you may be wondering what these two new registry key settings actually do. In this post I’m only going to talk about one, MaxEvaluationWorkingSetInMB; I’ll leave ForegroundEvaluationContainerCount for a future post.
At various times in the past I have blogged about how, when you run a Power Query query, the query itself is executed inside a separate process called an evaluation (or mashup) container and how this process has a limit on the amount of memory it can use. Some transformations such as sorting a table, doing a group by, pivoting and unpivoting require an entire table of data to be held in memory and if these operations require more memory than the evaluation container is able to use then it starts paging and query performance gets a lot worse. This post provides more details:
Two things have now changed though. First of all, the default of amount of memory available to an evaluation container in Power BI Desktop has been increased from 256MB to 432MB. This on its own will make many Power Query queries run a lot faster. Secondly, it is now possible to define how much memory an evaluation container can use yourself via the new MaxEvaluationWorkingSetInMB registry setting described in the documentation.
Here’s an example that shows how much of an impact this can have. In Power BI Desktop I created a Power Query query that reads data from a csv file with around one million rows in it and then sorts the resulting table by the values in one column:
let
Source = Csv.Document(
File.Contents("C:\demo.csv"),
[
Delimiter = ",",
Columns = 16,
Encoding = 1252,
QuoteStyle = QuoteStyle.None
]
),
#"Sorted Rows" = Table.Sort(
Source,
{{"Column2", Order.Ascending}}
)
in
#"Sorted Rows"
Using SQL Server Profiler in the way described here, I found that the Power Query query took almost 87 seconds to start returning data and a further 19 seconds to return all the data:
What’s more, in Task Manager I could see that the evaluation container doing the work was limited to using around 423MB of RAM:
I then used Regedit to set MaxEvaluationWorkingSetInMB to 4096, giving each evaluation container a maximum of 4GB of RAM to use:
After restarting Desktop I reran the same query. This time Task Manager showed the evaluation container doing the work using around 1.2GB of RAM:
…and Profiler showed that the query started returning data after only 14 seconds and returned all the data in a further 12 seconds:
As you can see, that’s a massive performance improvement. Before you get too excited about this, though, a few things need to be made clear.
First, this setting only affects the performance of Power Query queries in Power BI Desktop. It does not affect the performance of queries in the Power BI Service, although there is another setting that (I think) will have the same effect for queries that go through an on-premises data gateway – but that’s yet another for a future post. So while this will make development much quicker and easier it won’t make dataset refreshes in the Power BI Service quicker.
Second, you need to be very careful when changing this setting. There’s no safety net here – you can set MaxEvaluationWorkingSetInMB to whatever value you want – and so some care is needed. When a dataset is refreshed then multiple evaluation containers may be used to handle the Power Query transformations, each of which can use the amount of memory specified by MaxEvaluationWorkingSetInMB. Since there’s a finite amount of memory on your development PC it’s important you don’t set MaxEvaluationWorkingSetInMB too high because if you do there’s a risk that Power BI will try to use more memory than you have available and bring your PC to a grinding halt. What’s more there’s no way of knowing how much memory any given query will need without some experimentation, so my advice is that if you do change MaxEvaluationWorkingSetInMB you should only increase it by a small amount and then increase it only if you are sure you need it.
I’d love to hear how much changing this setting improves the performance of your queries. If it does prove to be useful to a large number of people I hope we can get it added to the Options dialog in Power BI Desktop (which is much more convenient than changing a registry key); I also think it would be very useful in Excel Power Query. Please leave a comment with your findings!