The Power BI Preview includes the ability to share Power Query queries with other people in your organisation; I was thinking about writing a blog post on this topic, but to be honest the documentation covers it so well there’s little point repeating what it says. My assumption is that it will be data stewards who do the majority of query creation and sharing, and that it will be less able users who consume these prebuilt queries.
Sharing a query is all very well, but in Power Query it’s also possible to share functions (which are, after all, just a different type of query) too – which is the subject of this post. What’s the point of doing this, I hear you ask? Well, if a data steward shares a query, they are sharing a fixed table of data that the end user can then manipulate however they want. If however a data steward shares a function, this means that they are giving end users something slightly different – an interface that returns a different table of data depending on the parameters that are passed into it, and which may be much easier to use. Sharing a query is a bit like creating a view in SQL Server; sharing a function is more like creating a stored procedure.
Here’s an example. Let’s imagine that a user who only has the most basic Power Query skills wants to get some data from the DimDate table in the Adventure Works database. A data steward could share a query that returns the entire table, but if the user only wants a few rows from that table they are going to have to use the query as a starting point and apply a filter themselves in their own query. On the other hand, a data steward could share a function that applied a filter to the DimDate table, and then when the user came to use that function they would just have to fill in some basic parameter values and they could get the data they needed much more easily.
The M script for the function could be something like this (note: I created this function by first creating a regular query in Power Query then altering it manually to make it into a function – it wasn’t very difficult):
let
GetDates=(StartYear, StartMonth, StartDate, EndYear, EndMonth, EndDate)=>
let
Source = Sql.Databases("localhost"),
#"Adventure Works DW" = Source{[Name="Adventure Works DW"]}[Data],
DimDate = #"Adventure Works DW"{[Name="DimDate"]}[Data],
FilteredRows = Table.SelectRows(DimDate, each [FullDateAlternateKey] >=
#date(StartYear, StartMonth, StartDate)
and [FullDateAlternateKey] <= #date(EndYear, EndMonth, EndDate))
in
FilteredRows
in
GetDates
This function takes six parameters: the year, month and date for a start date and an end date. It then returns the rows from DimDate that fall between the start date and the end date. The data steward would just need to share this like any other query.
The end user, in Excel, would use the Online Search dialog in Power Query to find the function they wanted:
Clicking on Filter and Shape would bring up the Query Editor dialog:
They would then need to click the Invoke button to enter the parameter values:
And once they had clicked OK, they’d have the rows they wanted (though they would need to enable Fast Combine first):
The whole process isn’t as slick as I think it could be, but I still think it’s easy enough for anyone who has mastered Power Pivot. Of course this is a very simple example; the real advantage of this would come with more complex functions that contain expressions that an end user would never be able to write on their own.
One thing I’m still not sure of, though, is what happens exactly when a data steward updates the definition of a shared query or function. It seems as though once you have imported a shared query its definition is fixed, and even if the data steward updates the definition of the query any existing users of it are stuck with the old version (I’ve asked a question on the forum about this and will update this post when I get an answer). This might be the safe thing to do – after all, you don’t want a data steward changing what a shared query returns and breaking any code you have built on top of it – but it also means that if you did need to use the updated version you would need to import it a second time. It would be nice to have the option to refresh the definition of a query after you have imported it.
UPDATE: I got a reply to my question on shared queries from Miguel Llopis of the Power Query dev team:
“That’s correct: when sharing a query into the Data Catalog, you are sharing a copy of such query; when a user searches and uses a query from the results, they are making a local copy of the query into their workbook; when a shared query owner makes updates to the query, these updates overwrite the definition of the shared query in the catalog, but doesn’t propagate (or notify) changes to existing copies of the query.”