Channel: Chris Webb's BI Blog: Power Query

Avoiding Duplication Of Database Connection Information In Power BI


In a year’s time there will be a great opportunity for someone to present a session on “Power BI Development Best Practices” at the PASS Summit. Before then, we will all have to work out what those best practices actually are – probably the hard way. With that in mind, here’s a suggestion for one good practice that came out of a conversation at this year’s PASS Summit (thank you Prathy).

If you’re connecting to a data source like SQL Server in Power BI (or Power Query for that matter) you’re probably going to be importing multiple tables of data. For example, if I was importing data from my local SQL Server instance and the Adventure Works DW database, I might see something like this in the Navigator pane in Power BI:

image

Clicking the Load or Edit buttons would create five different queries to get data, one from each of the selected tables:

image

The problem here is that each query duplicates the connection information for the SQL Server database; for example the M code for the FactInternetSales query looks like this:

let
    Source =
        Sql.Database("chriszbook", "adventure works dw"),
    dbo_FactInternetSales =
        Source{[Schema="dbo",Item="FactInternetSales"]}[Data]
in
    dbo_FactInternetSales

That means that if you ever need to change the server or database that the queries point to – maybe because the server has migrated, or because you’re moving the reports from dev to test to production – then you have to edit each of these five queries. Which would be a pain.

Ideally Power BI would create a single connection that each of these queries could share, something like a data source in SSRS. In fact I can see that this has already been raised as a suggestion on the forum here and is under review; I’m sure some more votes would raise its profile. In the meantime, there are a number of ways you can avoid this kind of duplication by making your own changes to these queries.

One possible approach would be to create new Power BI queries that returned the names of the SQL Server instance and the database name, and for each of your main queries to reference these queries. To do this you would need to:

1) Create a new query using the Blank Query option:

image

2) Call the query SQLServerInstanceName:

image

3) Open the Advanced Editor window by clicking on the Advanced Editor button on the Home tab, delete all the automatically generated code in there, and replace it with the name of the SQL Server instance that you want to connect to, in double quotes:

image

4) Repeat steps 1-3 to create a new query called SQLServerDatabaseName that returns the name of the database that you want to connect to. It might also be a good idea to create a new group to separate these new queries from the ones that load the data:

image

5) Edit the queries that actually return data so that instead of hard-coding the instance name and database name, they take these values from the queries you have just created. A Power BI query can return a value of any data type (not just a table), and the queries created in steps 1-4 return values of type text – the names of the SQL Server instance and database to connect to. These queries can now be used as variables in other queries, so, after editing, the FactInternetSales query shown above would look like this:

let
    Source =
        Sql.Database(SQLServerInstanceName, SQLServerDatabaseName),
    dbo_FactInternetSales =
        Source{[Schema="dbo",Item="FactInternetSales"]}[Data]
in
    dbo_FactInternetSales

image

Now, if you ever need to change the connection you just need to change the values in these two queries rather than edit every single query that returns data.
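
As a rough sketch of how the pieces fit together, here are the two text queries from steps 1-4 alongside the edited FactInternetSales query. It is written as a single M section document purely for compactness, which is an assumption of this illustration: in Power BI Desktop each shared member below is a separate query, and in the Advanced Editor you would enter only the expression to the right of the equals sign (without the trailing semicolon). The values are the instance and database names from the original query, so substitute your own.

```powerquery
section Section1;

// Steps 1-3: a query whose entire body is a text value
shared SQLServerInstanceName = "chriszbook";

// Step 4: the same idea for the database name
shared SQLServerDatabaseName = "adventure works dw";

// Step 5: a data query that refers to the two queries above by name
shared FactInternetSales =
    let
        Source =
            Sql.Database(SQLServerInstanceName, SQLServerDatabaseName),
        dbo_FactInternetSales =
            Source{[Schema="dbo",Item="FactInternetSales"]}[Data]
    in
        dbo_FactInternetSales;
```

Because SQLServerInstanceName and SQLServerDatabaseName are ordinary queries, the other four data queries can reference them in exactly the same way.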

There are other ways of solving this problem: for example, you could have a query that returns the output of Sql.Database() (as used in the Source step in the FactInternetSales query shown above) and have all the other data load queries reference that. I prefer the approach I show here, though, because it makes it very easy to see the values for the SQL Server instance and database that are currently in use. If you’re being even more ambitious – maybe because you have many queries in many .pbix files that connect to the same database – you could even store connection information somewhere outside the .pbix file, maybe in another SQL Server database. But if you did that, you would then need to worry about the connection information for that database too…
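
For comparison, the first alternative mentioned above (one query that returns the output of Sql.Database(), referenced by all the data load queries) might look something like the sketch below. The query name AdventureWorksDW is hypothetical, chosen only for this illustration, and the section document form is again just a compact way of showing two separate Power BI queries in one listing.

```powerquery
section Section1;

// Hypothetical shared connection query: the only place where the
// server and database names appear
shared AdventureWorksDW = Sql.Database("chriszbook", "adventure works dw");

// Each data load query starts from the shared connection query
shared FactInternetSales =
    let
        Source = AdventureWorksDW,
        dbo_FactInternetSales =
            Source{[Schema="dbo",Item="FactInternetSales"]}[Data]
    in
        dbo_FactInternetSales;
```

With this variant there is a single query to edit when the connection changes, at the cost of the instance and database names being a little less visible than when each has its own query.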


