Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Optimized null checking: measures or power query?

I am creating a number of "health reports" for my client, where I visualize the completeness and accuracy of their source system data. We're starting with the simplest question: what % of rows in each field is null? To show this quickly, I'm setting a series of stacked column or bar charts, where each bar shows the ration of "healthy":"missing" rows in a given field.

 

From an optimization standpoint, I'm curious which of the following methods will get the best performance out of my dashboard, both in terms of query load speed and in terms of responsiveness to user interaction on the Service.

 

Method 1 - Added Logical Columns in Power Query

For each field I'm auditing, I run a function that adds another column to the dataset, e.g.:

 

fieldNull = Table.AddColumn(prevStep, "fieldName", each fnNullBool([field2Check]), type logical)

 

So if I have six fields I want to check for nulls, that means six extra columns. Even if they're all type logical, this adds to the weight of the dataset and forces these calculations at the Power Query level.

 

For the stacked bar chart, I set the fieldNull as the Legend, and the Count of Rows as the Value.

 

Method 2 - Measure

I load the dataset I'm checking for nulls "as is". For each field I want to check for nulls, I create two Measures:

 

field_missing = COUNTBLANK(field2Check)

field_healthy = [allTableRows] - [field_missing]

 

For the stacked bar chart, I place both Measures in the Value field.

 

---------------------------------------------------------------------------

 

From a user perspective, Method 1 might be preferable because they can click on the stacked bar chart and it willl auto-filter all other charts on the visual, plus any "full-details" tables that let them see the dataset beyond just null counts. But I'm concerned this method will make the dashboard muuuuuuch slower.

 

Thoughts from the back-end experts? What's the performance trade-off here, if any?

1 ACCEPTED SOLUTION
v-ljerr-msft
Employee
Employee

Hi @Anonymous,

 

From my point of view, if you're not using direct query mode for your dataset, Method 1 could be a better choice. Smiley Happy

 

Regards

View solution in original post

2 REPLIES 2
v-ljerr-msft
Employee
Employee

Hi @Anonymous,

 

From my point of view, if you're not using direct query mode for your dataset, Method 1 could be a better choice. Smiley Happy

 

Regards

Anonymous
Not applicable

Forgot to explain the fnNullBool function:

 

(x) =>
let
nullCheck = Logical.From(if x = null then true else false)
in
nullCheck

 

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.