Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Choosing what data to load VS Filtering what data to analyze

Happy new year everyone!

 

I developed a dashboard that lets me analyse manufacturing data for multiple tests for a single client.

 

For example: Client A had 5 tests:

 

1.PNG

 

Each test (i.e. .xlsx file) contains multiple sensor readings indexed as a time series. For instance:

2.png

 

The main part of the dashboard was to compute maximums, averages, etc etc, for each test. I used a summarized table:

 

Summary_Tests =

SUMMARIZE(
    Data,
    Data[TestNumber],

    "Max Sensor 1",
    MAX(Data[Sensor 1])
 
)
 
I had other queries and calculated tables but the concept was the same: I loaded a single client, and then I used caculated and plotted sensor values for the different tests. Everything worked wonderfully, since I only had a single client and "test number" was a unique identifier.
 
And then, I was asked about to see if I could include multiple clients in the repository folder. The idea that was thrown was to have a single "master" slicer to filter everything in the dashboard by client. In other words:
 
3.png
 
Result 1 wouldn't just be easier but would be more efficient, since I wouldn't need to load data that I won't use.
 
Result 2 has me defining a bunch of relationships (based on client name and test number) that are not taking me anywhere.
 
How would you tackle this?
 
Note: I first thought about asking for the relationships (assuming that result 2 is the way to go) but I decided to go back 1 step to see the big picture first.
 
Thanks
1 ACCEPTED SOLUTION

Hi @Anonymous ,

 

Check the file attach with a transformation of the files for more than one client and without additional table for summarization.

The Source files are random numbers for each sensor I calculated in excel.

 

If you need any assistance please tell me.

 

Regards,

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



View solution in original post

4 REPLIES 4
MFelix
Super User
Super User

Hi @Anonymous ,

 

Not really sure if I understand your question, and the fact that you add a summarizing table to your model lost me.

 

Questions:

  • Are you transforming your files from a folder?
    • Are you using a custom function to merge all files together?
  • Are the Sensor 1 ... Sensor 5 always the same for each client or different column names?
  • Is the Summary_test table one additional table on your model based on the initial data?
  • Why do you refer you neded to add relationship between clients and Tests?

Without knowing the full setup of your model believe that your best option would be:

  • Make the transformation of the file based on a folder (or sharepoint folder)
  • Use the file name column after the function use on the query to make a split of test / client
  • Unpivot the Sensor column
  • Make use of the client columns to make the slicer
  • Using the DAX or the options within the visual get the max, or whatever value you need to avoid using additional tables (if Summary_test table is additional)

Can you please share a sample of your PBI and base files? If information is sensitive you can make a mockup or send it by private message.

 

Regards,

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Anonymous
Not applicable

Hi Felix,

 

It's hard to articulate the question.

 

I guess another way to put it is that a whole new variable (client) appeared at a higher level of everything I had. I cannot seem to find a simple way to filter it without ruining all the relationships, measurements, and calculated tables I already had.

 

The solution should be easy to replicate. Next week, I could have another variable thrown in next week (hopefully I won't), e.g. technology X, Y, Z. So I would hope I could have the same summary table showing the maximums for sensors 1 to 5 (always 5 sensors) for (Client A, Technology X) separetelly from (Client B, Technology X) and (Client A, Technology Y).

 

Let me tell you:

  • Are you transforming your files from a folder?
    • Yes I am
  • Are you using a custom function to merge all files together?
    • Yes I am
  • Are the Sensor 1 ... Sensor 5 always the same for each client or different column names?
    • Yes, they are the same columns for any test or client
  • Is the Summary_test table one additional table on your model based on the initial data?
    • Yes it is
  • Why do you refer you neded to add relationship between clients and Tests? 
    • Let's say that I have:
      • Query "Test_Files" that has a row per test per client, with columns for client and test number
      • Query "Time_Data" that has all the values for sensor 1 along columns for client and test number
      • A summary table that has the maximum value for sensor 1 for each test
    • My logic was to set a slicer based on Test_Files[Client] and make a relationship where Test_Files[Client] filters Time_Data[Client] and then the summary table would use that to summarize by Test
    • If the above logic doesn't work (it hasn't), the calculated tables and measurements that I have will use Test 1 for both Client A and Client B to find an overall max.

 

This is an oversimplified account of the model that I have, which was working. The issue again is that I'm having multiple "Test 1" when I have multiple clients.

 

Thank you!

 

 

Hi @Anonymous ,

 

As I refer believe that you are overcomplicating some part of the models and having additional table on the setup will introduce complexety and increase the issues on scaling.

 

Making use of the folder will allow you to have additional information taken out from the files (name, size, etc) that you can use to your advantage to create columns for your data not having the need to have more table and columns, or have manual inputs.

 

Also you need to make use of the dax and visualization based on the data in order to reduce the number of tables you have.

 

Question if I make some files based on the setup you have on the image of the first post will this be a correct representation of your model?

 

I will try to show you how I would setup this to be scalable to other requests.

 

Regards,

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Hi @Anonymous ,

 

Check the file attach with a transformation of the files for more than one client and without additional table for summarization.

The Source files are random numbers for each sensor I calculated in excel.

 

If you need any assistance please tell me.

 

Regards,

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.