Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
gemcityzach
Helper III
Helper III

Summarizing like data across multiple tables into a single report

Hey folks,

 

I'm trying to come up with a way to aggregate and summarize records from multiple SharePoint lists into a single set of reports. Each list has a collection of columns that are from a sub-process that seeks to identify if each record has some 'special' features that needs further investigation. I.e., there are distinct business processes that at some point in time in the process, a set of questions and prompts have to be answered. It's the same questions and prompts across all of these business processes, though the column names are are different, they collect the same types of data. Other than the fact that the sub-process data collection is the same across all of these, the rest of the business processes are distinct and share only commonality in organizational hierarchies and the fact that they all share this 'sub-process' requirement.

 

The business who designed this sub-process thought it would be a 'better user experience' to have the user have to enter these questions in each of the primary processes as opposed to doing it all centrally in a single SharePoint process/list. This is even though we told them we could automatically generate the record from the primary process into the subprocess using PowerAutomate to give a seamless user experience with minimal duplication. So, the number of processes we continue to build, then the more complex the reporting will likely get as we're just stacking/appending table queries.

 

What is the most performant way you can think of to achieve my goal of being able to do count total records (tbl1 + tbl2 + tbl3, etc) and to be able to do other time series aggregations like DATEDIFFs on Created Dates vs Identified Dates, etc. Basically, I need to be able to quantify every piece of data across all of this primary processes and their sub processes. Some aggregations:

 

Total Records across all tables

Total Records within individual tables

Total Records across all table that are Open / Closed

Total Records within individual tables that are Open / Closed

Total records that across all tables that are Past Due

Total records that across all tables that are Out of SLA

Total records that across individuial tables that are Past Due

Total records that across individual tables that are Out of SLA

Records from all tables due in 30 days, 60 days, 90 days

Records from individual tables due in 30 / 60/ 90

Average Age of all records in an Open Status

Average Age of all records in a Closed Status

Average Age of all Records

I need to plot in a stacked bar chart using Time Intelligence (time filtering) Total Items Opened and Total Items Closed

And ALL of this needs to be able to be differentiated by BU1 or BU2, which is Organizational Hierachy data

 

I've included at least three examples of the data I'm working with. These are just sample columns, but assume they all have different column names to represent roughly the same data being extracted from this common sub-process they share.

 

tblSrccreatedsome_tracking_idevent (list)identified_dtdecision_dtclosed_dtdecisionpast_due_flagout_of_SLAboolean1boolean2boolean3boolean4concat_value1concat_value2bu1bu2
tbl11/1/2024123a;b;c12/15/20231/31/20241/31/2024some_stringNonoYesNoYesNo1234;55555;6666667777;7771CSuite1CSuite1-1
tbl11/1/20241234a;b;c12/18/20231/31/20241/31/2024some_stringNonoNoNoNoNo1234;55555;6666668833Csuite2CSuite1-2
tbl11/1/202412345a;b;c;d;12/20/20231/31/20241/31/2024some_stringNonoYesNoNoNo1234;55555;666666 Csuite3CSuite1-3
tbl11/1/2024123456a;b;12/5/20231/31/20241/31/2024some_stringNonoYesYesYesNo1234;55555;66666615555CSuite4CSuite1-4
tbl11/1/20241234567a;c;f12/7/20231/31/20242/5/2024some_stringNonoYesYesYesNo222215;30CSuite5CSuite1-5

 

 

tblSrccreateda_diff_tracking_ididentified_dtdecision_dtclosed_dtdecisionpast_due_flagout_of_SLAboolean1boolean2boolean3boolean4concat_value1concat_value2bu1bu2
tbl21/2/2024init-12312/16/20231/31/20241/31/2024some_stringNonoYesNoYesNo1234;55555;6666667777;7771CSuite1CSuite1-1
tbl21/2/2024init-123412/19/20231/31/20241/31/2024some_stringNonoNoNoNoNo1234;55555;6666668833Csuite2CSuite1-2
tbl21/2/2024init-1234512/20/20231/31/20241/31/2024some_stringNonoYesNoNoNo1234;55555;666666Csuite3CSuite1-3
tbl21/2/2024init-12345612/15/20231/31/20241/31/2024some_stringNonoYesYesYesNo1234;55555;66666615555CSuite4CSuite1-4
tbl21/2/2024init-123456712/17/20231/31/20242/5/2024some_stringNonoYesYesYesNo1234;55555;66666615555CSuite5CSuite1-5

 

tblSrccreatedanother_diff_tracking_ididentified_dtdecision_dtclosed_dtdecisionpast_due_flagout_of_SLAboolean1boolean2boolean3boolean4concat_value1concat_value2bu1bu2
tbl31/2/2024tp-12312/16/20231/31/20241/31/2024some_stringNonoYesNoYesNo1234;55555;6666667777;7771CSuite1CSuite1-1
tbl31/2/2024tp-123412/19/20231/31/20241/31/2024some_stringNonoNoNoNoNo1234;55555;6666668833Csuite2CSuite1-2
tbl31/2/2024tp-1234512/20/2023   NonoYesNoNoNo1234;55555;666666Csuite3CSuite1-3
tbl31/2/2024tp-12345612/15/20231/31/20241/31/2024some_stringNonoYesYesYesNo1234;55555;66666615555CSuite4CSuite1-4
tbl31/2/2024tp-123456712/17/20231/31/20242/5/2024some_stringNonoYes

 

Yes

YesNo1234;55555;6666663333;99999CSuite5CSuite1-5

 

I did some testing using Append and I can append all of the tables together, after cleansing them of course. And then did some data cards. I've also incldued some examples of my current visuals just for Tbl1 data. These are what my measures look like just to get total count of records for two tables in a test of append or cross-query:

 

//Measure within the appended table

totalRecords = CALCULATE (
    COUNTROWS ( 'append' )
    , FILTER ( 'append', [init-tracking_id] <> ""))
     //end initiatives count
    + // add next table records
    CALCULATE(
        COUNTROWS( 'append' )
        , FILTER( 'append', [re-Tracking_ID] <> "")
    ) //end risk event count
 
 //Measure within a custom table "Key Measures"
totalRecordsKM = CALCULATE( //querying teach table individually
    COUNTROWS( 'initiatives_items' )
    , initiatives[init-tracking_id] <> ""
    )
    +
    CALCULATE(
        COUNTROWS( trigger_event_items )
        , events[re-Tracking_ID] <> ""
        )
 

The DAX queries from the Append took about 1 ms whereas the dual query took 7 ms. Meaningless at this level and I doubt we'd ever have more than 10,000 records in the LIFETIME of this. I'm just trying to find the easiest and most performant way to make this work so I can get all of my aggregations and visualizations together with the least amount of hassle.

 

2024-04-29_10-33-46.png

 

 

2 REPLIES 2
v-jingzhan-msft
Community Support
Community Support

Hi @gemcityzach 

 

Appending similar data into an "Append" query and creating measures based on this query would be more efficient. This can help you avoid creating many duplicated measures based on different tables. From your test, you already found that the DAX queries from the Append took less time than the dual query. 10,000 records in a query is not a problem. Power BI is able to deal with this data size quickly. By adding tblSrc column to a slicer or a filter, we can use the slicer/filter to switch viewing data of all tables or a specific table. 

 

Best Regards,
Jing
If this post helps, please Accept it as Solution to help other members find it. Appreciate your Kudos!

Thanks Jing. If I use this append table, then all of my measures will need to account for the multitude of table-column 'relationships'? I.e., if I want to get a count of total records, it'll have to be on all of the particular columns of each segment of the appended table? E.g., tbl1_specific_col + tbl2_specific_col + tbl3_specific_col and if I need filtering then I'll have to ensure that I apply any conditional logic specific to each filterable table-column resource that I want included. I.e., if I only need a measure for tbl1 and tbl2 data, then I have to make sure I'm specific to pick columns and filter variables that only affect the columns and rows from those particular fields?

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.