Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
laratari
Frequent Visitor

Compare two excel files in PowerBI

Good morning/afternoon/evening PowerBI'ers 😀

 

I am looking for help on a project I'd wish to carry out on BI ( This is one of my first approaches to BI, have deep-knowledge in PowerApp & Flow instead ).

 

The scope of the project is:

Given an "X" numbers of Excel files as input, which are saved in Sharepoint, have the possibility of directly compare two of them at a time and present/visualize those rows that a difference is found, either it be:

- Same rows with different column values between the two files

- New rows that either do exist only in one of the two files compared

 

Where I am with this:
- I am able to import as single Dataset all the files in the specified sharepoint folder, where the "Source name" is actually a carried over column

- I can display side-by-side, filtering on source name, the content of the files.

What would be next step:
- Generate a third table, with the above mentioned kind-of differences

Additional info / technical question:
In general terms of "Data modelling" wouldn't probably be "better" already at the Data input query, to generate different data tables based on Source name? So that they're already packed and ready to be used / compared?

 

screenshot.png

1 ACCEPTED SOLUTION

@laratari 

Here is one way. (Apologies since I worked on a sample dataset)

1) Create two disconnected tables to use as slicers with the unique values for the file names. Here is what my model looks like:

model.JPG

 


2) add a column ("String" in the my example) which concatenates the values of each column in the table containg the appended rows from all the tables

CalcColumnString.JPG

 3) Create a measure to use in the filter pane for slicer 2. This filter will filter out the values we select on slicer 1 from slicer 2:

Filter Slicer 2 = COUNTROWS(EXCEPT('File Slicer 2', 'File Slicer 1'))

filter slicer.JPG

 

3) Create a measure to return the rows which are not present in both tables

Different Rows = 
VAR Table1 = CALCULATETABLE(VALUES('Full Table'[String]), FILTER('Full Table', 'Full Table'[File] = SELECTEDVALUE('File Slicer 1'[File])))
VAR Table2 = CALCULATETABLE(VALUES('Full Table'[String]), FILTER('Full Table', 'Full Table'[File] = SELECTEDVALUE('File Slicer 2'[File])))
RETURN
IF(OR(COUNTROWS(EXCEPT(Table1, Table2)) =1, COUNTROWS(EXCEPT(Table2, Table1)) =1), 1)

4) Create a measure to return the file namefor each row based on the previous measure 

File with unique row = 
CALCULATE(SELECTEDVALUE('Full Table'[File]), FILTER('Full Table', [Different Rows] =1))

You can now create a visual using the fields from the table and this [File with unique row] measure:

main visual.JPG

 This table will display the rows which are unique to both of the files selected in the slicers.

 

5) to get the rows which have differing values in some columns is more tricky. For my example, which the files have only 3 columns, I created this measure to use as a filter:

Rows with diff values = 
VAR Channel  = CALCULATE(COUNT('Full Table'[Channel]), FILTER(ALLEXCEPT('Full Table', 'Full Table'[Channel]), [Different Rows] = 1))
VAR _Item = CALCULATE(COUNT('Full Table'[Item]), FILTER(ALLEXCEPT('Full Table', 'Full Table'[item]), [Different Rows] = 1))
VAR _Code = CALCULATE(COUNT('Full Table'[Code]), FILTER(ALLEXCEPT('Full Table', 'Full Table'[Code]), [Different Rows] = 1))
RETURN
Channel + _Item + _Code

You can then make a copy of the main visual and add this measure in the filter pane for the visual

diff columns.JPG

 

I have set the threshold to greater than 3 because my dataset has only three columns (apart from the calculated column for "String" I added later). You will need to alter the measure to cater for the number of columns and adjust the threshold accordingly.

 

This is the final result:

Result.JPG

 

I've attached the sample PBIX file for your reference

 





Did I answer your question? Mark my post as a solution!
In doing so, you are also helping me. Thank you!

Proud to be a Super User!
Paul on Linkedin.






View solution in original post

4 REPLIES 4
laratari
Frequent Visitor

Good morning @v-yuaj-msft , thanks for your reply!

 

Please find here an example of two input files and an ideal result ( We Transfer Link ) in Excel.

Please note, the ideal setup in PowerBI shall accept / load / take as input an entire folder set of Excel files but then compare "dinamically" in a third table / visualization only the two that are being selected in the filters.

 

Below an example setup of my ideal screen in PowerBI:

laratari_0-1620041221292.png

 

 

 

 

@laratari 

Here is one way. (Apologies since I worked on a sample dataset)

1) Create two disconnected tables to use as slicers with the unique values for the file names. Here is what my model looks like:

model.JPG

 


2) add a column ("String" in the my example) which concatenates the values of each column in the table containg the appended rows from all the tables

CalcColumnString.JPG

 3) Create a measure to use in the filter pane for slicer 2. This filter will filter out the values we select on slicer 1 from slicer 2:

Filter Slicer 2 = COUNTROWS(EXCEPT('File Slicer 2', 'File Slicer 1'))

filter slicer.JPG

 

3) Create a measure to return the rows which are not present in both tables

Different Rows = 
VAR Table1 = CALCULATETABLE(VALUES('Full Table'[String]), FILTER('Full Table', 'Full Table'[File] = SELECTEDVALUE('File Slicer 1'[File])))
VAR Table2 = CALCULATETABLE(VALUES('Full Table'[String]), FILTER('Full Table', 'Full Table'[File] = SELECTEDVALUE('File Slicer 2'[File])))
RETURN
IF(OR(COUNTROWS(EXCEPT(Table1, Table2)) =1, COUNTROWS(EXCEPT(Table2, Table1)) =1), 1)

4) Create a measure to return the file namefor each row based on the previous measure 

File with unique row = 
CALCULATE(SELECTEDVALUE('Full Table'[File]), FILTER('Full Table', [Different Rows] =1))

You can now create a visual using the fields from the table and this [File with unique row] measure:

main visual.JPG

 This table will display the rows which are unique to both of the files selected in the slicers.

 

5) to get the rows which have differing values in some columns is more tricky. For my example, which the files have only 3 columns, I created this measure to use as a filter:

Rows with diff values = 
VAR Channel  = CALCULATE(COUNT('Full Table'[Channel]), FILTER(ALLEXCEPT('Full Table', 'Full Table'[Channel]), [Different Rows] = 1))
VAR _Item = CALCULATE(COUNT('Full Table'[Item]), FILTER(ALLEXCEPT('Full Table', 'Full Table'[item]), [Different Rows] = 1))
VAR _Code = CALCULATE(COUNT('Full Table'[Code]), FILTER(ALLEXCEPT('Full Table', 'Full Table'[Code]), [Different Rows] = 1))
RETURN
Channel + _Item + _Code

You can then make a copy of the main visual and add this measure in the filter pane for the visual

diff columns.JPG

 

I have set the threshold to greater than 3 because my dataset has only three columns (apart from the calculated column for "String" I added later). You will need to alter the measure to cater for the number of columns and adjust the threshold accordingly.

 

This is the final result:

Result.JPG

 

I've attached the sample PBIX file for your reference

 





Did I answer your question? Mark my post as a solution!
In doing so, you are also helping me. Thank you!

Proud to be a Super User!
Paul on Linkedin.






@PaulDBrown 

Thank you so much for your reply, it's so full of valuable informations for me!

 

I will take my time now to study it, and hopefully I'll be able to adapt it to my specific needs - in fact the example you provided is EXACTLY what I was looking for - again, THANK YOU! 

 

v-yuaj-msft
Community Support
Community Support

Hi @laratari ,

 

Maybe I'm not getting what actually you are trying to do. Do you want to create a new table based on two different tables, or just want to compare two tables?

You can use LOOKUP() function to add a new caculated column or create a relationship between two tables with a unique key and create a matrix visual.

Could you please share some sample data and the expected result to have a clear understanding of your question? I can do some tests for you.

You can save your files in some cloud sharing platforms and share the link here.

How to provide sample data in the Power BI Forum - Microsoft Power BI Community

 

Best Regards,

Yuna

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.