Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

binary / combine files ( duplicates )

Hello,

I have imported a number of Excel files (all within the same folder), using the binary/combine files approach. Each of these files is a "snapshot" in time of opportunities from our CRM. The problem is that each file contains rows that the other files have as well, so I'm ending up with tons of duplicates. And it's not as easy as just "removing duplicates".

When you import data like this, Power Bi creates a column called Source.Name. You can see an example below. There are files all the way thru May (only 1-13-20 and 1-20-20 are showing). I am only concerned with the earliest file from each month. The file with the 1-13-20 date in the case of January. For April the file date is 4-6-20.

A simple way to look at this is that I could go into the Excel file for January 13 2020, go to the column called Est Close Date, and delete all of the dates that did not fall within January 2020. Rinse and repeat for every file. Then load into Power Bi.

Surely there is an elegant approach to this using the query editor or DAX?

Binary.png

2 ACCEPTED SOLUTIONS
mahoneypat
Employee
Employee

If you want to keep just the earliest file in each month, you can use the approach in this post to add an index column that restarts with each month.  If you don't have a good column to group by, create one off of your file date (add Month column).

 

1. Sort by file date descending

2. Group By your month column, and keep "All Rows" instead of calculating an aggregation

3. Add an index while each month is grouped in Table form

4. Expand the grouped tables

5. Filter to where the nex subgroup index value = 1

 

If this solution works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat

 

 

 

https://www.myonlinetraininghub.com/numbering-grouped-data-power-query

 

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


View solution in original post

v-lionel-msft
Community Support
Community Support

Hi @Anonymous ,

 

Or do like this.

1. Split Column 'Source.Name' by Delimiter.

n1.PNG

2. Group rows.

n2.PNG

n3.PNG

 

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
v-lionel-msft
Community Support
Community Support

Hi @Anonymous ,

 

Or do like this.

1. Split Column 'Source.Name' by Delimiter.

n1.PNG

2. Group rows.

n2.PNG

n3.PNG

 

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

mahoneypat
Employee
Employee

If you want to keep just the earliest file in each month, you can use the approach in this post to add an index column that restarts with each month.  If you don't have a good column to group by, create one off of your file date (add Month column).

 

1. Sort by file date descending

2. Group By your month column, and keep "All Rows" instead of calculating an aggregation

3. Add an index while each month is grouped in Table form

4. Expand the grouped tables

5. Filter to where the nex subgroup index value = 1

 

If this solution works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat

 

 

 

https://www.myonlinetraininghub.com/numbering-grouped-data-power-query

 

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.