cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Post Prodigy
Post Prodigy

binary / combine files ( duplicates )

Hello,

I have imported a number of Excel files (all within the same folder), using the binary/combine files approach. Each of these files is a "snapshot" in time of opportunities from our CRM. The problem is that each file contains rows that the other files have as well, so I'm ending up with tons of duplicates. And it's not as easy as just "removing duplicates".

When you import data like this, Power Bi creates a column called Source.Name. You can see an example below. There are files all the way thru May (only 1-13-20 and 1-20-20 are showing). I am only concerned with the earliest file from each month. The file with the 1-13-20 date in the case of January. For April the file date is 4-6-20.

A simple way to look at this is that I could go into the Excel file for January 13 2020, go to the column called Est Close Date, and delete all of the dates that did not fall within January 2020. Rinse and repeat for every file. Then load into Power Bi.

Surely there is an elegant approach to this using the query editor or DAX?

Binary.png

2 ACCEPTED SOLUTIONS

Accepted Solutions
Highlighted
Community Champion
Community Champion

Re: binary / combine files ( duplicates )

If you want to keep just the earliest file in each month, you can use the approach in this post to add an index column that restarts with each month.  If you don't have a good column to group by, create one off of your file date (add Month column).

 

1. Sort by file date descending

2. Group By your month column, and keep "All Rows" instead of calculating an aggregation

3. Add an index while each month is grouped in Table form

4. Expand the grouped tables

5. Filter to where the nex subgroup index value = 1

 

If this solution works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat

 

 

 

https://www.myonlinetraininghub.com/numbering-grouped-data-power-query

 

 

View solution in original post

Highlighted
Community Support
Community Support

Re: binary / combine files ( duplicates )

Hi @texmexdragon ,

 

Or do like this.

1. Split Column 'Source.Name' by Delimiter.

n1.PNG

2. Group rows.

n2.PNG

n3.PNG

 

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

2 REPLIES 2
Highlighted
Community Champion
Community Champion

Re: binary / combine files ( duplicates )

If you want to keep just the earliest file in each month, you can use the approach in this post to add an index column that restarts with each month.  If you don't have a good column to group by, create one off of your file date (add Month column).

 

1. Sort by file date descending

2. Group By your month column, and keep "All Rows" instead of calculating an aggregation

3. Add an index while each month is grouped in Table form

4. Expand the grouped tables

5. Filter to where the nex subgroup index value = 1

 

If this solution works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat

 

 

 

https://www.myonlinetraininghub.com/numbering-grouped-data-power-query

 

 

View solution in original post

Highlighted
Community Support
Community Support

Re: binary / combine files ( duplicates )

Hi @texmexdragon ,

 

Or do like this.

1. Split Column 'Source.Name' by Delimiter.

n1.PNG

2. Group rows.

n2.PNG

n3.PNG

 

Best regards,
Lionel Chen

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

Helpful resources

Announcements
June 2020 Community Highlights

June 2020 Community Highlights

Featured community members, changes to the Community, and more! Read up on recent Power BI community news.

Using the Community

Using the Community

Need help with the Power BI Community? Our 'Using the Community' support articles are a great place to start.

Community Summit North America

Community Summit North America

Innovate, Collaborate, Grow. The top training and networking event across the globe for Microsoft Business Applications

Power Platform 2020 release wave 2 plan

Power Platform 2020 release wave 2 plan

Features releasing from October 2020 through March 2021

Top Solution Authors
Top Kudoed Authors