Anonymous
Not applicable

Custom Column Expand Creates Duplicates

Hey,

 

I'm bringing in files from a folder:

- File 1

- File 2

- File 3

 

I add a custom column formula:

- Excel.Workbook([Content], true)
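
For readers following along, here is a minimal sketch of this setup in M, including the expand step described next. The folder path and the new column names (Workbook, ObjectName) are assumptions for illustration and do not come from the original query. Excel.Workbook returns one row per object in the workbook (sheets, tables, defined names), so expanding it can produce several rows per file.

```powerquery
let
    // Hypothetical folder path; replace with the real location.
    Source = Folder.Files("C:\Data\Reports"),
    // Parse each file's binary content as an Excel workbook.
    // The second argument (true) promotes the first row of each object to headers.
    AddWorkbook = Table.AddColumn(Source, "Workbook", each Excel.Workbook([Content], true)),
    // Expand the nested table. The workbook's Name column is renamed to
    // ObjectName so it does not clash with the file's own Name column.
    Expanded = Table.ExpandTableColumn(
        AddWorkbook,
        "Workbook",
        {"Name", "Data", "Kind"},
        {"ObjectName", "Data", "Kind"}
    )
in
    Expanded
```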

 

I expand the custom column and the output contains multiple rows per file, which look like duplicates:

- File 1, Name: Sheet

- File 1, Name null

- File 2, Name: Name1

- File 2, Name: Name2

- File 3, Kind: Sheet

- File 3, Kind: DefinedName

 

Etc. I cannot simply remove duplicates, as only half of these entries are valid (contain the columns I'm expecting), and there doesn't appear to be a consistent rule. For instance, if I remove Name: Sheet and keep Name: null I get the right columns, but if I remove Kind: Sheet and keep Kind: DefinedName I get a bad result.

 

I have set this up once accurately, but it needs to run itself going forward as I won't be around to detect issues. Any ideas on how to successfully pull the content from multiple files without getting these seemingly random duplicates?

 

Thanks

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Sorting them into files is a great idea; however, it would require me to rely on users, which I would prefer not to do 😉 Combine and Edit produces different results from what is needed.

 

I did find a couple of reasons why a file shows up more than once. If a filter is applied in the workbook, the file is duplicated, so removing any rows where Name contains FilterDatabase is one step. Where there are multiple values for Kind, removing the rows that contain DefinedName and keeping those that are null or Sheet also seems to work.

 

Unsure if these are permanent solutions or not but they work for now.
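
The two filters described above can be sketched in M roughly as follows. This is an illustration under assumptions, not the exact query used: the folder path and the column names Workbook and ObjectName are hypothetical. As far as I know, Excel stores a hidden defined name ending in _FilterDatabase for a sheet that has a filter applied, which would explain the extra row.

```powerquery
let
    // Hypothetical folder path; replace with the real location.
    Source = Folder.Files("C:\Data\Reports"),
    AddWorkbook = Table.AddColumn(Source, "Workbook", each Excel.Workbook([Content], true)),
    Expanded = Table.ExpandTableColumn(
        AddWorkbook,
        "Workbook",
        {"Name", "Data", "Kind"},
        {"ObjectName", "Data", "Kind"}
    ),
    // Step 1: drop rows whose object name contains "FilterDatabase".
    // The null check comes first so Text.Contains is never called on null.
    NoFilterDatabase = Table.SelectRows(
        Expanded,
        each [ObjectName] = null or not Text.Contains([ObjectName], "FilterDatabase")
    ),
    // Step 2: drop rows whose Kind is "DefinedName".
    // Rows where Kind is null or "Sheet" pass this test and are kept.
    NoDefinedNames = Table.SelectRows(NoFilterDatabase, each [Kind] <> "DefinedName")
in
    NoDefinedNames
```

Filtering on Name and Kind before expanding the Data column keeps the rule independent of how many files are in the folder, which matters if the query has to run unattended.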

 

Thank you for your help!


4 REPLIES
etheil
Helper I

Hello,

 

What option are you selecting when combining the files? I'm guessing here, but it looks like you're selecting the Edit option which allows you to manually control the combination steps. If that's the case, do you get the same results when selecting the Combine & Edit option? 

 

Eric

 

Anonymous
Not applicable

Thanks for the reply etheil.

 

Indeed, I manually edit. Within the folder there are four file groups possible:

- Group 1

- Group 2

- Group 3

- Group 4

 

I have Power BI automatically sort the files into groups based on filenames. I then combine Groups 1 & 2 and Groups 3 & 4, as these are similar, apply transformations to Groups 3 & 4, and finally combine all groups together.

 

The Combine and Edit option lacks the fine-grained control required to complete the above steps, and the error skipping makes me hesitant to trust it even if I could work it into the process piece by piece.

etheil
Helper I

Hello,

 

I will admit I'm not sure I completely understand, but using Combine and Edit might give you slightly different M code that avoids the duplicate problem. If you select Combine and Edit, do you get the duplicates, or does that option give you results that are different enough that they're not comparable to your manual process? Also, what about grouping these files into folders, creating multiple import processes, and then combining the results in the query editor? I don't have a solution for you, but was hoping to give you a way to determine what is causing the duplicates (maybe you're already past that point).

 

Is it possible the sorting process is pulling the same file twice (a single file might somehow be part of more than one group)? What happens if you start with a single file, run the import, and then add another file, retesting the process each time? At what point do duplicates appear?
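
One way to check this without adding files one at a time is a small diagnostic query that counts how many workbook objects each file yields and lists their Kind values. This is only a sketch; the folder path and the column names Workbook, ObjectCount and Kinds are made up for illustration. Files that show a count above 1 are the ones producing extra rows on expand.

```powerquery
let
    // Hypothetical folder path; replace with the real location.
    Source = Folder.Files("C:\Data\Reports"),
    AddWorkbook = Table.AddColumn(Source, "Workbook", each Excel.Workbook([Content], true)),
    // Number of objects (sheets, tables, defined names) found in each file.
    AddCount = Table.AddColumn(AddWorkbook, "ObjectCount", each Table.RowCount([Workbook]), Int64.Type),
    // Distinct Kind values per file, with null shown as the text "null".
    AddKinds = Table.AddColumn(
        AddCount,
        "Kinds",
        each Text.Combine(
            List.Distinct(
                List.Transform([Workbook][Kind], (k) => if k = null then "null" else k)
            ),
            ", "
        ),
        type text
    ),
    Summary = Table.SelectColumns(AddKinds, {"Name", "ObjectCount", "Kinds"}),
    // Files with the most objects appear first.
    Sorted = Table.Sort(Summary, {{"ObjectCount", Order.Descending}})
in
    Sorted
```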

 

Eric
