Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
k_mathana
Helper II
Helper II

Batch Import multiple excel files with different column name from a folder

Hi

I have trouble to perform batch import from multiple excel files which are located in a particular folder where in some files column names differs for example in some file table column name mentioned as "Week ID" and in some files "week" and in some files "WEEK" in some some files just "ID" and they are not in the same column some files it is in "col1" in some files it is in "col5' etc. How to manage this? I need you guidelines on this Thanks in advance

2 ACCEPTED SOLUTIONS

Hello @k_mathana 

 

this should be your code

give it a try and let us know if its working.

 

let

Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
    TransformTable =  #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
    FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
    RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
    RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames =   List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
    #"Filtered Rows",
    {
        {
            "Data",
            (tableint)=> ChangeColumnName(tableint)
        }
    }

),
    #"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
    #"Expanded Data"

 

In this line you can maintain your names to be renamed

image.png

The first value specifies the old, the second the new one. Be aware that Power query is case sensitive

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

View solution in original post

Hello @k_mathana 

 

this should be your code

give it a try and let us know if its working.

 

let

Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
    TransformTable =  #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
    FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
    RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
    RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames =   List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
    #"Filtered Rows",
    {
        {
            "Data",
            (tableint)=> ChangeColumnName(tableint)
        }
    }

),
    #"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
    #"Expanded Data"

 

In this line you can maintain your names to be renamed

image.png

The first value specifies the old, the second the new one. Be aware that Power query is case sensitive

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

View solution in original post

20 REPLIES 20
Syndicate_Admin
Administrator
Administrator

@k_mathana 

I have a solution where you import 1500+ files from a folder, extract all of their field names, then de-duplicate them, use them to make a comparison table between the old and new names, and finally use this comparison table to implement file renaming and merge table operations in PQ with code.

I've done this myself

Jimmy801
Community Champion
Community Champion

Hello all,

 

@k_mathana yesterday asked how to rename a column, that on multiple tables are called differently. And so I provided a possible solution to that. After that it turned out that there is also a need to read a folder, to read a Excel-file and afterwards to apply my solution. I asked to closed that threat and open a new one for reading folders and Excel-files. So @k_mathana did, but asked again the same question. The question of this threat would be actually how to read a folder with multiple Excel-files.

This could be rather tricky, as you need to know how exactly your Excel-files are looking like (sheets, tables, printing areas?, all the same name?).

@k_mathana try this

- Add a new folder source and select your folder

- Transform it

- Filter the table to the files you need (maybe all, then keep it, otherwise filter for .xlsx or name or whatever)

- Add a new custom column and use this formula "Excel.Workbook([Content], true)",

- Expand the column, by clicking on the right side of your newly created column and click okay. You will get now all objects (sheets, tables, etc) of all your Excel-file.

- Filter them (by the column "Kind") to sheet only or table or whatever you need to read.

 

After this you will have a list of tables, that contains your data.

Next please read one Excel-sheet how you need to and apply all transformation you need (promote headers, reformatting etc.). Post the code here and I will transform the solution of renaming the columns adapt to your needs. Then we can do the final step to apply the function to the filtered tables of your Excel-files created above

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

 

Mr. Jimmy, Thanks for the response. Here the code for my 4 sample files import.

let

Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
#"Expanded Data" = Table.ExpandTableColumn(#"Filtered Rows", "Data", {"Week ID", "Task", "A", "Date", "Week ID2", "B", "C", "ID", "WEEK"}, {"Week ID", "Task", "A", "Date", "Week ID2", "B", "C", "ID", "WEEK"})
in
#"Expanded Data"

Hello @k_mathana 

 

this should be your code

give it a try and let us know if its working.

 

let

Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
    TransformTable =  #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
    FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
    RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
    RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames =   List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
    #"Filtered Rows",
    {
        {
            "Data",
            (tableint)=> ChangeColumnName(tableint)
        }
    }

),
    #"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
    #"Expanded Data"

 

In this line you can maintain your names to be renamed

image.png

The first value specifies the old, the second the new one. Be aware that Power query is case sensitive

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

Hello @k_mathana 

 

this should be your code

give it a try and let us know if its working.

 

let

Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
    TransformTable =  #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
    FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
    RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
    RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames =   List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
    #"Filtered Rows",
    {
        {
            "Data",
            (tableint)=> ChangeColumnName(tableint)
        }
    }

),
    #"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
    #"Expanded Data"

 

In this line you can maintain your names to be renamed

image.png

The first value specifies the old, the second the new one. Be aware that Power query is case sensitive

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

Mr. Jimmy, @Jimmy801 

Thats a perfect solution. Thank you so much for your efforts and time. Exactly thats what I am looking for!

 

Capture.PNG

Hello @k_mathana 

 

great to read that my effort was good for something😊

 

all the best

 

Jimmy

ziying35
Impactful Individual
Impactful Individual

@k_mathana 

I have a solution where you import 1500+ files from a folder, extract all of their field names, then de-duplicate them, use them to make a comparison table between the old and new names, and finally use this comparison table to implement file renaming and merge table operations in PQ with code.

I've done this myself

@ziying35 Thats great, Could you please share it? it will be very useful.

@ziying35 Thats great, Could you please share it? it will be very useful.

ziying35
Impactful Individual
Impactful Individual

@k_mathana 

I'll simulate a few data files and then write and post a code solution.

Anonymous
Not applicable

you coul benefit from feature of Table.FuzzyNestedJoin function

 

You have to demote the header of your table, then transpose.

 

After that merge with a table like this

 

image.png

 

using this parameters

 

image.png

 

 

The Table dictionary is like this:

 

image.png

 

 

 

 

ziying35
Impactful Individual
Impactful Individual

@k_mathana 

This situation is not without a solution, but it's best not to have the situation you describe, and I suggest you standardize the base file format at a later stage.

You can upload some files that have been cleared of sensitive information to the cloud drive and then share the link here.It is important to indicate which field names are used to express the same meaning and which field names are ultimately needed to present the data.Your final desired table effect is also simulated in the file.

 

Yes @ziying35 I can understand what you are trying to say. But what if I can maintain the information some where for example what ever like "Week ID", "WeekID", "week", "WEEK", "ID" should read as "Week" and before transforming in to table the rename should happen. I can't correct this in the base files because they are maintained by different departments and more over it is over 1500 excel file which accumulated over the years.

The best you can hope to do I believe @k_mathana is do a separate process per department format, then combine the harmonized format at the end.

 

If that isn't helpful, what exactly are you requesting we help with? RIght now I am getting a "I need a crystal ball transformation" vibe. 😂



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

@edhans  I am well of aware of given solution, I was seeking for a way to rename a column before we transform the data.

Column renames are easy @k_mathana of course. The trick is figuring out the column name. You can do this with the Table.ColumnNames() function to get the column name, but that is usually based on the index of the column. However, you said the columns are not consistently located. Sometimes it is in Column 1, others Column 5, etc. You may also not be consistenly starting on a specific row. 

 

It becomes very tricky. Like I said, it can be done, but would be very tedious coding, and I suspect you'd have to modify it every few weeks as someone invented a new column name. "Week1 2020" for example because that is what they felt like doing on a given day. 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

@edhans See the below response which I got it from Mr. Jimmy but I couldnt use this code to connect the excel folder.

 

https://community.powerbi.com/t5/Power-Query/Batch-Import-excel-files-with-different-column-name/m-p... 

 

 

Why did you create another thread with the same question? You'll have to ask Jimmy. 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
edhans
Super User
Super User

If they were in the same columns, you could just eliminate the "Use first row as column headers" step in the combine operation. But since they are not, I think you are going to have to import one at a time, or provide structure for your users to use. I suspect it is possible, but I cannot imagine the complexity and tediousness of the M code to find the data in random places (col1, col5, col8) then consolidate into a single usable table. And when you think you have it, someone will do something to their file to break it.

 

Power Query is quite powerful, but not magic. It does rely somewhat on some level of consistency and predictability in the source data.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors