Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi
I have trouble to perform batch import from multiple excel files which are located in a particular folder where in some files column names differs for example in some file table column name mentioned as "Week ID" and in some files "week" and in some files "WEEK" in some some files just "ID" and they are not in the same column some files it is in "col1" in some files it is in "col5' etc. How to manage this? I need you guidelines on this Thanks in advance
Solved! Go to Solution.
Hello @k_mathana
this should be your code
give it a try and let us know if its working.
let
Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
TransformTable = #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames = List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
#"Filtered Rows",
{
{
"Data",
(tableint)=> ChangeColumnName(tableint)
}
}
),
#"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
#"Expanded Data"
In this line you can maintain your names to be renamed
The first value specifies the old, the second the new one. Be aware that Power query is case sensitive
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Hello @k_mathana
this should be your code
give it a try and let us know if its working.
let
Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
TransformTable = #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames = List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
#"Filtered Rows",
{
{
"Data",
(tableint)=> ChangeColumnName(tableint)
}
}
),
#"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
#"Expanded Data"
In this line you can maintain your names to be renamed
The first value specifies the old, the second the new one. Be aware that Power query is case sensitive
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
I have a solution where you import 1500+ files from a folder, extract all of their field names, then de-duplicate them, use them to make a comparison table between the old and new names, and finally use this comparison table to implement file renaming and merge table operations in PQ with code.
I've done this myself
Hello all,
@k_mathana yesterday asked how to rename a column, that on multiple tables are called differently. And so I provided a possible solution to that. After that it turned out that there is also a need to read a folder, to read a Excel-file and afterwards to apply my solution. I asked to closed that threat and open a new one for reading folders and Excel-files. So @k_mathana did, but asked again the same question. The question of this threat would be actually how to read a folder with multiple Excel-files.
This could be rather tricky, as you need to know how exactly your Excel-files are looking like (sheets, tables, printing areas?, all the same name?).
@k_mathana try this
- Add a new folder source and select your folder
- Transform it
- Filter the table to the files you need (maybe all, then keep it, otherwise filter for .xlsx or name or whatever)
- Add a new custom column and use this formula "Excel.Workbook([Content], true)",
- Expand the column, by clicking on the right side of your newly created column and click okay. You will get now all objects (sheets, tables, etc) of all your Excel-file.
- Filter them (by the column "Kind") to sheet only or table or whatever you need to read.
After this you will have a list of tables, that contains your data.
Next please read one Excel-sheet how you need to and apply all transformation you need (promote headers, reformatting etc.). Post the code here and I will transform the solution of renaming the columns adapt to your needs. Then we can do the final step to apply the function to the filtered tables of your Excel-files created above
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Mr. Jimmy, Thanks for the response. Here the code for my 4 sample files import.
let
Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
#"Expanded Data" = Table.ExpandTableColumn(#"Filtered Rows", "Data", {"Week ID", "Task", "A", "Date", "Week ID2", "B", "C", "ID", "WEEK"}, {"Week ID", "Task", "A", "Date", "Week ID2", "B", "C", "ID", "WEEK"})
in
#"Expanded Data"
Hello @k_mathana
this should be your code
give it a try and let us know if its working.
let
Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
TransformTable = #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames = List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
#"Filtered Rows",
{
{
"Data",
(tableint)=> ChangeColumnName(tableint)
}
}
),
#"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
#"Expanded Data"
In this line you can maintain your names to be renamed
The first value specifies the old, the second the new one. Be aware that Power query is case sensitive
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Hello @k_mathana
this should be your code
give it a try and let us know if its working.
let
Source = Folder.Files("C:\Users\k_mathana\Desktop\Sample"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Excel.Workbook([Content], true)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Name", "Data", "Item", "Kind", "Hidden"}, {"Name.1", "Data", "Item", "Kind", "Hidden"}),
ChangeColumnName =(tableint)=>
let
TransformTable = #table({"old", "new"},{{"Week ID2", "WEEK"}, {"Week ID", "WEEK"}}),
FilterForWeek = Table.ToRows(Table.SelectRows(TransformTable, each List.Contains(Table.ColumnNames(tableint), _[old]))),
RenameWeek = try Table.RenameColumns(tableint, FilterForWeek) otherwise tableint
in
RenameWeek,
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each ([Kind] = "Table")),
GetAllColumnNames = List.Distinct(List.Combine(Table.TransformColumns(RenameColumnsData, {{"Data", each Table.ColumnNames(_)}})[Data])),
RenameColumnsData = Table.TransformColumns
(
#"Filtered Rows",
{
{
"Data",
(tableint)=> ChangeColumnName(tableint)
}
}
),
#"Removed Other Columns" = Table.SelectColumns(RenameColumnsData,{"Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", GetAllColumnNames, GetAllColumnNames)
in
#"Expanded Data"
In this line you can maintain your names to be renamed
The first value specifies the old, the second the new one. Be aware that Power query is case sensitive
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Mr. Jimmy, @Jimmy801
Thats a perfect solution. Thank you so much for your efforts and time. Exactly thats what I am looking for!
I have a solution where you import 1500+ files from a folder, extract all of their field names, then de-duplicate them, use them to make a comparison table between the old and new names, and finally use this comparison table to implement file renaming and merge table operations in PQ with code.
I've done this myself
I'll simulate a few data files and then write and post a code solution.
you coul benefit from feature of Table.FuzzyNestedJoin function
You have to demote the header of your table, then transpose.
After that merge with a table like this
using this parameters
The Table dictionary is like this:
This situation is not without a solution, but it's best not to have the situation you describe, and I suggest you standardize the base file format at a later stage.
You can upload some files that have been cleared of sensitive information to the cloud drive and then share the link here.It is important to indicate which field names are used to express the same meaning and which field names are ultimately needed to present the data.Your final desired table effect is also simulated in the file.
Yes @ziying35 I can understand what you are trying to say. But what if I can maintain the information some where for example what ever like "Week ID", "WeekID", "week", "WEEK", "ID" should read as "Week" and before transforming in to table the rename should happen. I can't correct this in the base files because they are maintained by different departments and more over it is over 1500 excel file which accumulated over the years.
The best you can hope to do I believe @k_mathana is do a separate process per department format, then combine the harmonized format at the end.
If that isn't helpful, what exactly are you requesting we help with? RIght now I am getting a "I need a crystal ball transformation" vibe. 😂
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting@edhans I am well of aware of given solution, I was seeking for a way to rename a column before we transform the data.
Column renames are easy @k_mathana of course. The trick is figuring out the column name. You can do this with the Table.ColumnNames() function to get the column name, but that is usually based on the index of the column. However, you said the columns are not consistently located. Sometimes it is in Column 1, others Column 5, etc. You may also not be consistenly starting on a specific row.
It becomes very tricky. Like I said, it can be done, but would be very tedious coding, and I suspect you'd have to modify it every few weeks as someone invented a new column name. "Week1 2020" for example because that is what they felt like doing on a given day.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting@edhans See the below response which I got it from Mr. Jimmy but I couldnt use this code to connect the excel folder.
Why did you create another thread with the same question? You'll have to ask Jimmy.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingIf they were in the same columns, you could just eliminate the "Use first row as column headers" step in the combine operation. But since they are not, I think you are going to have to import one at a time, or provide structure for your users to use. I suspect it is possible, but I cannot imagine the complexity and tediousness of the M code to find the data in random places (col1, col5, col8) then consolidate into a single usable table. And when you think you have it, someone will do something to their file to break it.
Power Query is quite powerful, but not magic. It does rely somewhat on some level of consistency and predictability in the source data.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingCovering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.