Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi everybody,
I need help to finish my import of Excel files from a folder. My inspiration for this is the YouTube video from Goodly (Chandeep), that is importing data without creating a sample file.
This works fine for me, when I just need to import all files in folder, but in this case, I create a column with a date that is based on the name of the file.
But I need help to do a Table.Combine to expand all the files in the correct way, but to keep either the date column or the column that has the filename on all rows after combining data, when I do the Table.combine.
I can not use the Table.ExpandTableColumn as the headers for the different columns can and will vary over time. The Table.Combine handles this, but I can not get this extra column called "Date" to be included on all rows. Please Help
I can send you sample files of what I want to import (I can't find where to upload), but basically it is an export from a booking system, where I will skip the first 5 rows and the result is data formatted as a pivot table that I will convert into records afterwards (that part works fine).
I hope the community can help.
Best regards
Jørgen
let
/* Here is an alternative way to import a folder and all the filtered subfolders
Using the M Script instead of the User Interface provided in Power Query gives you a lot more options
This Script is created by Jørgen Wulff Rasmussen, Zealand Data ApS
Contact jwr@zealand-data.com phone +45 23732009
Inspired by the YouTube video "Combine Data from Multiple Excel Files | Most Dynamic Method (Dynamic Columns & Sheets)" fra Goodly (Chandeep) */
// Step 1: Source - Load files from specified folder
Source = Folder.Files(FolderName),
// Lowercase transformation for consistency
LowerCase_All_TextColumns = Table.TransformColumns(Source,{{"Name", Text.Lower, type text}, {"Extension", Text.Lower, type text}, {"Folder Path", Text.Lower, type text}}),
// Filter files that include 'balance' in the name
Filter_NameColumn_To_Files_Named_Sales = Table.SelectRows(LowerCase_All_TextColumns, each Text.StartsWith([Name], "product")),
// Step 2: Custom Function to process each Excel file
ProcessWorkbook = (excelFile as binary) =>
let
// Load the workbook, preserve existing headers for identification
ExcelContent = Excel.Workbook(excelFile, true),
// Apply transformations to each sheet in the workbook
TransformedSheets = Table.TransformColumns(ExcelContent, {"Data", each
let
// Skip the first row
SkippedFirstRow = Table.Skip(_, 3),
// Promote the next row as headers
PromotedHeaders = Table.PromoteHeaders(SkippedFirstRow, [PromoteAllScalars=true]),
// Remove columns that contain only null values
ColumnsToRemove = List.Select(Table.ColumnNames(PromotedHeaders), each List.NonNullCount(Table.Column(PromotedHeaders, _)) = 0),
CleanedTable = Table.RemoveColumns(PromotedHeaders, ColumnsToRemove)
in
CleanedTable
}),
// Combine data from all sheets
CombinedSheets = Table.Combine(TransformedSheets[Data])
in
CombinedSheets,
// Step 3: Transform the 'Content' column using the custom function
Transform_Content_Column_To_Table = Table.TransformColumns(
Filter_NameColumn_To_Files_Named_Sales,
{"Content", each ProcessWorkbook(_)}
),
#"Added Custom" = Table.AddColumn(Transform_Content_Column_To_Table, "Date", each Date.FromText("01-" & Text.Middle([Name],8,2) & "-" & Text.Middle([Name],11,2))),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Content", "Date"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns",{"Date", "Content"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns",{{"Date", type date}})
in
#"Changed Type"
I can not use the Table.ExpandTableColumn as the headers for the different columns can and will vary over time.
Red flag. You will have endless problems trying to refresh this in the service. Use Table.SelectColumns (rather than Table.RemoveColumns) to ignore newly added columns and keep your Power Query output meta data stable. If you need these new columns then you need to factor in the manual maintenance effort.
The script you listed seems a bit excessive. Should not be needed for at least vagualy similar Excel files.
If you like more assistance please post a handful of the Excel file samples and indicate expected outcome.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.