Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
WulffJoergen
Frequent Visitor

Importing Excel files from folder without sample files using Table.Combine -but with an extra column

Hi everybody,
I need help to finish my import of Excel files from a folder. My inspiration for this is the YouTube video from Goodly (Chandeep), that is importing data without creating a sample file.

This works fine for me, when I just need to import all files in folder, but in this case, I create a column with a date that is based on the name of the file.

Power Query Question.jpg

But I need help to do a Table.Combine to expand all the files in the correct way, but to keep either the date column or the column that has the filename on all rows after combining data, when I do the Table.combine.

I can not use the Table.ExpandTableColumn as the headers for the different columns can and will vary over time. The Table.Combine handles this, but I can not get this extra column called "Date" to be included on all rows. Please Help


I can send you sample files of what I want to import (I can't find where to upload), but basically it is an export from a booking system, where I will skip the first 5 rows and the result is data formatted as a pivot table that I will convert into records afterwards (that part works fine). 

I hope the community can help.

Best regards
Jørgen
 

 

let
    /* Here is an alternative way to import a folder and all the filtered subfolders
    Using the M Script instead of the User Interface provided in Power Query gives you a lot more options
    This Script is created by Jørgen Wulff Rasmussen, Zealand Data ApS
    Contact jwr@zealand-data.com phone +45 23732009
    Inspired by the YouTube video "Combine Data from Multiple Excel Files | Most Dynamic Method (Dynamic Columns & Sheets)" fra Goodly (Chandeep) */

    // Step 1: Source - Load files from specified folder
    Source = Folder.Files(FolderName),
    // Lowercase transformation for consistency
    LowerCase_All_TextColumns = Table.TransformColumns(Source,{{"Name", Text.Lower, type text}, {"Extension", Text.Lower, type text}, {"Folder Path", Text.Lower, type text}}),
    // Filter files that include 'balance' in the name
    Filter_NameColumn_To_Files_Named_Sales = Table.SelectRows(LowerCase_All_TextColumns, each Text.StartsWith([Name], "product")),

    // Step 2: Custom Function to process each Excel file
    ProcessWorkbook = (excelFile as binary) =>
    let
        // Load the workbook, preserve existing headers for identification
        ExcelContent = Excel.Workbook(excelFile, true),
        // Apply transformations to each sheet in the workbook
        TransformedSheets = Table.TransformColumns(ExcelContent, {"Data", each 
            let
                // Skip the first row
                SkippedFirstRow = Table.Skip(_, 3),
                // Promote the next row as headers
                PromotedHeaders = Table.PromoteHeaders(SkippedFirstRow, [PromoteAllScalars=true]),
                // Remove columns that contain only null values
                ColumnsToRemove = List.Select(Table.ColumnNames(PromotedHeaders), each List.NonNullCount(Table.Column(PromotedHeaders, _)) = 0),
                CleanedTable = Table.RemoveColumns(PromotedHeaders, ColumnsToRemove)
            in
                CleanedTable
            }),
        // Combine data from all sheets
        CombinedSheets = Table.Combine(TransformedSheets[Data])
    in
        CombinedSheets,

    // Step 3: Transform the 'Content' column using the custom function
    Transform_Content_Column_To_Table = Table.TransformColumns(
        Filter_NameColumn_To_Files_Named_Sales, 
        {"Content", each ProcessWorkbook(_)}
    ),
    #"Added Custom" = Table.AddColumn(Transform_Content_Column_To_Table, "Date", each Date.FromText("01-" & Text.Middle([Name],8,2) & "-" & Text.Middle([Name],11,2))),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Content", "Date"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Other Columns",{"Date", "Content"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns",{{"Date", type date}})
in
    #"Changed Type"

 

 

1 REPLY 1
lbendlin
Super User
Super User

I can not use the Table.ExpandTableColumn as the headers for the different columns can and will vary over time.

Red flag.  You will have endless problems trying to refresh this in the service.  Use Table.SelectColumns  (rather than Table.RemoveColumns) to ignore newly added columns and keep your Power Query output meta data stable.  If you need these new columns then you need to factor in the manual maintenance effort.

 

The script you listed seems a bit excessive. Should not be needed for at least vagualy similar Excel files.

 

If you like more assistance please post a handful of the Excel file samples and indicate expected outcome.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors