Solved: Perform Query Steps on Excel file dynamically base...

Anonymous · ‎03-21-2020

So i'm importing a bunch of Excel files from SharePoint. For some reason, the underlying data structure is a bit strange. Each Excel file has column headers starting on row 12. I have confirmed this by going into each file. However, when imported into Power BI some files have column headers starting on row 11 and some on row 12. These files are generated from an automated job. Some files have the sheet name as the date the file was created and some files just have Sheet1 as the date. This makes me think that some underlying data is in csv format if the sheet name has the date created instead of Sheet1?

That's the root of the problem.

My solution was to somehow develop a way in Power BI to look at each incoming Excel file and see if the column headers start on row 11 then perform query steps A and if they come in on row 12 then perform Query Steps B.

1.) I duplicated the transform sample file step and converted it to a list to identify which row the column headers are coming in at and assigned it as a parameter:

let
    Source = Excel.Workbook(Parameter2, null, true),
    FirstSheet = Table.SelectRows(Source, each [Kind] = "Sheet"){0}[Data],
    #"Removed Top Rows" = Table.Skip(FirstSheet,10),
    #"Duplicated Column" = Table.DuplicateColumn(#"Removed Top Rows", "Column1", "Column1 - Copy"),
    #"Replaced Value" = Table.ReplaceValue(#"Duplicated Column",null,"",Replacer.ReplaceValue,{"Column1 - Copy"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.Contains([#"Column1 - Copy"], "Shift") then [#"Column1 - Copy"] else "Adjust"),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1 - Copy"}),
    #"Filled Down" = Table.FillDown(#"Removed Columns1",{"Custom"}),
    #"Removed Other Columns" = Table.SelectColumns(#"Filled Down",{"Custom"}),
    #"Kept First Rows" = Table.FirstN(#"Removed Other Columns",1),
    Custom = #"Kept First Rows"[Custom]



in
    Custom

Parameter:

"Shift" meta [IsParameterQuery=true, ExpressionIdentifier=#"Transform Sample File (3)", Type="Text", IsParameterQueryRequired=true]

2.) In the original transform Sample file, do conditional branching to do Query Steps A or B based on the parameter setup previously:

let
    Source = Excel.Workbook(Parameter2, null, true),
    FirstSheet = Table.SelectRows(Source, each [Kind] = "Sheet"){0}[Data],
    #"Removed Top Rows" = Table.Skip(FirstSheet,10),
    #"Duplicated Column" = Table.DuplicateColumn(#"Removed Top Rows", "Column1", "Column1 - Copy"),
    #"Replaced Value" = Table.ReplaceValue(#"Duplicated Column",null,"",Replacer.ReplaceValue,{"Column1 - Copy"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.Contains([#"Column1 - Copy"], "Shift") then [#"Column1 - Copy"] else "Adjust"),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1 - Copy"}),
    #"Filled Down" = Table.FillDown(#"Removed Columns1",{"Custom"}),
    //Adjust
    #"Removed Top Rows1000" = Table.Skip(#"Filled Down",1),
    #"Promoted Headers100" = Table.PromoteHeaders(#"Removed Top Rows1000", [PromoteAllScalars=true]),
    #"Changed Type100" = Table.TransformColumnTypes(#"Promoted Headers100",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
    #"Removed Top Rows100" = Table.Skip(#"Changed Type100",2),
    #"Removed Columns100" = Table.RemoveColumns(#"Removed Top Rows100",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
    #"Removed Blank Rows100" = Table.SelectRows(#"Removed Columns100", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Adjust Result" = #"Removed Blank Rows100",

    //Shift No adjust
    #"Promoted Headers" = Table.PromoteHeaders(#"Filled Down", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
    #"Removed Top Rows1" = Table.Skip(#"Changed Type",2),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows1",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
    #"Removed Blank Rows" = Table.SelectRows(#"Removed Columns", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
    #"Shift Result" = #"Removed Blank Rows",

    result =
        if ManufacturingTeam= "Adjust"
        then #"Adjust Result"
        else #"Shift Result",
    #"Removed Columns2" = Table.RemoveColumns(result,{"Shift_1"})



in
    #"Removed Columns2"

Added this also to the transform file function:

let
    Source = (Parameter2 as binary, ManufacturingTeam as text) => let
        Source = Excel.Workbook(Parameter2, null, true),
        FirstSheet = Table.SelectRows(Source, each [Kind] = "Sheet"){0}[Data],
        #"Removed Top Rows" = Table.Skip(FirstSheet,10),
        #"Duplicated Column" = Table.DuplicateColumn(#"Removed Top Rows", "Column1", "Column1 - Copy"),
        #"Replaced Value" = Table.ReplaceValue(#"Duplicated Column",null,"",Replacer.ReplaceValue,{"Column1 - Copy"}),
        #"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.Contains([#"Column1 - Copy"], "Shift") then [#"Column1 - Copy"] else "Adjust"),
        #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1 - Copy"}),
        #"Filled Down" = Table.FillDown(#"Removed Columns1",{"Custom"}),
        //Adjust
        #"Removed Top Rows1000" = Table.Skip(#"Filled Down",1),
        #"Promoted Headers100" = Table.PromoteHeaders(#"Removed Top Rows1000", [PromoteAllScalars=true]),
        #"Changed Type100" = Table.TransformColumnTypes(#"Promoted Headers100",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
        #"Removed Top Rows100" = Table.Skip(#"Changed Type100",2),
        #"Removed Columns100" = Table.RemoveColumns(#"Removed Top Rows100",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
        #"Removed Blank Rows100" = Table.SelectRows(#"Removed Columns100", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
        #"Adjust Result" = #"Removed Blank Rows100",
    
        //Shift No adjust
        #"Promoted Headers" = Table.PromoteHeaders(#"Filled Down", [PromoteAllScalars=true]),
        #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
        #"Removed Top Rows1" = Table.Skip(#"Changed Type",2),
        #"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows1",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
        #"Removed Blank Rows" = Table.SelectRows(#"Removed Columns", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
        #"Shift Result" = #"Removed Blank Rows",
    
        result =
            if ManufacturingTeam= "Adjust"
            then #"Adjust Result"
            else #"Shift Result",
        #"Removed Columns2" = Table.RemoveColumns(result,{"Shift_1"})
    
    
    
    in
        #"Removed Columns2"
in
    Source

Now when I let this run in my original query, it breaks at the invoking of the function saying an error that 1 arguments were passed to a function which expects 2.

I think I have the groundwork set for what I need, but I think I'm not doing the steps in the correct order which is causing the misfiring.

Anyone have any ideas?

Anonymous · ‎03-21-2020

Hello @Anonymous , check out this blog article by PowerPivotPro. This should work for you.

The idea is to find the value in a certain "cell"...and consider that the beginning of the table. In your case it's either A11 or A12. Note that P3's solution only works for 16k rows. This is because the solution "transposes the table" (swaps columns and rows), and Power BI can only support 16k columns.

I wrote a comment midway through that addresses the 16k row limit: search for this text to jump to my comment:

Chris H

August 2, 2019 at 10:31 am

Here's the M code from my comment. this is a function that you can apply to each table to remove the right number of columns and rows.

let
Source =
,Starting_Value = “Label” // change as needed

// these first 3 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_ROW = Table.AddColumn(Source, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value)),
Skip_RowsAboveHeader_ROW = Table.Skip(Add_HeaderCheck_ROW, each
[Wanted] true)
,Remove_HeaderCheck_ROW = Table.RemoveColumns(Skip_RowsAboveHeader_ROW
,{“Wanted”})

// keep only the first row to avoid transpose errors for large tables
,FirstRow_forColumns = Table.FirstN(Remove_HeaderCheck_ROW, 1)
,Transpose = Table.Transpose(FirstRow_forColumns)

// these 2 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_COL = Table.AddColumn(Transpose, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value))
,Skip_RowsAboveHeader_COL = Table.Skip(Add_HeaderCheck_COL, each
[Wanted] true)

// count how many rows the previous step removed…this is how many columns to remove from the un-transposed table
,Rows_Removed = Table.RowCount(Add_HeaderCheck_COL) – Table.RowCount(Skip_RowsAboveHeader_COL)

// uses Datachant’s method of removing the first N columns from a table…without referencing column names
,Remove_NColumns = Table.RemoveColumns(Remove_HeaderCheck_ROW
,List.FirstN( Table.ColumnNames(Remove_HeaderCheck_ROW), Rows_Removed) )
,Headers = Table.PromoteHeaders(Remove_NColumns, [PromoteAllScalars=true])
in
Headers

I also converted this M code into a function that you can call for every table. If you had each file as a separate row, you could do a Table.AddColumn() and create the "proper" table. I need to dig up that code. Let me know if you need it.

Cheers,

~ Chris Haas

http://www.precisiondax.com

View solution in original post

Anonymous · ‎03-21-2020

Hello @Anonymous , check out this blog article by PowerPivotPro. This should work for you.

The idea is to find the value in a certain "cell"...and consider that the beginning of the table. In your case it's either A11 or A12. Note that P3's solution only works for 16k rows. This is because the solution "transposes the table" (swaps columns and rows), and Power BI can only support 16k columns.

I wrote a comment midway through that addresses the 16k row limit: search for this text to jump to my comment:

Chris H

August 2, 2019 at 10:31 am

Here's the M code from my comment. this is a function that you can apply to each table to remove the right number of columns and rows.

let
Source =
,Starting_Value = “Label” // change as needed

// these first 3 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_ROW = Table.AddColumn(Source, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value)),
Skip_RowsAboveHeader_ROW = Table.Skip(Add_HeaderCheck_ROW, each
[Wanted] true)
,Remove_HeaderCheck_ROW = Table.RemoveColumns(Skip_RowsAboveHeader_ROW
,{“Wanted”})

// keep only the first row to avoid transpose errors for large tables
,FirstRow_forColumns = Table.FirstN(Remove_HeaderCheck_ROW, 1)
,Transpose = Table.Transpose(FirstRow_forColumns)

// these 2 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_COL = Table.AddColumn(Transpose, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value))
,Skip_RowsAboveHeader_COL = Table.Skip(Add_HeaderCheck_COL, each
[Wanted] true)

// count how many rows the previous step removed…this is how many columns to remove from the un-transposed table
,Rows_Removed = Table.RowCount(Add_HeaderCheck_COL) – Table.RowCount(Skip_RowsAboveHeader_COL)

// uses Datachant’s method of removing the first N columns from a table…without referencing column names
,Remove_NColumns = Table.RemoveColumns(Remove_HeaderCheck_ROW
,List.FirstN( Table.ColumnNames(Remove_HeaderCheck_ROW), Rows_Removed) )
,Headers = Table.PromoteHeaders(Remove_NColumns, [PromoteAllScalars=true])
in
Headers

I also converted this M code into a function that you can call for every table. If you had each file as a separate row, you could do a Table.AddColumn() and create the "proper" table. I need to dig up that code. Let me know if you need it.

Cheers,

~ Chris Haas

http://www.precisiondax.com

Perform Query Steps on Excel file dynamically based on Dynamic Parameter

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly