Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
So i'm importing a bunch of Excel files from SharePoint. For some reason, the underlying data structure is a bit strange. Each Excel file has column headers starting on row 12. I have confirmed this by going into each file. However, when imported into Power BI some files have column headers starting on row 11 and some on row 12. These files are generated from an automated job. Some files have the sheet name as the date the file was created and some files just have Sheet1 as the date. This makes me think that some underlying data is in csv format if the sheet name has the date created instead of Sheet1?
That's the root of the problem.
My solution was to somehow develop a way in Power BI to look at each incoming Excel file and see if the column headers start on row 11 then perform query steps A and if they come in on row 12 then perform Query Steps B.
1.) I duplicated the transform sample file step and converted it to a list to identify which row the column headers are coming in at and assigned it as a parameter:
let
Source = Excel.Workbook(Parameter2, null, true),
FirstSheet = Table.SelectRows(Source, each [Kind] = "Sheet"){0}[Data],
#"Removed Top Rows" = Table.Skip(FirstSheet,10),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Top Rows", "Column1", "Column1 - Copy"),
#"Replaced Value" = Table.ReplaceValue(#"Duplicated Column",null,"",Replacer.ReplaceValue,{"Column1 - Copy"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.Contains([#"Column1 - Copy"], "Shift") then [#"Column1 - Copy"] else "Adjust"),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1 - Copy"}),
#"Filled Down" = Table.FillDown(#"Removed Columns1",{"Custom"}),
#"Removed Other Columns" = Table.SelectColumns(#"Filled Down",{"Custom"}),
#"Kept First Rows" = Table.FirstN(#"Removed Other Columns",1),
Custom = #"Kept First Rows"[Custom]
in
Custom
Parameter:
"Shift" meta [IsParameterQuery=true, ExpressionIdentifier=#"Transform Sample File (3)", Type="Text", IsParameterQueryRequired=true]
2.) In the original transform Sample file, do conditional branching to do Query Steps A or B based on the parameter setup previously:
let
Source = Excel.Workbook(Parameter2, null, true),
FirstSheet = Table.SelectRows(Source, each [Kind] = "Sheet"){0}[Data],
#"Removed Top Rows" = Table.Skip(FirstSheet,10),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Top Rows", "Column1", "Column1 - Copy"),
#"Replaced Value" = Table.ReplaceValue(#"Duplicated Column",null,"",Replacer.ReplaceValue,{"Column1 - Copy"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.Contains([#"Column1 - Copy"], "Shift") then [#"Column1 - Copy"] else "Adjust"),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1 - Copy"}),
#"Filled Down" = Table.FillDown(#"Removed Columns1",{"Custom"}),
//Adjust
#"Removed Top Rows1000" = Table.Skip(#"Filled Down",1),
#"Promoted Headers100" = Table.PromoteHeaders(#"Removed Top Rows1000", [PromoteAllScalars=true]),
#"Changed Type100" = Table.TransformColumnTypes(#"Promoted Headers100",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
#"Removed Top Rows100" = Table.Skip(#"Changed Type100",2),
#"Removed Columns100" = Table.RemoveColumns(#"Removed Top Rows100",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
#"Removed Blank Rows100" = Table.SelectRows(#"Removed Columns100", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Adjust Result" = #"Removed Blank Rows100",
//Shift No adjust
#"Promoted Headers" = Table.PromoteHeaders(#"Filled Down", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
#"Removed Top Rows1" = Table.Skip(#"Changed Type",2),
#"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows1",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
#"Removed Blank Rows" = Table.SelectRows(#"Removed Columns", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Shift Result" = #"Removed Blank Rows",
result =
if ManufacturingTeam= "Adjust"
then #"Adjust Result"
else #"Shift Result",
#"Removed Columns2" = Table.RemoveColumns(result,{"Shift_1"})
in
#"Removed Columns2"
Added this also to the transform file function:
let
Source = (Parameter2 as binary, ManufacturingTeam as text) => let
Source = Excel.Workbook(Parameter2, null, true),
FirstSheet = Table.SelectRows(Source, each [Kind] = "Sheet"){0}[Data],
#"Removed Top Rows" = Table.Skip(FirstSheet,10),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Top Rows", "Column1", "Column1 - Copy"),
#"Replaced Value" = Table.ReplaceValue(#"Duplicated Column",null,"",Replacer.ReplaceValue,{"Column1 - Copy"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.Contains([#"Column1 - Copy"], "Shift") then [#"Column1 - Copy"] else "Adjust"),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Column1 - Copy"}),
#"Filled Down" = Table.FillDown(#"Removed Columns1",{"Custom"}),
//Adjust
#"Removed Top Rows1000" = Table.Skip(#"Filled Down",1),
#"Promoted Headers100" = Table.PromoteHeaders(#"Removed Top Rows1000", [PromoteAllScalars=true]),
#"Changed Type100" = Table.TransformColumnTypes(#"Promoted Headers100",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
#"Removed Top Rows100" = Table.Skip(#"Changed Type100",2),
#"Removed Columns100" = Table.RemoveColumns(#"Removed Top Rows100",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
#"Removed Blank Rows100" = Table.SelectRows(#"Removed Columns100", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Adjust Result" = #"Removed Blank Rows100",
//Shift No adjust
#"Promoted Headers" = Table.PromoteHeaders(#"Filled Down", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Shift", Int64.Type}, {"Column2", type any}, {"EmpNbr", Int64.Type}, {"Name", type text}, {"Column5", type any}, {"Column6", type any}, {"Extn", type text}, {"Area", type text}, {"Location", type text}, {"Duration", Int64.Type}, {"Arvl", Int64.Type}, {"Lunch", Int64.Type}, {"Dptr", Int64.Type}, {"Total", Int64.Type}, {"Column15", type text}}),
#"Removed Top Rows1" = Table.Skip(#"Changed Type",2),
#"Removed Columns" = Table.RemoveColumns(#"Removed Top Rows1",{"Shift", "Column2", "Column5", "Column6", "Column15"}),
#"Removed Blank Rows" = Table.SelectRows(#"Removed Columns", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Shift Result" = #"Removed Blank Rows",
result =
if ManufacturingTeam= "Adjust"
then #"Adjust Result"
else #"Shift Result",
#"Removed Columns2" = Table.RemoveColumns(result,{"Shift_1"})
in
#"Removed Columns2"
in
Source
Now when I let this run in my original query, it breaks at the invoking of the function saying an error that 1 arguments were passed to a function which expects 2.
I think I have the groundwork set for what I need, but I think I'm not doing the steps in the correct order which is causing the misfiring.
Anyone have any ideas?
Solved! Go to Solution.
Hello @Anonymous , check out this blog article by PowerPivotPro. This should work for you.
The idea is to find the value in a certain "cell"...and consider that the beginning of the table. In your case it's either A11 or A12. Note that P3's solution only works for 16k rows. This is because the solution "transposes the table" (swaps columns and rows), and Power BI can only support 16k columns.
I wrote a comment midway through that addresses the 16k row limit: search for this text to jump to my comment:
let
Source =
,Starting_Value = “Label” // change as needed
// these first 3 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_ROW = Table.AddColumn(Source, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value)),
Skip_RowsAboveHeader_ROW = Table.Skip(Add_HeaderCheck_ROW, each
[Wanted] true)
,Remove_HeaderCheck_ROW = Table.RemoveColumns(Skip_RowsAboveHeader_ROW
,{“Wanted”})
// keep only the first row to avoid transpose errors for large tables
,FirstRow_forColumns = Table.FirstN(Remove_HeaderCheck_ROW, 1)
,Transpose = Table.Transpose(FirstRow_forColumns)
// these 2 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_COL = Table.AddColumn(Transpose, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value))
,Skip_RowsAboveHeader_COL = Table.Skip(Add_HeaderCheck_COL, each
[Wanted] true)
// count how many rows the previous step removed…this is how many columns to remove from the un-transposed table
,Rows_Removed = Table.RowCount(Add_HeaderCheck_COL) – Table.RowCount(Skip_RowsAboveHeader_COL)
// uses Datachant’s method of removing the first N columns from a table…without referencing column names
,Remove_NColumns = Table.RemoveColumns(Remove_HeaderCheck_ROW
,List.FirstN( Table.ColumnNames(Remove_HeaderCheck_ROW), Rows_Removed) )
,Headers = Table.PromoteHeaders(Remove_NColumns, [PromoteAllScalars=true])
in
Headers
I also converted this M code into a function that you can call for every table. If you had each file as a separate row, you could do a Table.AddColumn() and create the "proper" table. I need to dig up that code. Let me know if you need it.
Cheers,
~ Chris Haas
Hello @Anonymous , check out this blog article by PowerPivotPro. This should work for you.
The idea is to find the value in a certain "cell"...and consider that the beginning of the table. In your case it's either A11 or A12. Note that P3's solution only works for 16k rows. This is because the solution "transposes the table" (swaps columns and rows), and Power BI can only support 16k columns.
I wrote a comment midway through that addresses the 16k row limit: search for this text to jump to my comment:
let
Source =
,Starting_Value = “Label” // change as needed
// these first 3 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_ROW = Table.AddColumn(Source, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value)),
Skip_RowsAboveHeader_ROW = Table.Skip(Add_HeaderCheck_ROW, each
[Wanted] true)
,Remove_HeaderCheck_ROW = Table.RemoveColumns(Skip_RowsAboveHeader_ROW
,{“Wanted”})
// keep only the first row to avoid transpose errors for large tables
,FirstRow_forColumns = Table.FirstN(Remove_HeaderCheck_ROW, 1)
,Transpose = Table.Transpose(FirstRow_forColumns)
// these 2 steps are the same as Mike Ward’s solution
,Add_HeaderCheck_COL = Table.AddColumn(Transpose, “Wanted”, each
List.Contains(Record.FieldValues(_), Starting_Value))
,Skip_RowsAboveHeader_COL = Table.Skip(Add_HeaderCheck_COL, each
[Wanted] true)
// count how many rows the previous step removed…this is how many columns to remove from the un-transposed table
,Rows_Removed = Table.RowCount(Add_HeaderCheck_COL) – Table.RowCount(Skip_RowsAboveHeader_COL)
// uses Datachant’s method of removing the first N columns from a table…without referencing column names
,Remove_NColumns = Table.RemoveColumns(Remove_HeaderCheck_ROW
,List.FirstN( Table.ColumnNames(Remove_HeaderCheck_ROW), Rows_Removed) )
,Headers = Table.PromoteHeaders(Remove_NColumns, [PromoteAllScalars=true])
in
Headers
I also converted this M code into a function that you can call for every table. If you had each file as a separate row, you could do a Table.AddColumn() and create the "proper" table. I need to dig up that code. Let me know if you need it.
Cheers,
~ Chris Haas
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
117 | |
104 | |
77 | |
73 | |
52 |
User | Count |
---|---|
145 | |
109 | |
109 | |
90 | |
64 |