Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I have a problem that I need to solve in Power Query before the data is loaded into Power BI. Whilst I can do the basic things here I have not learnt M yet.
In my data set I have a list of customer actions linked to the same application number, that is static for all actions.
The actions are listed as a status change number, and depending on the actions you can have many numbers relating to the action stages.
Customer Application No Current Status Date Set
Dave 1 200 01/01/01
Geoff 2 200 01/01/01
Dave 1 300 02/01/01
Dave 1 950 02/01/01
Geoff 2 300 02/01/01
Simon 3 200 03/01/01
Geoff 2 400 02/01/01
Dave 4 200 03/01/01
If an application has been undertaken in error, it is changed to status 950. People can add duplicated applications to the system and we cannot delete them only change to 950 created in error
I have filtered the curent status column to remove 950 - However all other status numbers related to the application are still brought through into power BI, giving me duplicate applications that I do not need.
What I want to do is:
If an Application status changes to 950 then remove this application and all rows of data linked to this application in the sheet.
Is there a way to remove all the rows of data linked to the application number if the status code reaches 950?
Thank you for any asssitance you can offer
Solved! Go to Solution.
I'm glad it worked. You should mark it as a solution in case anyone is searching something similar.
You've mostly understood the code, I'll try to explain it step by step
SelectRelevant = Table.SelectColumns(#"Changed Type", {"Application Number", "Current Status"}),
We get only the 2 columns we're going to work with, to reduce data load.
Select950 = Table.SelectRows(SelectRelevant, each [Current Status] = 950),
We filter the relevant columns for only rows where status is 950
Distinct = Table.Distinct(Table.SelectColumns(Select950, {"Application Number"})),
We get only the [Application Number] column, and because we need any [Application Number] just once, we make it distinct.
Remove950 = Table.RemoveColumns(Table.NestedJoin(#"Changed Type", {"Application Number"}, Distinct, {"Application Number"}, "d", JoinKind.LeftAnti), {"d"})
We join our table as it was before these calculations, with the newfound Distinct table, keeping only non-matches (LeftAnti means anything from left table (#"Changed Type") which didn't match the right table (Distinct)), and subsequently we remove the extra column [d].
Hi @DemoFour
You could try editing your query in the advanced editor, adding this:
,SelectRelevant = Table.SelectColumns(PreviousStep, {"Application No", "Current Status"}),
Select950 = Table.SelectRows(SelectRelevant, each [Current Status] = 950),
Distinct = Table.Distinct(Table.SelectColumns(Select950, {"Application No"})),
Remove950 = Table.RemoveColumns(Table.NestedJoin(PreviousStep, {"Application No"}, Distinct, {"Application No"}, "d", JoinKind.LeftAnti), {"d"})
in
Remove950
after the last step and before the 'in'.
PreviousStep should change to the previous step's name.
Cheers
Thank you for your assistance, I get errors if I paste the text into the advanced editor.
I am trying to amend with the column names, but I get a Expression.Error: The import PreviousStep matchs no exxports. Did you miss a module reference?
I am trying to amend the code, but I am not too sure what I am doing is correct.
Can you expline how I would write the statement to remove the rows based on the values, or is it not that simple?
I have now written the code into the editor and not got any errors, but once it is executed I am left with an empty table that has two coloumns Application Number and Current status.
My aim is to just remove the rows that contain the application steps for any application that has reached 950 and leave all the data from the output in the table untouched.
Sorry I did not see your reply.
This is what I have got to, sorry I have not wrote this before so its a steep learning curve
let
Source = Excel.Workbook(File.Contents("S:\... Application Status Changes.xlsx"), null, true),
#"IAG - HRA Application Status Ch_Sheet" = Source{[Item="IAG - HRA Application Status Ch",Kind="Sheet"]}[Data],
#"Removed Blank Rows" = Table.SelectRows(#"IAG - HRA Application Status Ch_Sheet", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Removed Top Rows" = Table.Skip(#"Removed Blank Rows",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"Column2", "Column8", "wl cat", "res pers", "date app", "app sname"}),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Columns", "date time", "date time - Copy"),
#"Reordered Columns" = Table.ReorderColumns(#"Duplicated Column",{"date time", "date time - Copy", "dt change", "date upd", "pri date", "date enq", "date set", "clnt no", "app no", "init by", "wl id", "app name", "curr status", "method cd", "prior cd", "source cd", "rej reas", "date rej"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"dt change", "Date Change"}, {"date upd", "Date Updated"}, {"pri date", "Priority Date"}, {"date enq", "Date Enquiry"}, {"app no", "Application Number"}, {"init by", "Staff Member"}, {"wl id", "Waiting List ID"}, {"app name", "Application Name"}, {"curr status", "Current Status"}, {"method cd", "Referral Method code"}, {"prior cd", "Priorty code"}, {"source cd", "Referral Source code"}, {"rej reas", "End of Service Reason"}, {"date rej", "Date Rejection"}, {"date time", "date time - Date"}, {"date time - Copy", "date time - Time"}, {"clnt no", "Client Number"}, {"date set", "Date Set"}}),
#"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "Client Application Number", each Text.Combine({Text.From([Client Number], "en-GB"), Text.From([Application Number], "en-GB")}), type text),
#"Reordered Columns1" = Table.ReorderColumns(#"Inserted Merged Column",{"date time - Date", "date time - Time", "Date Change", "Date Updated", "Priority Date", "Date Enquiry", "Date Set", "Client Application Number", "Client Number", "Application Number", "Staff Member", "Waiting List ID", "Application Name", "Current Status", "Referral Method code", "Priorty code", "Referral Source code", "End of Service Reason", "Date Rejection"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns1",{{"date time - Date", type date}, {"date time - Time", type time}, {"Date Change", type date}, {"Date Updated", type date}, {"Priority Date", type date}, {"Date Enquiry", type date}, {"Date Set", type date}, {"Client Application Number", Int64.Type}, {"Client Number", Int64.Type}, {"Application Number", Int64.Type}, {"Current Status", Int64.Type}, {"Date Rejection", type date}}),
#"Select Relevent" = Table.SelectColumns(#"Changed Type", {"Application Number", "Current Status"}),
#"Select950" = Table.SelectRows(#"Select Relevent", each [Current Status] = 950),
#"Distinct" = Table.Distinct(Table.SelectColumns(Select950, {"Application Number"})),
#"Remove950" = Table.RemoveColumns(Table.NestedJoin(#"Select950", {"Application Number"}, Distinct, {"Application Number"}, "d", JoinKind.LeftAnti), {"d"})
in
#"Remove950"
I have written them as seperate lines, Have I missuderstood?
Not bad for a first timer 🙂
Your only issue was the last step, you referenced a different table.
try this:
let
Source = Excel.Workbook(File.Contents("S:\... Application Status Changes.xlsx"), null, true),
#"IAG - HRA Application Status Ch_Sheet" = Source{[Item="IAG - HRA Application Status Ch",Kind="Sheet"]}[Data],
#"Removed Blank Rows" = Table.SelectRows(#"IAG - HRA Application Status Ch_Sheet", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Removed Top Rows" = Table.Skip(#"Removed Blank Rows",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"Column2", "Column8", "wl cat", "res pers", "date app", "app sname"}),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Columns", "date time", "date time - Copy"),
#"Reordered Columns" = Table.ReorderColumns(#"Duplicated Column",{"date time", "date time - Copy", "dt change", "date upd", "pri date", "date enq", "date set", "clnt no", "app no", "init by", "wl id", "app name", "curr status", "method cd", "prior cd", "source cd", "rej reas", "date rej"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"dt change", "Date Change"}, {"date upd", "Date Updated"}, {"pri date", "Priority Date"}, {"date enq", "Date Enquiry"}, {"app no", "Application Number"}, {"init by", "Staff Member"}, {"wl id", "Waiting List ID"}, {"app name", "Application Name"}, {"curr status", "Current Status"}, {"method cd", "Referral Method code"}, {"prior cd", "Priorty code"}, {"source cd", "Referral Source code"}, {"rej reas", "End of Service Reason"}, {"date rej", "Date Rejection"}, {"date time", "date time - Date"}, {"date time - Copy", "date time - Time"}, {"clnt no", "Client Number"}, {"date set", "Date Set"}}),
#"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "Client Application Number", each Text.Combine({Text.From([Client Number], "en-GB"), Text.From([Application Number], "en-GB")}), type text),
#"Reordered Columns1" = Table.ReorderColumns(#"Inserted Merged Column",{"date time - Date", "date time - Time", "Date Change", "Date Updated", "Priority Date", "Date Enquiry", "Date Set", "Client Application Number", "Client Number", "Application Number", "Staff Member", "Waiting List ID", "Application Name", "Current Status", "Referral Method code", "Priorty code", "Referral Source code", "End of Service Reason", "Date Rejection"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns1",{{"date time - Date", type date}, {"date time - Time", type time}, {"Date Change", type date}, {"Date Updated", type date}, {"Priority Date", type date}, {"Date Enquiry", type date}, {"Date Set", type date}, {"Client Application Number", Int64.Type}, {"Client Number", Int64.Type}, {"Application Number", Int64.Type}, {"Current Status", Int64.Type}, {"Date Rejection", type date}}),
#"Select Relevent" = Table.SelectColumns(#"Changed Type", {"Application Number", "Current Status"}),
#"Select950" = Table.SelectRows(#"Select Relevent", each [Current Status] = 950),
#"Distinct" = Table.Distinct(Table.SelectColumns(#"Select950", {"Application Number"})),
#"Remove950" = Table.RemoveColumns(Table.NestedJoin(#"Changed Type", {"Application Number"}, #"Distinct", {"Application Number"}, "d", JoinKind.LeftAnti), {"d"})
in
#"Remove950"
Thank you, I have rewritten as described and got this
let
Source = Excel.Workbook(File.Contents("S:.....IAG - HRA Application Status Changes.xlsx"), null, true),
#"IAG - HRA Application Status Ch_Sheet" = Source{[Item="IAG - HRA Application Status Ch",Kind="Sheet"]}[Data],
#"Removed Blank Rows" = Table.SelectRows(#"IAG - HRA Application Status Ch_Sheet", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
#"Removed Top Rows" = Table.Skip(#"Removed Blank Rows",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"Column2", "Column8", "wl cat", "res pers", "date app", "app sname"}),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Columns", "date time", "date time - Copy"),
#"Reordered Columns" = Table.ReorderColumns(#"Duplicated Column",{"date time", "date time - Copy", "dt change", "date upd", "pri date", "date enq", "date set", "clnt no", "app no", "init by", "wl id", "app name", "curr status", "method cd", "prior cd", "source cd", "rej reas", "date rej"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"dt change", "Date Change"}, {"date upd", "Date Updated"}, {"pri date", "Priority Date"}, {"date enq", "Date Enquiry"}, {"app no", "Application Number"}, {"init by", "Staff Member"}, {"wl id", "Waiting List ID"}, {"app name", "Application Name"}, {"curr status", "Current Status"}, {"method cd", "Referral Method code"}, {"prior cd", "Priorty code"}, {"source cd", "Referral Source code"}, {"rej reas", "End of Service Reason"}, {"date rej", "Date Rejection"}, {"date time", "date time - Date"}, {"date time - Copy", "date time - Time"}, {"clnt no", "Client Number"}, {"date set", "Date Set"}}),
#"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "Client Application Number", each Text.Combine({Text.From([Client Number], "en-GB"), Text.From([Application Number], "en-GB")}), type text),
#"Reordered Columns1" = Table.ReorderColumns(#"Inserted Merged Column",{"date time - Date", "date time - Time", "Date Change", "Date Updated", "Priority Date", "Date Enquiry", "Date Set", "Client Application Number", "Client Number", "Application Number", "Staff Member", "Waiting List ID", "Application Name", "Current Status", "Referral Method code", "Priorty code", "Referral Source code", "End of Service Reason", "Date Rejection"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns1",{{"date time - Date", type date}, {"date time - Time", type time}, {"Date Change", type date}, {"Date Updated", type date}, {"Priority Date", type date}, {"Date Enquiry", type date}, {"Date Set", type date}, {"Client Application Number", Int64.Type}, {"Client Number", Int64.Type}, {"Application Number", Int64.Type}, {"Current Status", Int64.Type}, {"Date Rejection", type date}}),SelectRelevant = Table.SelectColumns(#"Changed Type", {"Application Number", "Current Status"}),
Select950 = Table.SelectRows(SelectRelevant, each [Current Status] = 950),
Distinct = Table.Distinct(Table.SelectColumns(Select950, {"Application Number"})),
Remove950 = Table.RemoveColumns(Table.NestedJoin(#"Changed Type", {"Application Number"}, Distinct, {"Application Number"}, "d", JoinKind.LeftAnti), {"d"})
in
Remove950
The table is all intact and I am just checking that it removes what I need it to.
Thank you so much for your time and patience
It worked thank you, it removed the two applications created in error. So will now remove all new data errors from the data.
to work this backwards as I do not know M
Select950 used Table.SelectRows(SelectRelevent each = 950 This is used to look in the relevent column and select the specifiyed value and select each one of them
Distinct used Table.Distinct(Table.SelectColumns(Select950 Did this then look for all the distinct Application numbers linked to the 950 row ?
Remove950 - I am not too sure here - but this is using the distinct application number to remove all rows associated with the 950 code and then removing all the associated application number rows??
thank you again for your time and patience today
I'm glad it worked. You should mark it as a solution in case anyone is searching something similar.
You've mostly understood the code, I'll try to explain it step by step
SelectRelevant = Table.SelectColumns(#"Changed Type", {"Application Number", "Current Status"}),
We get only the 2 columns we're going to work with, to reduce data load.
Select950 = Table.SelectRows(SelectRelevant, each [Current Status] = 950),
We filter the relevant columns for only rows where status is 950
Distinct = Table.Distinct(Table.SelectColumns(Select950, {"Application Number"})),
We get only the [Application Number] column, and because we need any [Application Number] just once, we make it distinct.
Remove950 = Table.RemoveColumns(Table.NestedJoin(#"Changed Type", {"Application Number"}, Distinct, {"Application Number"}, "d", JoinKind.LeftAnti), {"d"})
We join our table as it was before these calculations, with the newfound Distinct table, keeping only non-matches (LeftAnti means anything from left table (#"Changed Type") which didn't match the right table (Distinct)), and subsequently we remove the extra column [d].
Thank you for all your help today and your explanation, it makes sense and is easy to follow.
That's why I wrote that PreviousStep should be renamed to your previous step.
Let me try to give an example.When you open the advanced editor, you should see something ending like this:
...
#"Changed Type" = ....
in
#"Changed Type"
adding the new code, it should look like this:
...
#"Changed Type" = ....
,SelectRelevant = Table.SelectColumns(#"Changed Type", {"Application No", "Current Status"}),
Select950 = Table.SelectRows(SelectRelevant, each [Current Status] = 950),
Distinct = Table.Distinct(Table.SelectColumns(Select950, {"Application No"})),
Remove950 = Table.RemoveColumns(Table.NestedJoin(#"Changed Type", {"Application No"}, Distinct, {"Application No"}, "d", JoinKind.LeftAnti), {"d"})
in
Remove950
Is that better?