cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
bushpalaneelima
Advocate III
Advocate III

Data Cleaning- in power bi. Remove repeated tickets

My current table appears as follows.

 

The current file sampleThe current file sample

Expected output

Expected output.JPG

 

 

Any ideas or suggestions on how to get this above output.

1 ACCEPTED SOLUTION
v-deddai1-msft
Community Support
Community Support

Hi @bushpalaneelima ,

 

I just create the M-query depend on the output you provided, Would you please tell us the logic of clean data? Please copy the code to the advanced editor:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZJLC4JAFIX/i+BOmfuY0VpGEbVp41IkhrRwYyLS729GUzNtIQWXew7zOHxzmTh2tgefgMBHhMDxnF2lr/UZjTsWaa6NEmBohNddQwDXSFTrW5Y2BwgECcQ3r5zEm2S3ZfJsipI2iu2CdPvNuWstEi1GYkEweMa/I/FiJCkIB8/wMxKCD5KhR7LHo1LnhdHTZtp0WVb3xwsI1wJJEHdEKOh7slqS3HwCNdhwnAsBjoj3lS4u2eerV3aU7M5RG9JwltqOUUk5TGpaSfIE", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Ticket number" = _t, Draft = _t, Market = _t, Year = _t, TotalCount = _t, CompletedCount = _t, #"%ofCompleted" = _t, Status = _t, #"Actual Receiving Date" = _t, #"Published Date" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Ticket number", type text}, {"Draft", type text}, {"Market", type text}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([Year] <> "")),
    #"Replaced Value" = Table.ReplaceValue(#"Filtered Rows1","",null,Replacer.ReplaceValue,{"Draft"}),
    #"Filled Down" = Table.FillDown(#"Replaced Value",{"Draft"}),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Filled Down", "Text After Delimiter", each Text.AfterDelimiter([Draft], "_"), type text),
    #"Grouped Rows" = Table.Group(#"Inserted Text After Delimiter", {"Ticket number"}, {{"all", each _, type table [Ticket number=text, Draft=text, Market=text, Year=text, TotalCount=text, CompletedCount=text, #"%ofCompleted"=text, Status=text, Actual Receiving Date=text, Published Date=text, Text After Delimiter=text]}, {"max", each List.Max([Text After Delimiter]), type text}}),
    #"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"Ticket number", "Draft", "Market", "Year", "TotalCount", "CompletedCount", "%ofCompleted", "Status", "Actual Receiving Date", "Published Date", "Text After Delimiter"}, {"Ticket number.1", "Draft", "Market", "Year", "TotalCount", "CompletedCount", "%ofCompleted", "Status", "Actual Receiving Date", "Published Date", "Text After Delimiter"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded all", each [Text After Delimiter] = [max]),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Text After Delimiter", "max"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Actual Receiving Date", type date}, {"Published Date", type date}})
in
    #"Changed Type1"

 

Untitled picture.png

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Best Regards,

Dedmon Dai

 

View solution in original post

9 REPLIES 9
bushpalaneelima
Advocate III
Advocate III

could this page moved from spanish to english as i do not understand spanish, kind request.

v-deddai1-msft
Community Support
Community Support

Hi @bushpalaneelima ,

 

I just create the M-query depend on the output you provided, Would you please tell us the logic of clean data? Please copy the code to the advanced editor:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZJLC4JAFIX/i+BOmfuY0VpGEbVp41IkhrRwYyLS729GUzNtIQWXew7zOHxzmTh2tgefgMBHhMDxnF2lr/UZjTsWaa6NEmBohNddQwDXSFTrW5Y2BwgECcQ3r5zEm2S3ZfJsipI2iu2CdPvNuWstEi1GYkEweMa/I/FiJCkIB8/wMxKCD5KhR7LHo1LnhdHTZtp0WVb3xwsI1wJJEHdEKOh7slqS3HwCNdhwnAsBjoj3lS4u2eerV3aU7M5RG9JwltqOUUk5TGpaSfIE", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Ticket number" = _t, Draft = _t, Market = _t, Year = _t, TotalCount = _t, CompletedCount = _t, #"%ofCompleted" = _t, Status = _t, #"Actual Receiving Date" = _t, #"Published Date" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Ticket number", type text}, {"Draft", type text}, {"Market", type text}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([Year] <> "")),
    #"Replaced Value" = Table.ReplaceValue(#"Filtered Rows1","",null,Replacer.ReplaceValue,{"Draft"}),
    #"Filled Down" = Table.FillDown(#"Replaced Value",{"Draft"}),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Filled Down", "Text After Delimiter", each Text.AfterDelimiter([Draft], "_"), type text),
    #"Grouped Rows" = Table.Group(#"Inserted Text After Delimiter", {"Ticket number"}, {{"all", each _, type table [Ticket number=text, Draft=text, Market=text, Year=text, TotalCount=text, CompletedCount=text, #"%ofCompleted"=text, Status=text, Actual Receiving Date=text, Published Date=text, Text After Delimiter=text]}, {"max", each List.Max([Text After Delimiter]), type text}}),
    #"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"Ticket number", "Draft", "Market", "Year", "TotalCount", "CompletedCount", "%ofCompleted", "Status", "Actual Receiving Date", "Published Date", "Text After Delimiter"}, {"Ticket number.1", "Draft", "Market", "Year", "TotalCount", "CompletedCount", "%ofCompleted", "Status", "Actual Receiving Date", "Published Date", "Text After Delimiter"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded all", each [Text After Delimiter] = [max]),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Text After Delimiter", "max"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Actual Receiving Date", type date}, {"Published Date", type date}})
in
    #"Changed Type1"

 

Untitled picture.png

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Best Regards,

Dedmon Dai

 

View solution in original post

Hi, i did not under stand the language. yes the solution you gave me worked. i modified as per my need and put your solution inside the need. Thank you, i have accepted your solution. if i am missing something, please let me know will do that. i accepted the solution.

@v-deddai1-msft , thank you for your reponse, this worked. i modified based on your solutions for my need.

 

let
    Source = Excel.Workbook(File.Contents("\\mydrive.gsk.com@SSL\DavWWWRoot\personal\ajayvarma_x_saikumar_gsk_com\Documents\Digital_Job_Tracking_ConsumerDigital_2020 - Revised.xlsx"), null, true),
    #"2020_Sheet" = Source{[Item="2020",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(#"2020_Sheet", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Ticket Number", type text}, {"Draft", type text}, {"Veeva #(lf)Channel", type text}, {"Market", type text}, {"Market Contact", type text}, {"Category", type text}, {"Channel", type text}, {"Brand", type text}, {"Sub Brand", type text}, {"Global/#(lf)Local", type text}, {"Name of the Global Master", type text}, {"Global#(lf)Master Year", type text}, {"Total No. of #(lf)Slides in GM", type text}, {"No. of Global Slides #(lf)in localised version", type text}, {"% of Global slides in#(lf)Localised version", type text}, {"No of Local #(lf)Slides(if any)", type text}, {"Zinc Code", type text}, {"Presentation Name on iRep", type text}, {"Job Type", type text}, {"Date of #(lf)Approval", type date}, {"Status", type text}, {"Actual #(lf)Receiving Date", type date}, {"DC Start Date", type date}, {"Delivery Date", type date}, {"Content Launch Date", type date}, {"Designer Name", type text}, {"Designer #(lf)Engg Min", Int64.Type}, {"Slide Count", Int64.Type}, {"Main Folder #(lf)(Directory)", type text}, {"Sub Folder #(lf)(Sub-Directory)", type text}, {"Comments", type text}}),
    #"Filtered Rows2" = Table.SelectRows(#"Changed Type", each ([Ticket Number] <> null)),
    #"Renamed Columns1" = Table.RenameColumns(#"Filtered Rows2",{{"Veeva #(lf)Channel", "Veeva Channel"}, {"Global/#(lf)Local", "Global_or_Local"},{"Global#(lf)Master Year","Global Master Year"} ,{"Total No. of #(lf)Slides in GM", "Total Number of Slides within GM"}, {"No. of Global Slides #(lf)in localised version", "Number of Global Slides with localised version"}, {"% of Global slides in#(lf)Localised version", "Percentage of Global slides with Localised version"}, {"No of Local #(lf)Slides(if any)", "Number of Local Slides"}, {"Date of #(lf)Approval","Date of Approval"} ,{"Actual #(lf)Receiving Date", "Actual Receiving Date"},{"Designer #(lf)Engg Min","Designer Engg Min"},{"Main Folder #(lf)(Directory)","Main Folder"},{"Sub Folder #(lf)(Sub-Directory)","Sub Folder"}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Renamed Columns1", "Draft", "Draft - Copy"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Duplicated Column", "Draft - Copy", Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv), {"Draft - Copy.1", "Draft - Copy.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Draft - Copy.1", type text}, {"Draft - Copy.2", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Draft - Copy.1"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Draft - Copy.2", "DraftNumber"}}),
    #"Filtered Rows" = Table.SelectRows(#"Renamed Columns", each [#"Global Master Year"] <> ""),
    #"Replaced Value" = Table.ReplaceValue(#"Filtered Rows","",null,Replacer.ReplaceValue,{"Draft"}),
    #"Filled Down" = Table.FillDown(#"Replaced Value",{"Draft"}),
    #"Inserted Text After Delimiter" = Table.AddColumn(#"Filled Down", "Text After Delimiter", each Text.AfterDelimiter([Draft], "_"), type text),
    #"Grouped Rows" = Table.Group(#"Inserted Text After Delimiter", {"Ticket Number"}, {{"ALL", each _, type table[Ticket Number=text, Draft=text, Veeva Channel=text , Market=text, Market Contact=text, Category=text, Channel=text, Brand=text, Sub Brand=text, Global_or_Local=text, Name of the Global Master=text, Global Master Year=text, Total Number of Slides within GM=text, Number of Global Slides with localised version=text, Percentage of Global slides with Localised version=text , Number of Local Slides=text, Zinc Code=text, Presentation Name on iRep = text, Job Type=text, Date of Approval = text, Status =  text, Actual Receiving Date=text, DC Start Date = text, Delivery Date=text, Content Launch Date=text, Designer Name=text, Designer Engg Min=text, Slide Count = text, Main Folder = text, Sub Folder = text, Comments = text, Text After Delimiter=text]},{"max", each List.Max([Text After Delimiter]), type text}}),
    #"Expanded ALL" = Table.ExpandTableColumn(#"Grouped Rows", "ALL", {"Ticket Number", "Draft", "Veeva Channel", "Market", "Market Contact", "Category", "Channel", "Brand", "Sub Brand", "Global_or_Local", "Name of the Global Master", "Global Master Year", "Total Number of Slides within GM", "Number of Global Slides with localised version", "Percentage of Global slides with Localised version", "Number of Local Slides", "Zinc Code", "Presentation Name on iRep", "Job Type", "Date of Approval", "Status", "Actual Receiving Date", "DC Start Date", "Delivery Date", "Content Launch Date", "Designer Name", "Designer Engg Min", "Slide Count", "Main Folder", "Sub Folder", "Comments", "Text After Delimiter"}, {"Ticket Number.1", "Draft", "Veeva Channel", "Market", "Market Contact", "Category", "Channel", "Brand", "Sub Brand", "Global_or_Local", "Name of the Global Master", "Global Master Year", "Total Number of Slides within GM", "Number of Global Slides with localised version", "Percentage of Global slides with Localised version", "Number of Local Slides", "Zinc Code", "Presentation Name on iRep", "Job Type", "Date of Approval", "Status", "Actual Receiving Date", "DC Start Date", "Delivery Date", "Content Launch Date", "Designer Name", "Designer Engg Min", "Slide Count", "Main Folder", "Sub Folder", "Comments", "Text After Delimiter"}),
    #"Filtered Rows1" = Table.SelectRows(#"Expanded ALL", each [Text After Delimiter] = [max]),
    #"Changed Type2" = Table.TransformColumnTypes(#"Filtered Rows1",{{"Actual Receiving Date", type date}, {"DC Start Date", type date}, {"Delivery Date", type date}, {"Content Launch Date", type date}})
in
    #"Changed Type2"

@v-deddai1-msft , Thank you for your response, When i copy pasted the code i got error. IT could be casue it is a json file and i am loading an excel file. But looking at the code it does answer the question and provide inputs to the challenge. Thank you for doing that.

 

i have one question in the code you have shared i could not see varible "Name" could it be possiable to inculde?

Hi @bushpalaneelima ,

 

Would you please create a blank query and paste the code in it? 

 

Best Regards,

Dedmon Dai

collinq
Super User II
Super User II

Hi bushpalaneelima,

Is it that you are trying to get the ticket number with the highest Draft number?  I see that you have the same ticket number repeatedly but that the lines are different afterwards. 

 

If my thought is correct, then you can do this by filtering for the Max date or Max draft field for each ticket number.

 

I would appreciate Kudos if my response was helpful. I would also appreciate it if you would Mark this As a Solution if it solved the problem. Thanks!



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!
Private message me for consulting or training needs.




this is showing in spanish could it be converted into english. please. i need to mark this as sloved. please help.

Hello Sir, First of all Thank you for the help!

 

Answers to the question as follows: 

 

Is it that you are trying to get the ticket number with the highest Draft number?  I see that you have the same ticket number repeatedly but that the lines are different afterwards. Yes sir, you got this point correct. In total 3 variables(Ticket number,Draft,Name) are in the play. Ticket number when resubmited for change it created new draft hence we are considering only the latest draft ticket. Now, the challenge was some of the tickets have multiple jobs the name feilds captures the multiple jobs and the ticket number but as you noticed the draft does not repeat for the name feild of the same ticket.

 

If my thought is correct, then you can do this by filtering for the Max date or Max draft field for each ticket number. I tried doing the same and got inconsistant out put, as date feilds does not have all the dates captured. Max draft is not captured for the multiple jobs feilds.

This data set is extracted from a sytem in excel or csv files.

If you have any suggestion am looking forward, please help.

Thank you once again, kudos to your work sample. it does help. Is there a button i need to hit for kudos let know, i am happy to do that sir.

 

Regards,

Neelima.

Helpful resources

Announcements
MBAS on Demand

2021 Release Wave 2 Plan

Power Platform release plan for the 2021 release wave 2 describes all new features releasing from October 2021 through March 2022.

July 2021 Update 768x460.png

Check it out!

Click here to read more about the July 2021 Updates

Power Query PA Forum 768x460.png

Check it out!

Did you know that you can visit the Power Query Forum in Power BI and now Power Apps

Urdu Hindi D365 Bootcamp 768x460.png

Urdu Hindi D365 Bootcamp

Dont miss our very own April Dunnam’s The Developer Guide to the Galaxy! Find out what the Power Platform has to offer for the traditional developer.

Top Solution Authors
Top Kudoed Authors