Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
gurudayalsahu
Frequent Visitor

Auomated process for the Data cleaning

Hi,

 

I am looking for some type of query or scripting that might help during data cleaning or removing a particular records from a data set. I was exploring the option but not able to find it,

 

The scripting might be helpful to be used in other data set as well.

 

Thanks in advance .

Sahu

 

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

here the step to remove duplicare rows image.pngand filter out unwanted rows

image.png

 

here the code producted bu GUI in the advanced editor

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZHNCsIwEAZfJQSPS91v89tnKT30IBjEIrS+v8GqpKlCz+aU2ctkNl2n4REbp0mrg1qOaZg/LFyQETKhGNQMT4huYRxxFIbN11saL0n3lGVGsE8lnmw5qBmGBLxRzadpfpoEIf40wa2jfBW1YjDF9sX+W1SMO035zaZqWLFSlqzdqqb7mOZhybKBd2YFYldmVYz8d27rug5jms7De4ftv+2wfwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Units Sold" = _t, #" M_Price " = _t, #" S_ Price " = _t, #" G_Sales " = _t, #"  Sales " = _t, #" Profit " = _t, Date = _t, User_Name = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Units Sold", Int64.Type}, {" M_Price ", type text}, {" S_ Price ", type text}, {" G_Sales ", type text}, {"  Sales ", type text}, {" Profit ", type text}, {"Date", type date}, {"User_Name", type text}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type"),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([User_Name] <> "test"))
in
    #"Filtered Rows"

 

 

 

 

View solution in original post

6 REPLIES 6
Anonymous
Not applicable

«Vaste programme...» if you don't give more specific specs!

 

Thanks for the reply.

 

There are more than 4 Tables in the current dataset, where we wanted to get all  records to be removed based on a column name "user_role" having values  as "Test" avaiblle in one of the table named "mobileusers", while all the remaining tables are connected with a column name "User_name"

 

We wanted a have some auotomated process that should help us to get all those test entries removed and should not appear in the existign calculations.

 

Thanks once again for your further suggestion on this.

 

With best,

Sahu

 

 

 

 

 

 

Anonymous
Not applicable

peraphs this function could be useful

 

Table.RemoveMatchingRows(table as table, rows as list, optional equationCriteria as any) as table

 

But let me to insist, that if you want a specific help you have to give a specific spec.

 

You should give some example tables of your input data and describe what you expect, perhaps using a table in addition to the verbal description.

Thanks you both for the reply.

 

I am sharing a data Just to share further details on this requirement.

Units Sold M_Price  S_ Price  G_Sales   Sales  Profit DateUser_Name
1618.5 $       3.00 $    20.00 $   32,370.00 $  32,370.00 $  16,185.001/1/2014pinki
1321 $       3.00 $    20.00 $   26,420.00 $  26,420.00 $  13,210.001/1/2014test
2178 $       3.00 $    15.00 $   32,670.00 $  32,670.00 $  10,890.006/1/2014pinki
888 $       3.00 $    15.00 $   13,320.00 $  13,320.00 $    4,440.006/1/2014sunita
2470 $       3.00 $    15.00 $   37,050.00 $  37,050.00 $  12,350.006/1/2014manisha
2179 $       3.00 $    15.00 $   32,670.00 $  32,670.00 $  10,890.006/1/2014pinki
888 $       3.00 $    15.00 $   13,320.00 $  13,320.00 $    4,440.006/1/2014sunita

 

In the above example, we wanted to remove the data from the user named "Test" and also repeated data.

 

Thanks,

Sahu

 

Anonymous
Not applicable

here the step to remove duplicare rows image.pngand filter out unwanted rows

image.png

 

here the code producted bu GUI in the advanced editor

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZHNCsIwEAZfJQSPS91v89tnKT30IBjEIrS+v8GqpKlCz+aU2ctkNl2n4REbp0mrg1qOaZg/LFyQETKhGNQMT4huYRxxFIbN11saL0n3lGVGsE8lnmw5qBmGBLxRzadpfpoEIf40wa2jfBW1YjDF9sX+W1SMO035zaZqWLFSlqzdqqb7mOZhybKBd2YFYldmVYz8d27rug5jms7De4ftv+2wfwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Units Sold" = _t, #" M_Price " = _t, #" S_ Price " = _t, #" G_Sales " = _t, #"  Sales " = _t, #" Profit " = _t, Date = _t, User_Name = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Units Sold", Int64.Type}, {" M_Price ", type text}, {" S_ Price ", type text}, {" G_Sales ", type text}, {"  Sales ", type text}, {" Profit ", type text}, {"Date", type date}, {"User_Name", type text}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type"),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([User_Name] <> "test"))
in
    #"Filtered Rows"

 

 

 

 

Helpful resources

Announcements
PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

Top Solution Authors
Top Kudoed Authors