Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
howie2293
Frequent Visitor

Remove duplicate Rows within a timeframe of the original entry

I have idientified on my data set that there are duplicate entries being entered creates. within 5 mintues of the orginal entry. Is there a way to be able remove the duplicate rows within that is entered within that 5 min time frame. It does have Each row does have the a feild that does has the same feild called case_ID and the time feild is SSL_Event_time giveing both the date and time.

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

my  shortest solution

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("1ZhLbhRBDIavgrIG4WfZ7luwRiw5AvfHPV094x5NGRRYAIpQEumLn7/tmq9f31DdCAFC3j6+ffnx/cf3D5TfAX4G+0xA8AFgg/1X6OGa/719+3hiYuYNRjcsIFh8FGy4UIMx3qz5MEUsmIFzh01rIEHFGmFQh0kc1oyNpWBBAUsMN/QztuHFSZZMUoPJHbMoTgYrPqzhE2Yb2IwtwumCsf8eZsXJCDRtsEzf4aQKPVLCSMNwiflGt0wap4tuBWO/5fh1SnxjPZqLWbJQ/xnmZNSkRA/heABqTUnAkA4bE/OwghEwxRKLDXhizFAwHSOWMsXsSTjKPSALXLAQXSogMcXZJaZFOJy2eJlJpI1kWhs+/IEZOS2FkxjP2EgQR8HYo8XOVja3KJioLAuwY6dMLQgq5tBiMeckqxUnHZg7J/VoLt3HpFZs4HJO7tiMTdmsFGBX37puvMFZN4eaSc9pgh02x6vn7IIV9pySNZYaGL/E3DyIdIWtnPzXMX2NieJyUe3YeCnT3Mmw3AE7Zq+tXbAX1t6JxbswxPdh4zWWl0OXSZwKIMMoIyi/uBPOrTqJIcOlAM7SCQdn3W5LoGLOa3UndhSAQSkKFrTPiDVG5+RyHGXgpUibWcJzd++zJGdXwQatz5nEjrW4WyOoWK4B6zA/5mQMk/GJKyjSSecCVsywqxzH3U2vbrqM9ariTWafjKByiAqARhedjOlkxjI+XcARrT0rYMFwP58b7F5yYarYwOVRf8EYoWAEbe3mvZx/G6hao2hzeVwLO4blNhEQoq5yKmdKYvgll8JtyS/gBYtuNEws23mgls6U/Dk6levMJg0sSkgseD3BZIN5e6XKJS7x5WtnLdgnsGLD1oKVufzzbcXEteipoHXRZa6svVcGFGsItL6ad2yOPst8So0Ogce6ek9gwfLfunqJHeMhZ52MEh2StdaQ7tbi0mRIzUv1GSyYQrMR5L4RzPNavthTHGuly9wJE6zYkK5651sVNFNTMFfqMNITEyvZpHxNrk832Xh2Zt7OowwIImlzyXy2WNAF07YzCyZFd9nd0unuvrgyPKtYtxHk/pzINsRSAJI+k/f9E3BxUjG6lBwPMzdkQq6YNOeN3NdWWmP9C5hqF9uBmYuYFw1QXg7r7ZPY/EhJ9mutYKbaFUAeg8gumGPr5KPcVpvLgVonZ7nzIUxYMfRunpxY5EOsCqc9FGWu/5uToX+ORb9AJF5mkjPSTjjnq9+JykdKiY31J4g7dso0kO3Psdyx3ZZTOWPzWu79Bu4U8MDylKqYaVcAfTTXjn37CQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ACT_ENTRY_OBJID = _t, ACTION_TO = _t, datetime = _t, Id = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ACT_ENTRY_OBJID", Int64.Type}, {"ACTION_TO", type text}, {"datetime", type datetime}, {"Id", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"count", each Table.RowCount(_), type number}, {"all", each Table.Group(_, {"datetime"}, {"CLOSE", (c)=> Table.First(c)},GroupKind.Local,(x,y)=>Number.From(Duration.Minutes(y[datetime]-x[datetime])>5))}}),
    #"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"datetime", "CLOSE"}, {"all.datetime", "all.CLOSE"}),
    #"Expanded all.CLOSE" = Table.ExpandRecordColumn(#"Expanded all", "all.CLOSE", {"ACT_ENTRY_OBJID", "ACTION_TO"}, {"all.CLOSE.ACT_ENTRY_OBJID", "all.CLOSE.ACTION_TO"})
in
    #"Expanded all.CLOSE"

View solution in original post

7 REPLIES 7
Mariusz
Community Champion
Community Champion

Hi @howie2293 

 

Sure you can use the below code or see the attached.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTIw1jew0DcyMDJQMDSyMjBQitXBI5EEkjAkRSIZJGFEtEQsAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Id = _t, datetime = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"datetime", type datetime}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"rows", each _, type table [Id=nullable text, datetime=nullable datetime]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", 
        each 
            if 
                List.Contains( [rows][datetime], #date( 2020, 8, 2 ) & #time( 12, 0, 0 ) ) 
            then 
                Table.Distinct( [rows] ) 
            else 
                [rows]),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"datetime"}, {"datetime"})
in
    #"Expanded Custom"

 If you need somthing more specific to your scenerio, please provide a sample.

 

Best Regards,
Mariusz

If this post helps, then please consider Accepting it as the solution.

Please feel free to connect with me.
LinkedIn

 

Hi @Mariusz

 

Thankyou for your answer. Unfortunatly it close to what I am after but not 100%. What I am needed to produce from the data itself is to be able to show trends over days and months Showing the traffic going into certian queue. While taking out the upto an hour duplicate of when i have someone resubmitting during that time frame.

 

Looking at what you have produced It gives a slight time frame that I able to produce. But it locks this down to the group coloum taking the Event identifier out of the equation.

 

As requested, I have thrown in the link below with some example data.

https://drive.google.com/file/d/103-Yeb0Suu7B39V-9TtqJlcBlt0AJ0wQ/view?usp=sharing 

 

 

 

 

Anonymous
Not applicable

my  shortest solution

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("1ZhLbhRBDIavgrIG4WfZ7luwRiw5AvfHPV094x5NGRRYAIpQEumLn7/tmq9f31DdCAFC3j6+ffnx/cf3D5TfAX4G+0xA8AFgg/1X6OGa/719+3hiYuYNRjcsIFh8FGy4UIMx3qz5MEUsmIFzh01rIEHFGmFQh0kc1oyNpWBBAUsMN/QztuHFSZZMUoPJHbMoTgYrPqzhE2Yb2IwtwumCsf8eZsXJCDRtsEzf4aQKPVLCSMNwiflGt0wap4tuBWO/5fh1SnxjPZqLWbJQ/xnmZNSkRA/heABqTUnAkA4bE/OwghEwxRKLDXhizFAwHSOWMsXsSTjKPSALXLAQXSogMcXZJaZFOJy2eJlJpI1kWhs+/IEZOS2FkxjP2EgQR8HYo8XOVja3KJioLAuwY6dMLQgq5tBiMeckqxUnHZg7J/VoLt3HpFZs4HJO7tiMTdmsFGBX37puvMFZN4eaSc9pgh02x6vn7IIV9pySNZYaGL/E3DyIdIWtnPzXMX2NieJyUe3YeCnT3Mmw3AE7Zq+tXbAX1t6JxbswxPdh4zWWl0OXSZwKIMMoIyi/uBPOrTqJIcOlAM7SCQdn3W5LoGLOa3UndhSAQSkKFrTPiDVG5+RyHGXgpUibWcJzd++zJGdXwQatz5nEjrW4WyOoWK4B6zA/5mQMk/GJKyjSSecCVsywqxzH3U2vbrqM9ariTWafjKByiAqARhedjOlkxjI+XcARrT0rYMFwP58b7F5yYarYwOVRf8EYoWAEbe3mvZx/G6hao2hzeVwLO4blNhEQoq5yKmdKYvgll8JtyS/gBYtuNEws23mgls6U/Dk6levMJg0sSkgseD3BZIN5e6XKJS7x5WtnLdgnsGLD1oKVufzzbcXEteipoHXRZa6svVcGFGsItL6ad2yOPst8So0Ogce6ek9gwfLfunqJHeMhZ52MEh2StdaQ7tbi0mRIzUv1GSyYQrMR5L4RzPNavthTHGuly9wJE6zYkK5651sVNFNTMFfqMNITEyvZpHxNrk832Xh2Zt7OowwIImlzyXy2WNAF07YzCyZFd9nd0unuvrgyPKtYtxHk/pzINsRSAJI+k/f9E3BxUjG6lBwPMzdkQq6YNOeN3NdWWmP9C5hqF9uBmYuYFw1QXg7r7ZPY/EhJ9mutYKbaFUAeg8gumGPr5KPcVpvLgVonZ7nzIUxYMfRunpxY5EOsCqc9FGWu/5uToX+ORb9AJF5mkjPSTjjnq9+JykdKiY31J4g7dso0kO3Psdyx3ZZTOWPzWu79Bu4U8MDylKqYaVcAfTTXjn37CQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ACT_ENTRY_OBJID = _t, ACTION_TO = _t, datetime = _t, Id = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ACT_ENTRY_OBJID", Int64.Type}, {"ACTION_TO", type text}, {"datetime", type datetime}, {"Id", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"count", each Table.RowCount(_), type number}, {"all", each Table.Group(_, {"datetime"}, {"CLOSE", (c)=> Table.First(c)},GroupKind.Local,(x,y)=>Number.From(Duration.Minutes(y[datetime]-x[datetime])>5))}}),
    #"Expanded all" = Table.ExpandTableColumn(#"Grouped Rows", "all", {"datetime", "CLOSE"}, {"all.datetime", "all.CLOSE"}),
    #"Expanded all.CLOSE" = Table.ExpandRecordColumn(#"Expanded all", "all.CLOSE", {"ACT_ENTRY_OBJID", "ACTION_TO"}, {"all.CLOSE.ACT_ENTRY_OBJID", "all.CLOSE.ACTION_TO"})
in
    #"Expanded all.CLOSE"

Can you explain your data and show an example of what you want? You are using terms like case_ID, SSL_Event_time, and Event Identifier, but you have no columns with those names.

 
 
 
 
 
 

2020-08-03 08_44_19-Example Data - Excel.png



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Appolgies for the confusion, I had renamed the Case_ID and  SSL_Event_time to the easily refrence with with the function that Marisuz provided

 

case_ID = Id

SSL_Event_time = datetime

ACT_ENTRY_OBJID is the event identfiter 

 

Based on the logic i looking to remove event rows created when someone has duplicated their actions.

Then using the date and time. the plan is to give anylsis on time of day and weekly/monthly stats. THis part already done it was just the peice of being able to remove user actions of resending tickets and creating mutiple entries.

 

howie2293_1-1596473679005.png

 

 

 

 

Anonymous
Not applicable

try this:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("1ZhLbhRBDIavgrIG4WfZ7luwRiw5AvfHPV094x5NGRRYAIpQEumLn7/tmq9f31DdCAFC3j6+ffnx/cf3D5TfAX4G+0xA8AFgg/1X6OGa/719+3hiYuYNRjcsIFh8FGy4UIMx3qz5MEUsmIFzh01rIEHFGmFQh0kc1oyNpWBBAUsMN/QztuHFSZZMUoPJHbMoTgYrPqzhE2Yb2IwtwumCsf8eZsXJCDRtsEzf4aQKPVLCSMNwiflGt0wap4tuBWO/5fh1SnxjPZqLWbJQ/xnmZNSkRA/heABqTUnAkA4bE/OwghEwxRKLDXhizFAwHSOWMsXsSTjKPSALXLAQXSogMcXZJaZFOJy2eJlJpI1kWhs+/IEZOS2FkxjP2EgQR8HYo8XOVja3KJioLAuwY6dMLQgq5tBiMeckqxUnHZg7J/VoLt3HpFZs4HJO7tiMTdmsFGBX37puvMFZN4eaSc9pgh02x6vn7IIV9pySNZYaGL/E3DyIdIWtnPzXMX2NieJyUe3YeCnT3Mmw3AE7Zq+tXbAX1t6JxbswxPdh4zWWl0OXSZwKIMMoIyi/uBPOrTqJIcOlAM7SCQdn3W5LoGLOa3UndhSAQSkKFrTPiDVG5+RyHGXgpUibWcJzd++zJGdXwQatz5nEjrW4WyOoWK4B6zA/5mQMk/GJKyjSSecCVsywqxzH3U2vbrqM9ariTWafjKByiAqARhedjOlkxjI+XcARrT0rYMFwP58b7F5yYarYwOVRf8EYoWAEbe3mvZx/G6hao2hzeVwLO4blNhEQoq5yKmdKYvgll8JtyS/gBYtuNEws23mgls6U/Dk6levMJg0sSkgseD3BZIN5e6XKJS7x5WtnLdgnsGLD1oKVufzzbcXEteipoHXRZa6svVcGFGsItL6ad2yOPst8So0Ogce6ek9gwfLfunqJHeMhZ52MEh2StdaQ7tbi0mRIzUv1GSyYQrMR5L4RzPNavthTHGuly9wJE6zYkK5651sVNFNTMFfqMNITEyvZpHxNrk832Xh2Zt7OowwIImlzyXy2WNAF07YzCyZFd9nd0unuvrgyPKtYtxHk/pzINsRSAJI+k/f9E3BxUjG6lBwPMzdkQq6YNOeN3NdWWmP9C5hqF9uBmYuYFw1QXg7r7ZPY/EhJ9mutYKbaFUAeg8gumGPr5KPcVpvLgVonZ7nzIUxYMfRunpxY5EOsCqc9FGWu/5uToX+ORb9AJF5mkjPSTjjnq9+JykdKiY31J4g7dso0kO3Psdyx3ZZTOWPzWu79Bu4U8MDylKqYaVcAfTTXjn37CQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ACT_ENTRY_OBJID = _t, ACTION_TO = _t, datetime = _t, Id = _t]),
    #"Renamed Columns" = Table.RenameColumns(Source,{{"datetime", "when"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"ACT_ENTRY_OBJID", Int64.Type}, {"ACTION_TO", type text}, {"when", type datetime}, {"Id", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"ALL", each _, type table [ACT_ENTRY_OBJID=number, ACTION_TO=text, datetime=datetime, Id=text]}, {"COUNT", each Table.RowCount(_), type number}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "close", each Table.Group([ALL], {"when"}, {"CLOSE", each Table.First(_)},GroupKind.Local,(x,y)=>Number.From(Duration.Minutes(y[when]-x[when])>5))),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"close","COUNT"}),
    #"Expanded close" = Table.ExpandTableColumn(#"Removed Other Columns", "close", {"when", "CLOSE"}, {"close.when", "close.CLOSE"}),
    #"Expanded close.CLOSE" = Table.ExpandRecordColumn(#"Expanded close", "close.CLOSE", {"ACT_ENTRY_OBJID", "ACTION_TO", "when", "Id"}, {"close.CLOSE.ACT_ENTRY_OBJID", "close.CLOSE.ACTION_TO", "close.CLOSE.when", "close.CLOSE.Id"})
in
    #"Expanded close.CLOSE"

 

 

could be shorter but now I don't have enough time to be short :-).

 

Hello @howie2293 

 

try this solution

let
	Source = #table
	(
		{"ID","datetime","ID2"},
		{
			{"1","01/01/2020","1"},	{"2","01/01/2020","1"},	{"3","01/01/2020","1"},	{"4","01/01/2020","1"},	{"5","01/01/2020","1"},	{"6","01/01/2020","1"},	{"7","01/01/2020","1"},	
			{"8","01/01/2020","2"},	{"9","01/01/2020","2"},	{"10","02/01/2020","3"},	{"11","02/01/2020","4"}
		}
	),
	TableDistinct = Table.Distinct(Source, {"datetime", "ID2"})
in
	TableDistinct

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

Top Solution Authors
Top Kudoed Authors