Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Reply
irnm8dn
Post Prodigy
Post Prodigy

Dedupe ID - Keep most recent record

I have a list of Visitor IDs to a website for a specific timespan.  For purposes of summarizing the data, I would like to dedeupe the IDs in the list, keep only the timestamp of the most recent visit.

 

The Visitor ID is a column created from a concatenation DAX statement.

 

Is there a preferred way of doing this?

1 ACCEPTED SOLUTION
Vvelarde
Community Champion
Community Champion

@irnm8dn

 

(DAX(SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb);"LastDate":MAX(analytics_clickstream_201802[date_time]))

)).

 

Replace : with

 

If you have problems with ; replace it with , (This is for regional settings)

 

Regards

Victor

 

 




Lima - Peru

View solution in original post

12 REPLIES 12
Ashish_Mathur
Super User
Super User

Hi,

 

Share your data and also show the expected result.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

@Ashish_Mathur

 

The link below containas a sample data .xslx (Two tabs, Raw Data and Desired State)

 

Simply put, I have one column of Visitor IDs, and the other a timestamp.  If for example I look over the course of a year, I may see repeated Visitor IDs (i.e. the individual has visited the site multiple times over the last year) however the adjacent column is timestamped.

 

I would like to dedupe the Visitor ID column (Column A) so the data is distilled down to a single instance, showing the latest (newest) timestamp.

 

https://www.dropbox.com/s/6wdubhk9kj8kalq/Dedupe%20Sample%20Data.xlsx?dl=0

 

Thanks for your help!

 

 

Vvelarde
Community Champion
Community Champion

@irnm8dn

 

Hi, you can create a New Table (Modeling - New Table)

 

 

NewTable =
SUMMARIZECOLUMNS (
    'Sample Data'[Visitor ID];
    "LASTDATE"; MAX ( 'Sample Data'[Date] )
)

Regards

VIctor

 




Lima - Peru

@Vvelarde

 

It occurred to me as you were posting your response that the Visitor ID is based on a concatentation based on a Dax Function.

 

Will this work?

Vvelarde
Community Champion
Community Champion

@irnm8dn

 

Yes, this should be Work.

 

It takes the Visitor ID (after the concatenation was made it). The visitor ID is a calculated column right?

 

 




Lima - Peru

@Vvelarde

 

This is the statement I am using for the ID creation:

visid_comb = CONCATENATE(analytics_clickstream_201802[post_visid_low],analytics_clickstream_201802[post_visid_high])

 

This is your recommendation, modified for the table:

Column = SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb);"LastDate":MAX(analytics_clickstream_201802[date_time]))

 

This is the error message when I apply your suggestion, and I am not sure where my error is:

The syntax for ')' is incorrect. (DAX(SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb);"LastDate":MAX(analytics_clickstream_201802[date_time]))

)).

 

Appreciate all of the help.

 

 

Vvelarde
Community Champion
Community Champion

@irnm8dn

 

(DAX(SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb);"LastDate":MAX(analytics_clickstream_201802[date_time]))

)).

 

Replace : with

 

If you have problems with ; replace it with , (This is for regional settings)

 

Regards

Victor

 

 




Lima - Peru

Neither change succeeded.

 

Assuming I followed your instruction properly.

 

Column = SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb]);"LastDate",MAX(analytics_clickstream_201802[date_time]))

 

The syntax for ';' is incorrect. (DAX(SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb]);"LastDate",MAX(analytics_clickstream_201802[date_time])))).

Vvelarde
Community Champion
Community Champion

@irnm8dn

 

Ok.

 

Replace ALL ; with , Dont Mixed.

 

And Use a Modeling - NewTable Not a New Column.

 

Regards

 

Victor

 

 




Lima - Peru

@Vvelarde

 

So I think what you're suggesting: 

 

1.  Modeling>New Table

2.  Add Dax Statement: 

 

Table =  SUMMARIZECOLUMNS(analytics_clickstream_201802[visid_comb]),"LastDate",MAX(analytics_clickstream_201802[date_time]))

3.  Then what?

 

Thanks.

Vvelarde
Community Champion
Community Champion

@irnm8dn

 

This will create you a new Table with the data.

 

Regards

 

Victor




Lima - Peru

Thank you.

Helpful resources

Announcements
March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Fabric Community Conference

Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.

Fabric Partner Community

Microsoft Fabric Partner Community

Engage with the Fabric engineering team, hear of product updates, business opportunities, and resources in the Fabric Partner Community.