cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
CTan42 Helper II
Helper II

Remove Duplicates?

Below are my power bi sample. I am asked to create a dashboard to present the stat of left orders. For examples, i do a calculation: orders= calculate(count(ID)). The result shows there are 5 orders. It is wrong. It should be 2 orders (just apple and banana). Same goes to calculate the remaining payment. It should be 6dollars instead of sum of the remaining payment column. It is not just about remove "duplicate". Is there anyways to do that?image.pngtableimage.pngordersimage.pngremaining payment

15 REPLIES 15
Highlighted
Super User III
Super User III

Re: Remove Duplicates?

Hey,

 

this is a difficult question, without knowing why there are duplicate records, it's not possible to recommend a way to remove these records.

 

The ID 1 has 3 records with different dates, for this reason, without more information it's not possible to advise you how to get rid of the duplicate rows (at least I'm not able to do this).

 

Do you have any idea why there are duplicates?

 

Can you share sample data, by providing a link to the pbix on onedrive or dropbox?

 

Regards

Hamburg - Germany
If I answer your question, please mark my post as solution, this will also help others.
I accept Kudos :-), If you find my post helpful.

Proud to be a Datanaut!
CTan42 Helper II
Helper II

Re: Remove Duplicates?

@TomMartens

thanks for reply. The idea behind the data is to record the purchase order details of the fruit. For example, Apple is ID 1, Banana is ID 2, Orange is ID 3. Today the shop ordered 10 apples in 1st of January 2017. However, the supplier only delivered 5 apples in 1st January 2017. So the shop just paid 5dollar to supplier which mean there is another 5dollar is yet to be paid. And then the next day 2nd of January 2017, the supplier delivered another 3apples. Hence the tables will record all the delivery. So what i was asked to do is find out how many and how much of orders and payment are yet to be collected and paid. So technically i should take the latest date of the deliver item. Hope that you would understand what i'm trying to explain here. 

Super User III
Super User III

Re: Remove Duplicates?

This makes sense, now I understand.

 

To count the Orders just use distinctcount('yourtable'[ID]). To answer your other questions it would be really helpful if you would prepare sample data, and make this data available. This makes it a lot easier for all of us to provide answers to your questions.

 

Cheers

Hamburg - Germany
If I answer your question, please mark my post as solution, this will also help others.
I accept Kudos :-), If you find my post helpful.

Proud to be a Datanaut!
CTan42 Helper II
Helper II

Re: Remove Duplicates?

@TomMartens

u can create same tables as screenshot in your power bi. Due to P&C, i cant show u the real dataset. 

Super User III
Super User III

Re: Remove Duplicates?

Hey,

 

I couldn't resist and created this measure

Remaining Payment latestOrder = 
SUMX(
VALUES('OrdersPayment'[ID])
,
var latestOrderDate =
 CALCULATE(MAX('OrdersPayment'[Date]))
return
CALCULATE(
 SUMX('OrdersPayment',
  IF('OrdersPayment'[Date] = latestOrderDate
   ,'OrdersPayment'[RemainingPayment]
   ,BLANK()
  )
 )
)
)

This is my sample data
Latest Order Remaining Payment - Sample Data.png

 

And this is the result

Latest Order Remaining Payment - Result.png

 

Hope this is what you are looking for

 

Regards

Hamburg - Germany
If I answer your question, please mark my post as solution, this will also help others.
I accept Kudos :-), If you find my post helpful.

Proud to be a Datanaut!
CTan42 Helper II
Helper II

Re: Remove Duplicates?

@TomMartens,

 

Once again thanks for the help. I'm really appreciate it. I had tried it in my dataset. It seems work, however my dataset is huge (1,000,000++rows). I have to find a way to make sure the measure is correct. Do u mind to explain your measure? I dont really understand it. Thanks.

 

Regards,

Chung.

Super User III
Super User III

Re: Remove Duplicates?

Hey,

 

sure.

 

At the heart of this meausre there are two nested SUMX('table', expression)

SUMX('table'
  ,SUMX('table',IF(...)
)

SUMX() is an table iterator function this means, that the expression is evaluated for each row of the table. The first table calculated by using the VALUES() function, VALUES always returns a one column sized table, this table is calculated by the column with all the IDs (the orders). Two things to know about this function is the fact , that it returns the unique values of the used column and the values returns the values in the current FILTERCONTEXT.

Implicity the value is implicitly used as filter for the the calculation of the expression.

 

The expression for the 1st SUMX is also a SUMX. Before the second expression is evaluated the MAX date is stored in the variable lastorderdate. This happens for each pass of the first iteration.

The 2nd table (1st parameter of the 2nd SUMX) is the table and is implicitly filtered by each ID. The 2nd expression checks if the date equals the value of the variable, if this is the case, the value "RemainingPayment" is used, otherwise the IF statement returns BLANK.

It is necessary to use SUMX for the calculation of the TOTAL row, where no ID is determining the FILTERCONTEXT.

 

I hope this explanation makes things a little clearer.

 

The function does not consider if there are more than 1 row with exactly the same lastorderdate (date and time), if this would be the case a rule has to be defined that describes which row has to be considered and the measure has to be adjusted accordingly.

 

Regards

Hamburg - Germany
If I answer your question, please mark my post as solution, this will also help others.
I accept Kudos :-), If you find my post helpful.

Proud to be a Datanaut!
CTan42 Helper II
Helper II

Re: Remove Duplicates?

@TomMartens

thanks for your explanation. It's great that you mentioned the condition at the last. As my data have more than 1 row with exactly same date. Hence do you have any idea to do that? 

 

Regards,

Chung. 

CTan42 Helper II
Helper II

Re: Remove Duplicates?

@TomMartens 

Data Updates:

In my dataset, there are 2 columns of Date - Order Date & Goods Received Date. So everything should calculated based on Goods Received Date. There is more than 1 row with exactly same date. And another question is some rows are blank as the goods are not received yet. 

Thanks

Helpful resources

Announcements
‘Better Together’ T-Shirt Contest – Winner Announced!

‘Better Together’ T-Shirt Contest – Winner Announced!

And the winner is...

Announcing the New Spanish Forum

Announcing the New Spanish Forum

Do you need help in Spanish? Check out our new Spanish community section.

April 2020 Community Highlights

April 2020 Community Highlights

Info on our Super Users, MBAS content and badges, and updates to our support articles. - Read the full Community Highlights.

MBAS Gallery 2020

MBAS Gallery 2020

Watch Microsoft Business Applications Summit sessions on-demand.

Top Solution Authors
Top Kudoed Authors