Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Kumshan450
Helper III
Helper III

Create new table by filtering column value

Hi,
  
I have created a calculated column which will count rows that are duplicate.
 
Count = CALCULATE(countrows(Receiptdump),ALLEXCEPT(Receiptdump,Receiptdump[RECEIPT_NO]))
 
Receipt noBPIS NOBPIS DateMapping status Count
C00935218006621013x30-01-2018MAPPED3
C00935218006621013x30-01-2018MAPPED3
C00935218006621013x30-01-2018MAPPED3
C009352180066231014x29-01-2018MAPPED2
C009352180066231014x29-01-2018MAPPED2
C009352180066251015x28-01-2018MAPPED1
C009352180066261016x28-01-2018MAPPED1
 
After a lot of googling, I have found that we cannot delete any records in power pivot. 
 
So I wanted to create a new table where Count =1. This will throw me distinct values of receipt no.
 
The objective is to create a relationship with source table by receipt no. 
 
Receipt noBPIS NOBPIS DateMapping status Count
C009352180066251015x28-01-2018MAPPED1
C009352180066261016x28-01-2018MAPPED1
1 ACCEPTED SOLUTION

Hi I Finally got the answer, 

 

I created the new table by applying the formula using calculate table function. 

 

Distinctreceiptdump = CALCULATETABLE(Receiptdump,filter(Receiptdump,Receiptdump[Count]=1))

View solution in original post

15 REPLIES 15
Anonymous
Not applicable

Hi @Kumshan450,

 

If what you want to acheive is a table which contains a set of all the distinct receipt numbers you can do the following

 

Power Pivot

 

 

Power BI

 

If implementing this in Power BI this is slightly easier. You can follow the steps above replacing Step 1 with creating a calculated table and inputting the measure.

 

Hope this helps.

 

Kris

Hi, thanks for the reply.

 

Hi, I want to mention that there are also samples where receipt no is same but the BPIS no in the table are different. 
 
Receipt noBPIS NOBPIS DateMapping status Count
C009352180066251015x28-01-2018MAPPED1
C009352180066251016x28-01-2018MAPPED1

So if I apply distinct (table) I will not be able to get the distinct table. 
 
And if I apply distinct (table. ([receipt no]) I will get the distinct table but what about other columns how I will map the BPIS No and date to the distinct table. 
 
Please suggest. 
Anonymous
Not applicable

Hi @Kumshan450,

 

I see. With the table you have posted above you will not be able to use this table to create a relationship on Receipt no as you have multiple instances of the same receipt number. It will help if you can explain why you need to create the relationship and then I will be able to advise, as I can see multiple ways of achiving multiple goals here.

 

Thanks!

Kris

Anonymous
Not applicable

Hi @Kumshan450,

 

I'll talk generally - The easiest way create the table that you desire would be to use the query editor. Load the source table again (as you would have to initially load this to your model). In the query editing mode highlight the 3 columns that you wish to keep distinct rows (hold Ctrl whn clicking each one to highlight multiple) and select Remove Rows > Duplicate Values as shown below

 

Capture44.PNG

 

This will give you a table of all distinct Receipt no/BPIS NO/BPIS Dates.

 

Let me know how you get on.

 

Cheers,

Kris

Hi,

 

I am not using power query to remove duplicates as

 

Removing duplicates in power query subsequently slows the process of loading the query in power pivot table. 

 

 

Anonymous
Not applicable

Hi @Kumshan450,

 

If you only wish to use DAX and you are using Power Pivot (and not Power BI), the only way I can see to do this would be to

 

  • Create a column that contains a concatenation of the columns Receipt no, BPIS NO and BPIS Date.
  • You can that follow the steps from Step 1 in my original post to get a calcualted table that contains a distinct list of the created concat column.
  • From here create a relationship between the 2 tables and use the DAX formulas =RELATED([Receipt no]), =RELATED([BPIS NO]) and = RELATED([BPIS Date]) to build the other 3 columns.
  • You can then hide the concatenated columns from the model by right clicking these in Diagram View and selecting Hide from Client Tools.

Let me know if this suffices.

 

Thanks,

Kris

Yes i will check 

 

Is there any I can create a new table from the existing table in power pivot table data model with count value =1. This will definitely give me an accurate distinct table. 

using dax

Hi I Finally got the answer, 

 

I created the new table by applying the formula using calculate table function. 

 

Distinctreceiptdump = CALCULATETABLE(Receiptdump,filter(Receiptdump,Receiptdump[Count]=1))

Anonymous
Not applicable

Hi @Kumshan450,

 

Perfect! I see that you must be using Power BI then and not Power Pivot?

 

Congrats and Kudos,

 

Kris

Yes i am using power BI

Anonymous
Not applicable

Hi @Kumshan450,

 

I am not sure you are able to create a calculated table of multiple columns in Power Pivot. In Power BI you can create a calculated talble to do this using the DAX formula

 

Table = SUMMARIZE(Receiptdump,Receiptdump[Receipt no],Receiptdump[BPIS NO],Receiptdump[BPIS Date])

 

Using Power Pivot and DAX I'd suggest trying my method above. Otherwise maybe biting the bullet and creating this in the query editor. How many rows of data are we talking that causes the refresh to take so long?

 

Best,

Kris

Hi, It will not filter out the below sample which is also a duplicate

 

Hi, I want to mention that there are also samples where receipt no is same but the BPIS no in the table are different. 
 
Receipt noBPIS NOBPIS DateMapping status Count
C009352180066251015x28-01-2018MAPPED2
C009352180066251016x28-01-2018MAPPED2

 

However, I applied the calculate table function which is giving me distinct values. 

 

Distinctreceiptdump = CALCULATETABLE(Receiptdump,filter(Receiptdump,Receiptdump[Count]=1))

 

Why am I totally against removing duplicates in power query stage? Removing duplicate step dont fit large volume of data. I am handlling around 20 lakh rows of data. IF suppose i add additional months and refresh OH god I have to wait for months to load the query in data model. 

Its 20 lakh rows of data.

 

Also i dont want unique combinations. I want unique receipt no. 

 

IF i apply the formula 

Table = SUMMARIZE(Receiptdump,Receiptdump[Receipt no],Receiptdump[BPIS NO],Receiptdump[BPIS Date])

 

IT will show the unique combinations of receipt no|Receipdate|BPISno.

 

I want to remove all the duplicate receipt no and concentrate only on the unique receipt no. 

 

I have mentioned in the post that there are also samples where receipt no is same but the BPIS no and dates are different. I don't want to filter this data. I want to remove the same. 

Its 20 lakh rows of data.

 

Also i dont want unique combinations. I want unique receipt no. 

 

IF i apply the formula 

Table = SUMMARIZE(Receiptdump,Receiptdump[Receipt no],Receiptdump[BPIS NO],Receiptdump[BPIS Date])

 

IT will show the unique combinations of receipt no|Receipdate|BPISno.

 

I want to remove all the duplicate receipt no and concentrate only on the unique receipt no. 

 

I have mentioned in the post that there are also samples where receipt no are same but the BPIS no and dates are different. I dont want to filter this data. I want to remove the same. 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.