Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Power Query: Extract distinct values from column as new query

Hi,

 

I have a fairly large query with 1,5 million rows, which is growing by around 5000 rows per day. From that query I want to extract the distinct values from one specific column and create a new table with those values, so I can use it as a dimension and for forming relationships in my model.

 

Right now, I have referenced the first query, deleted all the columns besides the one I need, and then done a "Remove Duplicates", with a result of two distinct text values. This works, but at query refresh, this simple table containing only two text values as a final result will load all 1,5 million rows first, and then clean it up.

 

Is there a more efficient way to complete the same task in Power Query, so I can reduce refresh time?

1 ACCEPTED SOLUTION
v-shex-msft
Community Support
Community Support

Hi @Anonymous,

 

You can try to create a blank query to reference original source, then use list.distinct to remove duplicate records.

 

let
    DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName])
in
    DistinctSource

 

Sheet2(22582 rows) -> Date(953 rows)

9.PNG

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

View solution in original post

13 REPLIES 13
Stephen_Sink
Frequent Visitor

I struggled for a while with this becuase there is some key puntuation missing from this solution. Here's what worked for me - I am refrenceing another query

= List.Distinct(#"QueryName"[ColumnName])



Key things missing was having the # and "" in the right spot

HairyDrumroll
Advocate I
Advocate I

In the Power Query editor:

Right click on your existing query and choose "Reference".

Use "Choose Columns" to select only the column which you want your unique values generated from.

Right click on the column and choose "Remove Duplicates"

I like this solution because it allows you to continue to use table tools and create new custom column off the de-duplicated list. Power BI Guy's solution may be more effecient however.

This is the users current solution, however this is inefficient as it loads the entire table and then performs the transformations.

 

Current best solution is:

 

let
    DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName])
in
    DistinctSource

Your code is slightly wrong. I've edited it below 🙂

 

let
    DistinctSource = List.Distinct(Sheet2,"Date")// List.Distinct(QueryName,"ColumnName")
in
    DistinctSource

 

Thanks! This is just the alternative I was looking for.

Anonymous
Not applicable

Hello

I faced the same problem you asked for and i am quite new to power bi and mostly know its basics.

what i did was i created a new table in power bi desktop and  used the formula

 

NewTable = DISTINCT(Tablename [field])
 
and it worked.
v-shex-msft
Community Support
Community Support

Hi @Anonymous,

 

You can try to create a blank query to reference original source, then use list.distinct to remove duplicate records.

 

let
    DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName])
in
    DistinctSource

 

Sheet2(22582 rows) -> Date(953 rows)

9.PNG

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

This works. But the query still refreshes all the 1.5 million records! So it does not solve the question.

That means you still have to wait for the query to finish once more. This is made like this to "make shure" Power-Bi have all the data once more pulled from the souse. Quite annoying actually.    

I have a question for this. I am trying to do this myself:
you write // List.Distinct(QueryName[ColumnName]), is this the new query you're working in?

If not, what is it?

Hi @BiBra,

 

Nope, it just means the description of my formula.

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

@v-shex-msft Okay, however the code does not work for me

I get this:

fault2.PNG

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.