cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Abhaykumar
Microsoft
Microsoft

Getting distinct values for multiple columns

Is there a way to get distinct values for multiple columns? Specifically, I want to replicate the following SQL query into PowerBI to create a new table:

 

SELECT DISTINCT Col1, Col2, Col3 FROM TableA;

I can find the DISTINCT keyword, but it only supports one column. 

 

 

2 ACCEPTED SOLUTIONS
ankitpatira
Community Champion
Community Champion

@Abhaykumar You can also select individual columns in query editor and click Remove Duplicates.

 

Capture.PNG

View solution in original post

achinm45
Advocate IV
Advocate IV

Hi @Abhaykumar,

 

Use DAX and in DAX the function SUMMARIZE().

Summarize will pull out distinct values from columns.

 

So your resultset will be new table from Summarize

Table_Output =  Summarize(Table_IN,Col1,Col2,Col3)

 

I hope it helps !

 

BR,

Achin

View solution in original post

26 REPLIES 26
BrianVT
Resolver I
Resolver I

Hi @ImkeF 

 

I followed your approach in this thread to create my first manual query in Power BI.  I wanted to create this list of distinct values from two different columns from two tables, so that I could use the newly created column as the relational data point between those two tables/columns.  However, I'm getting the ambiguity error when trying to relate the second column to it.  Is that not the purpose of creating this union list?  I also tried creating a table that did the same thing, and the ambiguity error popped up there as well.

 

Is there a way to create a stand alone column that is the list of unique values from two columns for the purpose of relating those two columns in one-to-many relationships?

Hi @BrianVT 

sorry, but I cannot follow.

Could you possibly post some sample data?

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@ImkeF 

 

 = List.Distinct(List.Union({#"Obligations by FY"[Project Number],#"CJI3 Actual Cost Data"[Project Number]}))

 

So, that's the query I created.  In my model, I then try to create a relationship between this new column, and the two different "Project Number" columns, so that the new "Project Number" column can be used for filtering purposes in visuals.

 

I can create one relationship, but upon creating the second relationship, I get the error about ambiguous paths.  So, I can't do what I wanted with the new query/column. 

I guess I'll post a new question.  Was just trying to figure out a quick way to make the "one" side of my relationships for two tables that is dynamic, vs. manually joining them into a table to bring in on my own and having to manually refresh that table with new unique values all the time.  Creating the table within Power BI by joining the unique values from the two tables seemed like the easiest way to do it, which landed me here.

achinm45
Advocate IV
Advocate IV

Hi @Abhaykumar,

 

Use DAX and in DAX the function SUMMARIZE().

Summarize will pull out distinct values from columns.

 

So your resultset will be new table from Summarize

Table_Output =  Summarize(Table_IN,Col1,Col2,Col3)

 

I hope it helps !

 

BR,

Achin

View solution in original post

Hi achinm45,

 

Receiving the following error when using your suggested function (below; names anonymized):

"The expression refers to multiple columns. Multiple columns cannot be converted to a scalar value."

 

SUMMARIZE(table,[col1],[col2],[col3])
 
Thank You!

Despite the existence of many solutions in Power BI (Query Editor, DAX, etc), this seams to be the most elegante solution.
Thants for your hint @achinm45 

I am trying to create a table that shows the person's unique code, department and name.  I am using the Summarize function and am getting duplicates.  How do I exclude the duplicate that does not get associated with a department?

 

Table = Summarize('MO YTD Jan - June Revenue','MO YTD Jan - June Revenue'[Employee Code],'MO YTD Jan - June Revenue'[Department Name])
 
Distinct.JPG
 
Thank you!
 
Kim

If you do it in DAX, just be aware that the full table must be loaded into the data model. Doing it in the query-editor using M would (mostly) fold back to the server, meaning that the SQL-server does the extraction and returns only the results to PBI to be loaded.

 

So if you need to load the full table anyway, DAX would probably be faster.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

You can also wrap a DISTINCT(SELECTCOLUMNS(table, "col1name", Col1fromtable, table, "col2name", Col2fromtable)

ToFrai
Frequent Visitor

How can I copy distinct values from multiple columns into ONE single colum?

 

The shown above always created multiple columns into which the code copies the distinct values from each column.

 

Thanks!

This formula will create a list of distinct values in the query editor (where "Source" is the name of the table/previous step and column names in square brackets):

 

List.Distinct(List.Union({Source[Col1], Source[Col2], Source[Col3]})) 

Not sure if you want this as a standalone query or merge the resulting list with some other table?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi, 

I'm trying this solution to combine two columns of sales order numbers from two separate queries and get an error:

"Expression.SyntaxError: Invalid identifier"

 

here is my code:

= list.distinct(list.union({"ZSD026 - Sales Analysis Report"[#"Sales Order No"],"ZSD029 - Invoice Report Material Level"[#"Sales Order No"}))

Hi @jnorth78 ,

the M-language is case sensitive.

 



 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi ,

 

I'm trying this M code and getting an error "Expression.SyntaxError: Invalid identiffier"

 

Here is the code I'm entering to try and combine sales order number lists from two separate queries:

 

= list.distinct(list.union({"ZSD026 - Sales Analysis Report"[#"Sales Order No"],"ZSD029 - Invoice Report Material Level"[#"Sales Order No"}))

ToFrai
Frequent Visitor

Hi Imke,

thanks for the hint - where do I enter the code in the querie editor? Do I have to create a new table first, or add a new step in an existing query?

 

Problem is that the querie editor shows source tables only, not the ones I caclulated.

Isn't there a Dax formula that I can use in a new table that says something like "copy all (distinct) values from Table A Column 1, and then below all (distinct) valies from Table 2 Column 1, ...."?

 

BR | Tobias

You CAN do it in your current query:

Add step manually by clicking the fx-sign:

 image.png

 

But of course, this only works for tables that exist in the query editor.

If you've created them in the data view with DAX, then you won't have access to them there.

 

You might consider opening a new thread and specifically request a DAX-solution to enhance visibility of your request.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ankitpatira
Community Champion
Community Champion

@Abhaykumar You can also select individual columns in query editor and click Remove Duplicates.

 

Capture.PNG

View solution in original post

Anonymous
Not applicable

@ankitpatira if he reomves the duplicates will it not remove the entire row ? which will ultimately result in wrong data ?

i am new to powerBI

My anser was rubbish (it would create a list of distinct values of the selected columns and returns it as a list if applied as a standalone-command (and not within a Table.AddColumn-command where will be performed as a function acting on record-level)).

 

Do as @ankitpatira said in the query-editor: First select the columns with your mouse and then click as shown. No need to edit a formula then 🙂

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
UG GA Amplification 768x460.png

Launching new user group features

Learn how to create your own user groups today!

Power BI October Update 2021.jpg

Power BI Release

Click here to read more about the October 2021 Release!

Community Connections 768x460.jpg

Community & How To Videos

Check out the new Power Platform Community Connections gallery!

Teds Dev Camp Oct. 2021 768x460.jpg

Power BI Dev Camp - October 28th, 2021

Mark your calendars and join us for our next Power BI Dev Camp!

Users online (889)