cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Lenihan Member
Member

append data from two columns only for distinct count.

Hi, 

So I have a table of about 25 columns, 35K rows of data.

I want to take two of those columns, append the data to make once column of about 70K of data. Then, remove duplicates.

 

Would this need to be a new table? If I create a calculated column, I'm going to end up with 70K rows of data, where 35K of them have null value in the other 24 columns columns. Unless there is a programmable way to just do a distinct count of all those rows of data. The possibility of the data that I'm trying to manage is about 50K different possibilities, so I can't just search for a key word or anything. 

If I did it in excel I would do exactly that - copy / paste all the data from one column onto the bottom of the other column, then click remove duplicates. 

 

I've searched, but the other options/questions I've found are people trying to do distinct count where they can concatenate the columns of data into a new column with the same # of rows as the original table.  This is different.

1 ACCEPTED SOLUTION

Accepted Solutions
Lenihan Member
Member

Re: append data from two columns only for distinct count.

Thank you Imke for your patience and assistance. I was finally able to get it using the old copy/paste/paste/merge/remove duplicates way. I was able to get the chart right too - apparently I had the relationship pointing the wrong way. Once I changed the direction, it is now displaying the right values.

19 REPLIES 19
Super User
Super User

Re: append data from two columns only for distinct count.

I would copy your query, use an Append Query step to append your original query, use a calculated column in your query to concatenate your columns and then do a remove duplicates on that column. Although, I'm not exactly certain of your use case as it sounds a bit strange, I'm therefore not sure I have the solution correct.


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!


Lenihan Member
Member

Re: append data from two columns only for distinct count.

The thing though is I only need the number of distinct values in those columns.

 

For example - here is a data analogy

 

NameData1Name2Data3
a3c6
a4c7
b4a2
d5a2
c6a2
b2d4
a4c4
d5d6
d6c6
c2a4
a1b5
a5b6
c6d7
d7d5

 

I want to take all the values under Name, and Name2, and count how many unique. (except I'd have more than 4 different options of course, I have potentialy 40K unique values as they are configuration items in a CMDB). An item could appear under Name, or Name2, or could appear multiple times under both. 

If I append my data, I'm going to end up with double of the values under Name - but not necessary Name2 under Name. I essentially want to copy the full row of data in Name2 under Name.  The rest of the columns don't matter for what I need. I could create two copies of the query, delete all other columns, rename the column so they match in both new queries, then append them, but it seems like there should be a better way. 

Super User
Super User

Re: append data from two columns only for distinct count.

You can take this simple formula to do this:

 

List.Count(List.Distinct(List.Union(Table.ToColumns(Source))))

 

Where "Source" is the reference to the table to be analyzed.

 

It reads inside-out as in Excel:

Table.ToColumns: transfers every column of your table to a list and then combines all those lists into a big list (nested list of lists). 

List.Union: combines them 

List.Distinct: removes dups

List.Count: does what it says ;-)  

 

This is case-sensitive. Pls let me know if you want a case-insensitive count instead.

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




Lenihan Member
Member

Re: append data from two columns only for distinct count.

Thank you for the reply,

 

With this being just a number - where would this be created? It doesn't appear to be a new column since just a number, but when I try to create a new measure, it won't let me select any tables as the source, only existing measures I've already created.

 

I tried doing this formula too, based on a post from your earlier for someone else, 

List.Distinct(List.Union([CI Identifier], [Child CI Identifier]))

 

with CI Identifier and Child CI Identifier being the two columns I'm trying to merge and I was going to just then do a count on them, but I get this error:

 

Expression.Error: We cannot convert the value "CI00000085018" to type List.
Details:
    Value=CI00000085018
    Type=Type

 

In every row. the CI######## are the values within the rows of data. 

Super User
Super User

Re: append data from two columns only for distinct count.

Having difficulties understanding your request.

 

My formula actually returns just a single value (the number of distinct values in the whole table) and would be loaded as a table with one column and one row to your data model where it could be adressed by any DAX-function.

 

Do you need the number of distinct values in just some specific columns?

Or sth totally different? If so, pls post a clear picture of input-data and desired output-data.

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




Lenihan Member
Member

Re: append data from two columns only for distinct count.

Hi,

Sorry didn't mean to confuse. The solution you provided sounds like what I want. Since I was having challenges, i was trying to do it other ways too

 

So I clicked on New Table, and entered this:

 

Table = List.Count(List.Distinct(List.Union(Table.ToColumns('CI Relationship Report'))))

 

 

But I got this error:

The syntax for 'Count' is incorrect. (DAX(List.Count(List.Distinct(List.Union(Table.ToColumns('CI Relationship Report')))))).

Super User
Super User

Re: append data from two columns only for distinct count.

No prob :-)

 

You need to use this in the query-editor and not in the data model/tab.

 

Open the query editor -> move your mouse over the query-name in the left navigation pane (Queries...) -> rightclick mouse -> check "reference":

 

Then a new query will be created with one step in it: "Source" with the reference to your table. Click on the fx like shown in the picture will create a new step where you paste the formula:

 

PBI_CountAllDistinctValuesInTable.png

 

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries




Highlighted
Super User
Super User

Re: append data from two columns only for distinct count.

I think you are confusing Power Query (shaping) and Power Pivot (Modelling). New Table is power Pivot, List.Count is POwer query

 

http://exceleratorbi.com.au/shaping-modelling-power-bi/

 



* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.
Super User
Super User

Re: append data from two columns only for distinct count.

Hey Matt @MattAllington - congrats to your 1000 st post :-)

Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!

Imke Feldmann

How to integrate M-code into your solution -- Check out more PBI- learning resources here -- Performance Tipps for M-queries