cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Sayth Regular Visitor
Regular Visitor

Return count of all distinct words in a column

I have a column of text at my work that contains questions. In these fields input anything can be typed. I want to better understand the sentiment of the words used so return a count for each unique word in the column. How can I acheive this?

 

If that is too hard is it possible to supply a list of 20-30 words and have the counts of those words in the columns returned?

 

Due to IT policy I cannot paste my work data but have supplied this dummy data as a basis.

 

Date	Text
1/01/2017	Lorem ipsum dolor sit amet, consectetur adipiscing elit.
1/01/2017	Itaque contra est, ac dicitis;
2/01/2017	Beatus sibi videtur esse moriens
3/01/2017	naturam affectum esse possit Beatus
2/01/2017	moriens. Quo igitur, inquit, modo?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Return count of all distinct words in a column

Try this (used an Enter Data query to start with)

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY6xDsIwDAV/xcpctRQGBgYkNiQW5qpDSNzKUhOX2OH7SQtDEfvdvdd1pm12bbPftUdTmRsnDECz5ACeJ04gpGADagWOo6BT1JzAeppJHMURcCKtTV/9hq5qnxkXR5MFlOJbB54cKclpxfcb/IJWs5SxB8GL/LqBIgiBE2GUVThshFj4ZAPYYSifytuVnlmWv5/a38i3VcM9M9BIpVABxWem8i6w57Pp+zc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Text = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Text", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Text", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Text.1", "Text.2", "Text.3", "Text.4", "Text.5", "Text.6", "Text.7", "Text.8"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.1", type text}, {"Text.2", type text}, {"Text.3", type text}, {"Text.4", type text}, {"Text.5", type text}, {"Text.6", type text}, {"Text.7", type text}, {"Text.8", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Date"}, "Attribute", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"})
in
    #"Removed Columns"

and this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY6xDsIwDAV/xcpctRQGBgYkNiQW5qpDSNzKUhOX2OH7SQtDEfvdvdd1pm12bbPftUdTmRsnDECz5ACeJ04gpGADagWOo6BT1JzAeppJHMURcCKtTV/9hq5qnxkXR5MFlOJbB54cKclpxfcb/IJWs5SxB8GL/LqBIgiBE2GUVThshFj4ZAPYYSifytuVnlmWv5/a38i3VcM9M9BIpVABxWem8i6w57Pp+zc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Text = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Text", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Text", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Text.1", "Text.2", "Text.3", "Text.4", "Text.5", "Text.6", "Text.7", "Text.8"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.1", type text}, {"Text.2", type text}, {"Text.3", type text}, {"Text.4", type text}, {"Text.5", type text}, {"Text.6", type text}, {"Text.7", type text}, {"Text.8", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Date"}, "Attribute", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns", {"Value"})
in
    #"Removed Duplicates"

Relate the two tables on Value and then you should be able to create a matrix and get your counts.

 

 

 

 

 

 


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!


1 REPLY 1
Super User
Super User

Re: Return count of all distinct words in a column

Try this (used an Enter Data query to start with)

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY6xDsIwDAV/xcpctRQGBgYkNiQW5qpDSNzKUhOX2OH7SQtDEfvdvdd1pm12bbPftUdTmRsnDECz5ACeJ04gpGADagWOo6BT1JzAeppJHMURcCKtTV/9hq5qnxkXR5MFlOJbB54cKclpxfcb/IJWs5SxB8GL/LqBIgiBE2GUVThshFj4ZAPYYSifytuVnlmWv5/a38i3VcM9M9BIpVABxWem8i6w57Pp+zc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Text = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Text", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Text", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Text.1", "Text.2", "Text.3", "Text.4", "Text.5", "Text.6", "Text.7", "Text.8"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.1", type text}, {"Text.2", type text}, {"Text.3", type text}, {"Text.4", type text}, {"Text.5", type text}, {"Text.6", type text}, {"Text.7", type text}, {"Text.8", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Date"}, "Attribute", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"})
in
    #"Removed Columns"

and this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY6xDsIwDAV/xcpctRQGBgYkNiQW5qpDSNzKUhOX2OH7SQtDEfvdvdd1pm12bbPftUdTmRsnDECz5ACeJ04gpGADagWOo6BT1JzAeppJHMURcCKtTV/9hq5qnxkXR5MFlOJbB54cKclpxfcb/IJWs5SxB8GL/LqBIgiBE2GUVThshFj4ZAPYYSifytuVnlmWv5/a38i3VcM9M9BIpVABxWem8i6w57Pp+zc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, Text = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Text", type text}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Text", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Text.1", "Text.2", "Text.3", "Text.4", "Text.5", "Text.6", "Text.7", "Text.8"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Text.1", type text}, {"Text.2", type text}, {"Text.3", type text}, {"Text.4", type text}, {"Text.5", type text}, {"Text.6", type text}, {"Text.7", type text}, {"Text.8", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Date"}, "Attribute", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Columns",{"Attribute"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns", {"Value"})
in
    #"Removed Duplicates"

Relate the two tables on Value and then you should be able to create a matrix and get your counts.

 

 

 

 

 

 


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!