Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
samwong
Frequent Visitor

Replacing multiple values in a cell

Trying to parse a JSON object where one of the values is a list of multiple integers.  In a separate table (the lookup table),  I have a list of integers which map to a text string.  For example (this data is from themoviedb.org😞

Data Table

 

Title     genre_ids
Wonder Woman     28,12,14,878

 Lookup Table

genre.id     genre.name
28     Action
12     Adventure
16     Animation
35     Comedy
...

 

Question

In the Data Table, how would I make a new column (e.g. genre_fullname) where the contents are the string name taken from the look up table?  In other words, how can I make this?

Title     genre_ids     genre_fullname
Wonder Woman     28,12,14,878     Action, Adventure, Fantasy, Science Fiction

Bonus Question

Is this type of multiple data within a cell the right way to go about this?  In my data analysis with Python I strive for tidy data, with one value per cell (so the above row would actually be 4 rows).  I see thatPower BI has an option to expand lists into individual rows rather than the comma separated values like I have above.  Which approach is the *right* way?  (End goal: Have a list of movies with their genres listed next to them, and allow filtering the list by genre)

2 ACCEPTED SOLUTIONS
v-caliao-msft
Employee
Employee

@samwong,

 

You can do this by edit query, and the final power query looke like:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCs/PS0ktUgjPz03MU9JRMrLQMTTSMTTRsTC3UIrVgcv7gmWBUsZmOkA1xqZKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Title = _t, Genre_ids = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Genre_ids", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Genre_ids.1", "Genre_ids.2", "Genre_ids.3", "Genre_ids.4"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Split Column by Delimiter", {"Title"}, "Attribute", "Value"),
    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Columns",{"Value"},Lookup,{"Genre_ids"},"NewColumn",JoinKind.LeftOuter),
    #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Genre.name"}, {"NewColumn.Genre.name"}),
    #"Grouped Rows" = Table.Group(#"Expanded NewColumn", {"Title"}, {{"Genre_ids", each Text.Combine([Value],","), type text}, {"Genre_fullname", each Text.Combine([NewColumn.Genre.name],","), type text}})
in
    #"Grouped Rows"

Capture.PNG

 

 

Regards,

Charlie Liao

View solution in original post

A better alternative is to split the cells to nested lists and expand these.

When splitting into columns, you get code with hard coded column names which will not be adjusted if you have more than 4 gneres (in this case).

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCs/PS0ktUgjPz03MU9JRMrLQMTTSMTTRsTC3UIrVgcv7gmWBUsZmOkA1xqZKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Title = _t, Genre_ids = _t]),
    Splitted = Table.TransformColumns(Source,{{"Genre_ids", each Text.Split(_,",")}}),
    Expanded1 = Table.ExpandListColumn(Splitted, "Genre_ids"),
    Merged = Table.NestedJoin(Expanded1,{"Genre_ids"},Lookup,{"Genre_ids"},"NewColumn",JoinKind.LeftOuter),
    Expanded2 = Table.ExpandTableColumn(Merged, "NewColumn", {"Genre.name"}, {"Genre.name"})
in
    Expanded2

 

I also left out the last "group by" step as I understand this is not desired.

Specializing in Power Query Formula Language (M)

View solution in original post

3 REPLIES 3
samwong
Frequent Visitor

Thank you to @v-caliao-msft and @MarcelBeug for your help!  I combined both of your work into what I needed.

 

For anyone interested, here is code that creates two tables, a main (data) table and a lookup table that demonstrates what I originally set out to accomplish.

 

Main Table

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSsssKi5RKMkoSk1V0lEy1FEw0lEwVorViVZKLUvNAwoB+SY6CmY6ChZg0fyUFIg6Yx0FU6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Group = _t, Numbers = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Group", type text}, {"Numbers", type text}}),
    #"Split on commas" = Table.TransformColumns(Source, {{"Numbers", each Text.Split(_, ", ")}}),
    #"Expand Split" = Table.ExpandListColumn(#"Split on commas", "Numbers"),
    #"Lookup Values" = Table.NestedJoin(#"Expand Split", {"Numbers"}, #"Lookup Table", {"numeral"}, "number as word", JoinKind.LeftOuter),
    #"Expand Lookup" = Table.ExpandTableColumn(#"Lookup Values", "number as word", {"word"}, {"number as word"}),
    #"Combine Rows" = Table.Group(#"Expand Lookup", {"Group"}, { {"Numbers", each Text.Combine([Numbers], ", "), type text}, {"numbers as word", each Text.Combine([number as word], ", "), type text} })
in
    #"Combine Rows"

Lookup Table

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Lc09DsAgCIDRuzA71P73LMYRK4sm1toev4DdvhcIOAcWDOSE4I2Dkbs+WXuSjgX7ZGaFfBfFIqDWJyvjold7k8aGSbWzkM5YVQcr0f/IDnJd9vwH", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [numeral = _t, word = _t])
in
    Source

 

v-caliao-msft
Employee
Employee

@samwong,

 

You can do this by edit query, and the final power query looke like:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCs/PS0ktUgjPz03MU9JRMrLQMTTSMTTRsTC3UIrVgcv7gmWBUsZmOkA1xqZKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Title = _t, Genre_ids = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Genre_ids", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Genre_ids.1", "Genre_ids.2", "Genre_ids.3", "Genre_ids.4"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Split Column by Delimiter", {"Title"}, "Attribute", "Value"),
    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Columns",{"Value"},Lookup,{"Genre_ids"},"NewColumn",JoinKind.LeftOuter),
    #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Genre.name"}, {"NewColumn.Genre.name"}),
    #"Grouped Rows" = Table.Group(#"Expanded NewColumn", {"Title"}, {{"Genre_ids", each Text.Combine([Value],","), type text}, {"Genre_fullname", each Text.Combine([NewColumn.Genre.name],","), type text}})
in
    #"Grouped Rows"

Capture.PNG

 

 

Regards,

Charlie Liao

A better alternative is to split the cells to nested lists and expand these.

When splitting into columns, you get code with hard coded column names which will not be adjusted if you have more than 4 gneres (in this case).

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCs/PS0ktUgjPz03MU9JRMrLQMTTSMTTRsTC3UIrVgcv7gmWBUsZmOkA1xqZKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Title = _t, Genre_ids = _t]),
    Splitted = Table.TransformColumns(Source,{{"Genre_ids", each Text.Split(_,",")}}),
    Expanded1 = Table.ExpandListColumn(Splitted, "Genre_ids"),
    Merged = Table.NestedJoin(Expanded1,{"Genre_ids"},Lookup,{"Genre_ids"},"NewColumn",JoinKind.LeftOuter),
    Expanded2 = Table.ExpandTableColumn(Merged, "NewColumn", {"Genre.name"}, {"Genre.name"})
in
    Expanded2

 

I also left out the last "group by" step as I understand this is not desired.

Specializing in Power Query Formula Language (M)

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.