Solved: Relate data from one table to ranges from another

carlpaul153 · ‎08-25-2020

I have a table "characters" with two columns: "character" and "Unicode value" (in decimal).
I also imported the first table that appears here
In other words, for each range of values, indicate in the "Name" column the script or unicode language block (Latin, Greek, etc.).

What I want is to add to my "characters" table a "Name" column, which replicates the language block according to which unicode range the character belongs to.

I found a similar post here which explains that to do this you have to create relationships between the tables.

And another here which explains that to do this you have to create a function in a new query. I also noticed that the last comment in this post suggests that there is a much easier way to do it, but I don't understand it.

My question is, what would be the best way to solve it? With a function, or relating tables? I discovered power query just a few days ago, and was trying to fix my problems without using power pivot or power BI. However, if it is better for my situation I am willing to learn it.

In addition to telling me which of the two approaches is better, I would greatly appreciate if someone explains this procedure to me in a graphic and simple way. I'm just starting out and it's hard for me to understand.

Thank you very much.

lbendlin · ‎08-25-2020

Here's my approach:

You can do the list conversion in either table. For no good reason I did it only in the Characters table.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclHSUTKzUIrViVY6vBbINjI2B3PygGxDQwOl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Character = _t, Unicode = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Unicode", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Lookup",  (k)=>
Table.SelectRows(Unicode,
each (List.Contains(Expression.Evaluate("{" & Text.Replace([Decimal],"-","..") & "}"),k[Unicode]) = true ))),
    #"Expanded Name" = Table.ExpandTableColumn(#"Added Custom", "Lookup", {"Name"}, {"Name"})
in
    #"Expanded Name"

A couple comments.

Note that I changed the Unicode column of the Characters table to Whole number.

I am ignoring the Character column of the Characters table.

The interesting part happens in the AddColumn line. I employ a custom generator function that pulls in the current row context ("k") .

Then I concatenate the information from the Unicode table in a way that it represents the text of a list definition.

Next I EVAL that expression, converting it into a list.

The next step is to return all rows of the Unicode table where the Unicode value of the Characters table ( "k[Unicode]" ) is contained in the generated list. Luckily this always returns maximum one row 🙂

Lastly I expand the returned row and retrieve the lookup name of the Unicode range.

Note there is no "each" in front of the "(k)=>" function definition. This is intentional! When you edit this part Power Query will re-insert the "each" keyword, breaking the code. You will need to manually remove it again.

Here's the result:

View solution in original post

Jimmy801 · ‎08-26-2020

Hello @carlpaul153

you can Manipulate your Unicode table in order that for the decimal range columns you have 2 different columns with type number. Here the code for the query called "Unicode"

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMqgwAAJdEGXupqSjZKBraGQOpJ0SizOTFXwSSzLzlGJ1IOosIOrcQOoMjSx0jUxNgSywGl1DheDSgoKc1NzUvBKYBkOIwYZgg41MzXSNLYxhGhRcK0pS81JSU3QdlWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Range = _t, Decimal = _t, Type = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Decimal", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"Decimal.1", "Decimal.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Range", type text}, {"Decimal.1", Int64.Type}, {"Decimal.2", Int64.Type}, {"Type", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Decimal.1", "Start"}, {"Decimal.2", "End"}})
in
    #"Renamed Columns"

Now use your character table and add a column where you filter the unicode table on decimal ranges. The formula here does the trick. .. important is that the unicode-table is called "Unicode"

(add)=>
        let 
            FilterTable = Table.SelectRows(Unicode, each add[Unicode] >= [Start]  and  add[Unicode]<=[End]),
            ExtractType = try FilterTable[Type]{0} otherwise "Unicode not defined in Unicode table"
        in 
            ExtractType

here the complete code for the characters-table

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclHSUTKzUIrViVY6vBbINjI2B3PygGxDQwOl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Character = _t, Unicode = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Character", type text}, {"Unicode", Int64.Type}}),
    AddType = Table.AddColumn(#"Changed Type", "Type", (add)=>
        let 
            FilterTable = Table.SelectRows(Unicode, each add[Unicode] >= [Start]  and  add[Unicode]<=[End]),
            ExtractType = try FilterTable[Type]{0} otherwise "Unicode not defined in Unicode table"
        in 
            ExtractType)
in
    AddType

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

View solution in original post

lbendlin · ‎08-25-2020

I know you asked for a simple way but in reality this is pretty advanced.

Here's what I would do.

- Ingest the Unicode table

- replace the "-" in the first column with ".."

- use Expression.Evaluate to test each of your characters against the lists created in the previous step

- get the resulting name

Provide some sample character data if you would like more pointers.

carlpaul153 · ‎08-25-2020

Thanks for your reply. I thought it would be something simpler. I still don't understand how to add the column in a table using data from another table.

I leave some examples:

Table "Characters"

Character	Unicode
D	68
í	237
n	110

Table "Unicode"

Range Decimal Name

0x0000-0x007F	0-127	Basic Latin
0x0080-0x00FF	128-255	Latin-1 Supplement
0x0100-0x017F	256-383	Latin Extended-A

I want table "Character" as:

Character	Unicode	Name
D	68	Basic Latin
í	237	Latin-1 Supplement
n	110	Basic Latin

Jimmy801 · ‎08-26-2020

Hello @carlpaul153

you can Manipulate your Unicode table in order that for the decimal range columns you have 2 different columns with type number. Here the code for the query called "Unicode"

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMqgwAAJdEGXupqSjZKBraGQOpJ0SizOTFXwSSzLzlGJ1IOosIOrcQOoMjSx0jUxNgSywGl1DheDSgoKc1NzUvBKYBkOIwYZgg41MzXSNLYxhGhRcK0pS81JSU3QdlWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Range = _t, Decimal = _t, Type = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "Decimal", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"Decimal.1", "Decimal.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Range", type text}, {"Decimal.1", Int64.Type}, {"Decimal.2", Int64.Type}, {"Type", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Decimal.1", "Start"}, {"Decimal.2", "End"}})
in
    #"Renamed Columns"

Now use your character table and add a column where you filter the unicode table on decimal ranges. The formula here does the trick. .. important is that the unicode-table is called "Unicode"

(add)=>
        let 
            FilterTable = Table.SelectRows(Unicode, each add[Unicode] >= [Start]  and  add[Unicode]<=[End]),
            ExtractType = try FilterTable[Type]{0} otherwise "Unicode not defined in Unicode table"
        in 
            ExtractType

here the complete code for the characters-table

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclHSUTKzUIrViVY6vBbINjI2B3PygGxDQwOl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Character = _t, Unicode = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Character", type text}, {"Unicode", Int64.Type}}),
    AddType = Table.AddColumn(#"Changed Type", "Type", (add)=>
        let 
            FilterTable = Table.SelectRows(Unicode, each add[Unicode] >= [Start]  and  add[Unicode]<=[End]),
            ExtractType = try FilterTable[Type]{0} otherwise "Unicode not defined in Unicode table"
        in 
            ExtractType)
in
    AddType

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

lbendlin · ‎08-25-2020

Here's my approach:

You can do the list conversion in either table. For no good reason I did it only in the Characters table.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclHSUTKzUIrViVY6vBbINjI2B3PygGxDQwOl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Character = _t, Unicode = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Unicode", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Lookup",  (k)=>
Table.SelectRows(Unicode,
each (List.Contains(Expression.Evaluate("{" & Text.Replace([Decimal],"-","..") & "}"),k[Unicode]) = true ))),
    #"Expanded Name" = Table.ExpandTableColumn(#"Added Custom", "Lookup", {"Name"}, {"Name"})
in
    #"Expanded Name"

A couple comments.

Note that I changed the Unicode column of the Characters table to Whole number.

I am ignoring the Character column of the Characters table.

The interesting part happens in the AddColumn line. I employ a custom generator function that pulls in the current row context ("k") .

Then I concatenate the information from the Unicode table in a way that it represents the text of a list definition.

Next I EVAL that expression, converting it into a list.

The next step is to return all rows of the Unicode table where the Unicode value of the Characters table ( "k[Unicode]" ) is contained in the generated list. Luckily this always returns maximum one row 🙂

Lastly I expand the returned row and retrieve the lookup name of the Unicode range.

Note there is no "each" in front of the "(k)=>" function definition. This is intentional! When you edit this part Power Query will re-insert the "each" keyword, breaking the code. You will need to manually remove it again.

Here's the result:

carlpaul153 · ‎08-26-2020

I tried to replicate the code with my table, but I couldn't. Your first step I removed, because I guess you import the example from this forum thread.

It gives me an error in the penultimate step: Expression.Error: The name 'Unicode' wasn't recognized. Make sure it's spelled correctly.

I guess it has to do with the first step that I can't replicate as this is the only time unicode appears not referencing the column.

Here the code:

let
    Origen = Folder.Files("C:\Users\User\Desktop\myfile"),
    #"Filas filtradas" = Table.SelectRows(Origen, each [Extension] <> ".en" or Text.Contains([Name], ".es")),
    #"Personalizada agregada" = Table.AddColumn(#"Filas filtradas", "Personalizado", each Csv.Document([Content],[Delimiter="	",Encoding=65001])),
    #"Otras columnas quitadas" = Table.SelectColumns(#"Personalizada agregada",{"Personalizado", "Extension"}),
    #"Se expandió Personalizado" = Table.ExpandTableColumn(#"Otras columnas quitadas", "Personalizado", {"Column1"}, {"Personalizado.Column1"}),
    #"Filas alternas quitadas" = Table.AlternateRows(#"Se expandió Personalizado",0,999,1),
    SplitUpCharacters = Table.TransformColumns(#"Filas alternas quitadas",{{"Personalizado.Column1", Text.ToList}}),
    #"Expanded Column2" = Table.ExpandListColumn(SplitUpCharacters, "Personalizado.Column1"),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Column2",{{"Personalizado.Column1", type text}}),
    #"Filas agrupadas" = Table.Group(#"Changed Type", {"Personalizado.Column1", "Extension"}, {{"Recuento", each Table.RowCount(_), type number}}),
    #"Personalizada agregada1" = Table.AddColumn(#"Filas agrupadas", "Unicode", each Character.ToNumber([Personalizado.Column1])),

    #"Changed Type2" = Table.TransformColumnTypes(#"Personalizada agregada1",{{"Unicode", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Lookup",  (k)=>
    Table.SelectRows(Unicode, each (List.Contains(Expression.Evaluate("{" & Text.Replace([Decimal],"-","..") & "}"),k[Unicode]) = true ))),
    #"Expanded Name" = Table.ExpandTableColumn(#"Added Custom", "Lookup", {"Name"}, {"Name"})
in
    #"Expanded Name"

lbendlin · ‎08-26-2020

Yes, it's a bit unfortunate that you named the column in one table the same as the name of the other column. "k[Unicode]" refers to your column in the "Characters" table, "Unicode" to the Range lookup table.

Make sure you use exactly the same spelling. Power Query is case sensitive.

carlpaul153 · ‎08-25-2020

If someone doesn't have time for a detailed answer but knows which of the two approaches is correct it would help me as well. So I can go forward and investigate on my own.

Relate data from one table to ranges from another

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024