Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hello Community,
this is my first time asking a question here! 🙂
I want to find every lower and upper character in a string. To achieve this, I use the following DAX statement:
DEFINE
VAR Repl_Char = FORMAT ("Adma Sandler","STRING")
VAR Mylen = LEN ( Repl_Char )
VAR mytable =
ADDCOLUMNS (
GENERATESERIES ( 1, mylen ), "Original", MID ( Repl_Char, [Value], 1 ))
VAR Result_Uni =
ADDCOLUMNS (
SELECTCOLUMNS (
ADDCOLUMNS ( mytable, "uni_char", UNICODE ( [Original] ) ),
"Number", [Value],
"Original",[Original],
"Unicode", [uni_char]
),
"uppercase_lowercase",
IF (
[Unicode] >= 97
&& [Unicode] <= 122,
"lower",
IF ( [Unicode] >= 65 && [Unicode] <= 90, "upper","non alpha" )
)
)
EVALUATE
Result_Uni
If I execute this I will receive the follow:
Number | Original | Unicode | uppercase_lowercase |
1 | A | 65 | upper |
2 | d | 100 | lower |
3 | m | 109 | lower |
4 | A | 65 | upper |
5 | 32 | non alpha | |
6 | S | 83 | upper |
7 | A | 65 | upper |
8 | n | 110 | lower |
9 | d | 100 | lower |
10 | l | 108 | lower |
11 | e | 101 | lower |
12 | r | 114 | lower |
Question:
1. Could someone help to understand why this situation is happend?
2. It would be really helpful if someone has an idea to solve this.
BTW: I only have the option to use DAX for this challenge. 😉
Many thanks
Solved! Go to Solution.
I'm pretty sure this has to do with case-insensitive column compression as I explain here:
While I don't think there's a way to make the [Original] column as you expect, you should be able to fix the other columns by immediately converting to Unicode like this:
DEFINE
VAR Repl_Char = FORMAT ( "Adma Sandler", "STRING" )
VAR Mylen = LEN ( Repl_Char )
VAR mytable =
ADDCOLUMNS (
GENERATESERIES ( 1, mylen ),
"Unicode", UNICODE ( MID ( Repl_Char, [Value], 1 ) )
)
VAR Result_Uni =
SELECTCOLUMNS (
mytable,
"Number", [Value],
"Original", UNICHAR ( [Unicode] ),
"Unicode", [Unicode],
"uppercase_lowercase",
IF (
[Unicode] >= 97 && [Unicode] <= 122,
"lower",
IF ( [Unicode] >= 65 && [Unicode] <= 90, "upper", "non alpha" )
)
)
EVALUATE
Result_Uni
I'm pretty sure this has to do with case-insensitive column compression as I explain here:
While I don't think there's a way to make the [Original] column as you expect, you should be able to fix the other columns by immediately converting to Unicode like this:
DEFINE
VAR Repl_Char = FORMAT ( "Adma Sandler", "STRING" )
VAR Mylen = LEN ( Repl_Char )
VAR mytable =
ADDCOLUMNS (
GENERATESERIES ( 1, mylen ),
"Unicode", UNICODE ( MID ( Repl_Char, [Value], 1 ) )
)
VAR Result_Uni =
SELECTCOLUMNS (
mytable,
"Number", [Value],
"Original", UNICHAR ( [Unicode] ),
"Unicode", [Unicode],
"uppercase_lowercase",
IF (
[Unicode] >= 97 && [Unicode] <= 122,
"lower",
IF ( [Unicode] >= 65 && [Unicode] <= 90, "upper", "non alpha" )
)
)
EVALUATE
Result_Uni
Hello @AlexisOlson, many thanks for your explanation! 🙂 I now understand why my approach to identify the characters did not work! 😉 Your solution will save my day!
@Eastwood , Why is there a problem with the other A = 65?
@amitchandak many thx for your reply. The search string "Adma" has the last lower character "a". But the Unicode are referenced to an upper character.
User | Count |
---|---|
57 | |
22 | |
21 | |
19 | |
16 |
User | Count |
---|---|
87 | |
87 | |
52 | |
37 | |
23 |