Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi team,
I have a datset with 1 million rows, which are loaded into PowerBI by ODBC(MongoDB). Below is some sample data.
I want to extract each element in multi-vlaue column "C_DS" into a new table , and then stats the value count and create a pie cart by creating a new measurement.
Here is the two Dax scripts I use:
1. Extract each element
C_DS_label =
VAR SplitByCharacter = ","
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", CONCATENATEX(SampleData,SampleData[C_DS],",") ),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
RETURN
SUMMARIZE(Table0,[Word])
2. stats the value count
Count_C_DS =
var curr = SELECTEDVALUE(C_DS_label[Word])
return
calculate(COUNTROWS('SampleData'), FILTER(ALL('SampleData'[C_DS]), PATHCONTAINS(SUBSTITUTE ( SampleData[C_DS], ",", "|" ) , curr) ))
But I found that the first DAX script(extract each element) is running very very ....very slow. It has run for 3 hours for 1 million records and hasn't been ended yet.
can any expert help take a look and advice how to improve the DAX script?
Thanks,
Cherie
Solved! Go to Solution.
Great @Anonymous
Please use
C_DS_label =
DISTINCT (
SELECTCOLUMNS (
GENERATE (
FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
VAR NumofWords =
PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
VAR NumSeries =
GENERATESERIES ( 1, NumofWords )
RETURN
ADDCOLUMNS (
NumSeries,
"@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
)
),
"Word", [@Word]
)
)
Hi @Anonymous
Please try
C_DS_label =
SELECTCOLUMNS (
GENERATE (
FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
VAR NumofWords = PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
VAR NumSeries = GENERATESERIES ( 1, NumofWords )
RETURN
ADDCOLUMNS (
NumSeries,
"@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
)
),
"Word", [@Word]
)
Hi @tamerj1 ,
Thank you for the advice. I tried, and it's very fast to extract the values. But the extracted values looks like duplicated, and not unique . What I expect is to extract the disctinct unique values from the C_DS column.
Here is the actual result in powerBI Desktop.
Expected Result I want:
Word |
Profile_Criteria |
Keyword |
Faceted_Criteria |
Backgroud_Criteria |
Rating_Criteria |
Great @Anonymous
Please use
C_DS_label =
DISTINCT (
SELECTCOLUMNS (
GENERATE (
FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
VAR NumofWords =
PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
VAR NumSeries =
GENERATESERIES ( 1, NumofWords )
RETURN
ADDCOLUMNS (
NumSeries,
"@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
)
),
"Word", [@Word]
)
)
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
99 | |
77 | |
66 | |
54 |
User | Count |
---|---|
144 | |
104 | |
102 | |
87 | |
64 |