Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more.
Get startedGrow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.
Hi,
I tried creating small R script that is able to parse incorrect JSON column. The data is poor quality as it lacks quotations. Here is my code:
# 'dataset' holds the input data for this script pattern <- "(:)(?!null)(?!true)(?!false)(?!\"[a-z0-9A-Z.:-[[:space:]]]+\")([a-z0-9A-Z.:-[[:space:]]]+)(,|})" replacer <- function(x) { gsub(pattern, "\\1_TEST_\\2_TEST_\\3", as.character(x), perl = TRUE) } output <- within(dataset, { PARSED=replacer(dataset$WORKSTATUSES) } )
What is happening the regex matches incorrect data in R runner but doesn't in Power BI. The problem are spaces. I tried it all; \\s, [:blank:], [[:blank:]], [:space:], [[:space:]] as well as inserting the space character itself into my patter. Some gave errors, some did nothing (regex must had not match). Can you tell me how to do it properly?
Solved! Go to Solution.
It turned out the problem is my script didn't handle all cases of string-type values that occure in JSON I parse. Also some cells contain empty strings (not null) and therefore aren't valid for parsing (produce Errors as well). Filtering empty rows out and then filtering rows with error let me investigate what the problem really was.
Hi @Anonymous
First, could you show me the error in Power BI when running the R script?
What is your purpose to use this code?
Some requiremnets can be directly achieved by Power Query, if your requirement is clear, we can make a test for it.
Hi,
What I needed was parsing improper JSON from mongoDB. JSON I get is missing quotations around keys and some of values. I managed to work this out through the weekend and the code that works looks like this:
'dataset' holds the input data for this script pattern_key <- "(\"?)([a-zA-Z_-]+)(\"?:)" pattern_value <- "(:\"?)(?!null|true|false)([a-zA-Z0-9\\s,-_:.]+)(,|})" key_replacer <- function(x) { gsub(pattern_key, "\"\\2\":", x, perl = TRUE) } value_replacer <- function(x) { gsub(pattern_value, "\\1\"\\2\"\\3", x, perl = TRUE) } output <- within(dataset, { PARSED=value_replacer(key_replacer(dataset$WORKSTATUSES)) } )
Although this may lack some characters that should be treated as part of keys or values and as the result some quotations aren't being inserted in the right places. The error I get now says there are some unexpected characters at the end of JSON file. What's interesting I am able to parse JSON with my R script to new column, then parse JSON into Lists, further into separate rows and then expand as columns and it works fine in Query Editor. Once I click 'Apply changes' on the yellow bar in the main window it loads the data from remote server, tries to bend them to my needs and then this error occures. So I managed to make some progress in this area but it still doesn't work.
It turned out the problem is my script didn't handle all cases of string-type values that occure in JSON I parse. Also some cells contain empty strings (not null) and therefore aren't valid for parsing (produce Errors as well). Filtering empty rows out and then filtering rows with error let me investigate what the problem really was.
User | Count |
---|---|
80 | |
74 | |
63 | |
61 | |
45 |
User | Count |
---|---|
108 | |
98 | |
89 | |
82 | |
61 |