Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I am getting the above error when I parse a text column that contains JSON data using PowerQuery. The methodology I am following is the same as outlined here (Guy in a Cube). I have 17000 rows of data (unparsed JSON data) in text format.
I am using an ODBC connection to a Vertica database. At the time I get the error, memory usage is approximately 76% - so not maxed out. However, if I reduce the number of rows o I attempt to parse from 17K to 2K, the error disappears. If I increase the number of rows to 4K, the error returns.
Additional background
1) I am using the latest Sept 2018 release of PBI desktop.
2) I am using 64 bit PBI
3) I have restarted my PC
4) I have turned off all the preview features
5) I have also tried reducing the columns of data returned by parsing the JSON
Here is the M-code for the PowerQuery (with the 2K row filter).
let Source = Odbc.DataSource("dsn=VerticaTestCluster", [HierarchicalNavigation=true]), AT_Vertica_Database = Source{[Name="AT_Vertica",Kind="Database"]}[Data], CXSocial_Schema = AT_Vertica_Database{[Name="CXSocial",Kind="Schema"]}[Data], flex_op_json_mention_view_View = CXSocial_Schema{[Name="flex_op_json_mention_view",Kind="View"]}[Data], #"Show all rows" = Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"}), #"Remove rows with no JSON" = Table.SelectRows(#"Show all rows", each [complete_json] <> null), #"Added Index" = Table.AddIndexColumn(#"Remove rows with no JSON", "Index", 1, 1), #"Filtered Rows" = Table.SelectRows(#"Added Index", each [Index] < 4000), #"Parsed JSON" = Table.TransformColumns(#"Filtered Rows",{{"complete_json", Json.Document}}), #"Expanded complete_json" = Table.ExpandRecordColumn(#"Parsed JSON", "complete_json", {"message", "tags"}, {"complete_json.message", "complete_json.tags"}), #"Expanded complete_json.message" = Table.ExpandRecordColumn(#"Expanded complete_json", "complete_json.message", {"title", "language", "sentiment", "type"}, {"complete_json.message.title", "complete_json.message.language", "complete_json.message.sentiment", "complete_json.message.type"}), #"Expanded complete_json.tags" = Table.ExpandListColumn(#"Expanded complete_json.message", "complete_json.tags") in #"Expanded complete_json.tags"
Similar posts that don't provide an obvious solution in my case (e.g up grade PBI, just wait it may fix itself etc.
Can anyone suggest a way forward or solution?
Solved! Go to Solution.
@Anonymous
Interesting! What happens when you try to move the buffer up the stream to the first table like so
#"Show all rows" = Table.Buffer(Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"})),
N -
I am getting the above error when I attempt to parse JSON data contained in a text column, using PowerQuery. The methodology I am following is the same as outlined here (Guy in a Cube). I am starting with about 17000 rows of data (unparsed JSON data). The data is comming from a Vertica via an ODBC connection
Total memory ultilized on my PC is about 74% just prior to getting the error (see screenshot).
I have also
If I add an index column ( a row number column ), and then add a filter so that I only take the top 2000 of the 17000 rows prior to attempting to parse the JSON data , I don't get an error. However, if I increase the number of rows from 2000 to 4000 the error comes back.
This suggests the error is related to memory issues, (as the buffer reference in the error message eludes to). However, at the same time, it didn't appear that RAM was being fully utilised.
I have included the M-Code below .
Can anyone suggest a solution?
Cheers
Similar posts (without a definative fix)
let Source = Odbc.DataSource("dsn=VerticaTestCluster", [HierarchicalNavigation=true]), AT_Vertica_Database = Source{[Name="AT_Vertica",Kind="Database"]}[Data], CXSocial_Schema = AT_Vertica_Database{[Name="CXSocial",Kind="Schema"]}[Data], flex_op_json_mention_view_View = CXSocial_Schema{[Name="flex_op_json_mention_view",Kind="View"]}[Data], #"Show all rows" = Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"}), #"Remove rows with no JSON" = Table.SelectRows(#"Show all rows", each [complete_json] <> null), #"Added Index" = Table.AddIndexColumn(#"Remove rows with no JSON", "Index", 1, 1), #"Filtered Rows" = Table.SelectRows(#"Added Index", each [Index] < 2000), #"Parsed JSON" = Table.TransformColumns(#"Filtered Rows",{{"complete_json", Json.Document}}), #"Expanded complete_json" = Table.ExpandRecordColumn(#"Parsed JSON", "complete_json", {"message", "tags"}, {"complete_json.message", "complete_json.tags"}), #"Expanded complete_json.message" = Table.ExpandRecordColumn(#"Expanded complete_json", "complete_json.message", {"title", "language", "sentiment", "type"}, {"complete_json.message.title", "complete_json.message.language", "complete_json.message.sentiment", "complete_json.message.type"}), #"Expanded complete_json.tags" = Table.ExpandListColumn(#"Expanded complete_json.message", "complete_json.tags") in #"Expanded complete_json.tags"
Thanks - I ended up abandoning doing this is PowerQuery, and set up a staging table in SQL database that parsed the JSON using T-SQL.
But I have gone back and tested moving the buffer up, as you suggested, and this works. Thanks for your input.
@Anonymous Glad to hear it worked out for you. It makes a good case for the community when to use Buffer functions. Let's call this one - parsing large JSON files.
- Nick
@Anonymous You might like this news. We have a native connector now for Vertica starting with Power October 2018 release.
@nickchobotar Yes- I just saw that an hour or so ago! Thanks again for your help.
Hello @Anonymous and @nickchobotar. So... I'm getting this error from transforming a JSON file and an XLS file. What can I check/adjust here to get this query to complete?
This appears to be some kind of memory issue. I say that because when I execute the (many!) queries individually, the error does not appear.
What is the memory limit for Power BI Desktop?
Thanks
@Anonymous
It appears Vertica is odbc compliant database, so I would make sure drivers are of the same version that your system.
Also, based ont the error message, you can try to buffer the table on your line 10. (not tested)
#"Parsed JSON" = Table.Buffer(Table.TransformColumns(#"Filtered Rows",{{"complete_json", Json.Document}})),
Nick -
@nickchobotar Thanks for the suggestions.
Drivers don't appear to be the issue. I seperated the Extract & Load (moving the Text column from Vertica to PowerBI) and Transformation (transforming Text to JSON & parsing of the JSON) into two seperate steps. The Extract & Load works fine - indicating that the issue is not with the connection with Vertica and Drivers. So the issue is with the Parsing of the text column to JSON.
I have also tried adding in the the "Table.Buffer..." code suggested and still run into the same "Buffer" error.
If you have any other ideas, I would be keen to try them .
Cheers Steve
@Anonymous
Interesting! What happens when you try to move the buffer up the stream to the first table like so
#"Show all rows" = Table.Buffer(Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"})),
N -
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
113 | |
97 | |
85 | |
67 | |
60 |
User | Count |
---|---|
150 | |
120 | |
99 | |
87 | |
68 |