Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Error: oledb or odbc error data.format We reached the end of the buffer.

I am getting the above error when I parse a text column that contains JSON data using PowerQuery. The methodology I am following is the same as outlined here (Guy in a Cube). I have 17000 rows of data (unparsed JSON data) in text format.

I am using an ODBC connection to a Vertica database.  At the time I get the error, memory usage is approximately 76%  - so not maxed out. However, if I reduce the number of rows o I attempt to parse from 17K to 2K, the error disappears. If I increase the number of rows to 4K, the error returns.

 

Additional background
1) I am using the latest Sept 2018 release of PBI desktop.
2) I am using 64 bit PBI
3) I have restarted my PC
4) I have turned off all the preview features
5) I have also tried reducing the columns of data returned by parsing the JSON

 

 Here is the M-code for the PowerQuery (with the 2K row filter).

let
    Source = Odbc.DataSource("dsn=VerticaTestCluster", [HierarchicalNavigation=true]),
    AT_Vertica_Database = Source{[Name="AT_Vertica",Kind="Database"]}[Data],
    CXSocial_Schema = AT_Vertica_Database{[Name="CXSocial",Kind="Schema"]}[Data],
    flex_op_json_mention_view_View = CXSocial_Schema{[Name="flex_op_json_mention_view",Kind="View"]}[Data],
    #"Show all rows" = Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"}),
    #"Remove rows with no JSON" = Table.SelectRows(#"Show all rows", each [complete_json] <> null),
    #"Added Index" = Table.AddIndexColumn(#"Remove rows with no JSON", "Index", 1, 1),
    #"Filtered Rows" = Table.SelectRows(#"Added Index", each [Index] < 4000),
    #"Parsed JSON" = Table.TransformColumns(#"Filtered Rows",{{"complete_json", Json.Document}}),
    #"Expanded complete_json" = Table.ExpandRecordColumn(#"Parsed JSON", "complete_json", {"message", "tags"}, {"complete_json.message", "complete_json.tags"}),
    #"Expanded complete_json.message" = Table.ExpandRecordColumn(#"Expanded complete_json", "complete_json.message", {"title", "language", "sentiment", "type"}, {"complete_json.message.title", "complete_json.message.language", "complete_json.message.sentiment", "complete_json.message.type"}),
    #"Expanded complete_json.tags" = Table.ExpandListColumn(#"Expanded complete_json.message", "complete_json.tags")
in
    #"Expanded complete_json.tags"

Similar posts that don't provide an obvious solution in my case (e.g up grade PBI, just wait it may fix itself etc.

https://community.powerbi.com/t5/Desktop/DataFormat-Error-We-reached-the-end-of-the-buffer/m-p/45136...

https://community.powerbi.com/t5/Desktop/DataFormat-Error-We-reached-the-end-of-the-buffer/td-p/2960...

 

Can anyone suggest a way forward or solution?

 

1 ACCEPTED SOLUTION

@Anonymous

 

Interesting! What happens when you try to move the buffer up the stream to the first table like so 

 

 #"Show all rows" = Table.Buffer(Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"})),

 

 N -

View solution in original post

10 REPLIES 10
Anonymous
Not applicable

I am getting the above error when I attempt to parse JSON data contained in  a text column, using PowerQuery. The methodology I am following is the same as outlined here (Guy in a Cube). I am starting with about 17000 rows of  data (unparsed JSON data). The data is comming from a Vertica via an ODBC connection 

 

Total memory  ultilized  on my PC is about 74% just prior to getting the error (see screenshot). 

 

Processes.JPG

 

I have also 

 

  1.   Restarted my machine
  2.  Run a windows update
  3. Confirmed  I am running the latest version of PBI (2.61.5192.601 64-bit (August 2018))
  4. Turned off all preview features

If I add an index column ( a row number column ), and then add a filter so that I only take the top 2000 of the 17000 rows prior to attempting to parse the JSON data , I don't get an error. However, if I increase the number of rows  from 2000 to 4000 the error comes back. 

 

This suggests the error is related to memory issues, (as the buffer reference in the error message eludes to).  However, at the same time, it didn't appear that RAM was being fully utilised. 

 

I have included the M-Code below . 

 

Can anyone suggest a solution?   

Cheers

 

Similar posts (without a definative fix)

https://community.powerbi.com/t5/Desktop/DataFormat-Error-We-reached-the-end-of-the-buffer/m-p/45136...

https://community.powerbi.com/t5/Desktop/DataFormat-Error-We-reached-the-end-of-the-buffer/td-p/2960...

 

let
    Source = Odbc.DataSource("dsn=VerticaTestCluster", [HierarchicalNavigation=true]),
    AT_Vertica_Database = Source{[Name="AT_Vertica",Kind="Database"]}[Data],
    CXSocial_Schema = AT_Vertica_Database{[Name="CXSocial",Kind="Schema"]}[Data],
    flex_op_json_mention_view_View = CXSocial_Schema{[Name="flex_op_json_mention_view",Kind="View"]}[Data],
    #"Show all rows" = Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"}),
    #"Remove rows with no JSON" = Table.SelectRows(#"Show all rows", each [complete_json] <> null),
    #"Added Index" = Table.AddIndexColumn(#"Remove rows with no JSON", "Index", 1, 1),
    #"Filtered Rows" = Table.SelectRows(#"Added Index", each [Index] < 2000),
    #"Parsed JSON" = Table.TransformColumns(#"Filtered Rows",{{"complete_json", Json.Document}}),
    #"Expanded complete_json" = Table.ExpandRecordColumn(#"Parsed JSON", "complete_json", {"message", "tags"}, {"complete_json.message", "complete_json.tags"}),
    #"Expanded complete_json.message" = Table.ExpandRecordColumn(#"Expanded complete_json", "complete_json.message", {"title", "language", "sentiment", "type"}, {"complete_json.message.title", "complete_json.message.language", "complete_json.message.sentiment", "complete_json.message.type"}),
    #"Expanded complete_json.tags" = Table.ExpandListColumn(#"Expanded complete_json.message", "complete_json.tags")
in
    #"Expanded complete_json.tags"

 

 

Anonymous
Not applicable

Thanks - I ended up abandoning doing this is PowerQuery, and set up a staging table in SQL database that parsed the JSON using T-SQL.

 

But I have gone back and  tested moving the buffer up, as you suggested, and this works. Thanks for your input. 

@Anonymous  Glad to hear it worked out for you. It makes a good case for the community when to use Buffer functions. Let's call this one - parsing large JSON files. 

 

- Nick

@Anonymous   You might like this news. We have a native connector now for Vertica starting with Power October 2018 release.

Anonymous
Not applicable

@nickchobotar Yes- I just saw that an hour or so ago!  Thanks again for your help. 

Hello @Anonymous and @nickchobotar. So... I'm getting this error from transforming a JSON file and an XLS file. What can I check/adjust here to get this query to complete?

 

powerBIErrorMsg.png

This appears to be some kind of memory issue. I say that because when I execute the (many!) queries individually, the error does not appear. 

 

What is the memory limit for Power BI Desktop?

 

Thanks

nickchobotar
Skilled Sharer
Skilled Sharer

@Anonymous

 

It appears Vertica is odbc compliant database, so I would make sure drivers are of the same version that your system.

 

Also, based ont the error message, you can try to buffer the table on your line 10.   (not tested)

 

 #"Parsed JSON" = Table.Buffer(Table.TransformColumns(#"Filtered Rows",{{"complete_json", Json.Document}})),


Nick -

Anonymous
Not applicable

@nickchobotar Thanks for the suggestions.

Drivers don't appear to be the issue. I seperated  the Extract & Load  (moving the Text  column  from Vertica to PowerBI) and Transformation (transforming Text to JSON & parsing of the JSON) into two seperate steps.  The  Extract & Load  works fine - indicating that the issue is not with the connection with Vertica and Drivers.  So the issue is with the Parsing of the text column  to JSON.

I have also tried adding in the the  "Table.Buffer..."  code suggested and still run into the same "Buffer" error.

If you have any other ideas, I would be keen to try them .


Cheers Steve

 

 

@Anonymous

 

Interesting! What happens when you try to move the buffer up the stream to the first table like so 

 

 #"Show all rows" = Table.Buffer(Table.SelectColumns(flex_op_json_mention_view_View,{"complete_json"})),

 

 N -

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.