cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
lst
Frequent Visitor

Import XML with leading linebreak

Hello

 

I am trying to import a XML file I get from an API into Powerquery. However, I keep getting the following error message: 

 

DataFormat.Error: Xml processing failed. Either the input is invalid or it isn't supported. (Internal error: '>' is an unexpected token. The expected token is ''. Line 2, position 63.)

 

 

The problem seems to be that the received XML has a leading linebreak. When I download the data and save it as an XML file without the leading linebreak the import into Powerquery works fine. 

 

Is there any way to remove the linebreak before powerquery parses the XML without having to fix the API?  

 

thanks

2 ACCEPTED SOLUTIONS
KNP
Super User
Super User

Hi @lst,

 

This took quite a bit of trial and error but I think I at least have a working example that you can modify to suit.

See code below...(attached PBIX and sample file, zipped as can't upload xml directly)

let
  Source = Files,
  File = Source
    {[#"Folder Path" = "H:\My Drive\Power BI\Community Solutions\Files\", Name = "sample.xml"]}
    [Content],
  #"Imported CSV" = Csv.Document(
    File,
    [Delimiter = "`", Encoding = 1252, QuoteStyle = QuoteStyle.None]
  ),
  #"Changed Type" = Table.TransformColumnTypes(#"Imported CSV", {{"Column1", type text}}),
  #"Removed Top Rows" = Table.Skip(#"Changed Type", 1),
  #"Added Custom" = Table.AddColumn(#"Removed Top Rows", "Custom", each Text.Trim([Column1])),
  #"Removed Other Columns" = Table.SelectColumns(#"Added Custom", {"Custom"}),
  Custom = #"Removed Other Columns"[Custom],
  Custom1 = Text.Combine(Custom),
  #"Parsed XML" = Xml.Tables(Custom1)
in
  #"Parsed XML"

 

Basically, forcing the import of the xml file as csv, quote style in csv import matters.

Remove the top (blank) row.

Trim the text in new column.

Use this column with a Text.Combine.

Parse as XML.

 

I hope this helps.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

View solution in original post

KNP
Super User
Super User

Hi @lst,

 

It sounds like you need a couple of extra steps to get the data first. I would probably do that in a staging query and then reference that query to parse the xml.

 

Although it is part of the same problem you are trying to solve, it is kind of a different issue.

 

If the first part is solved, could I ask that you create a new post about getting the data from API after the login screen (or whatever topic name makes sense). Tag me in that question also if you like. You will likely get an answer to that part of the problem faster if it is a new post as more people will be looking at it.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

View solution in original post

4 REPLIES 4
lst
Frequent Visitor

Hi @KNP 

thank you so much for your solution! It works great as long as I am working with a xml file. If I try to switch the data source to the api (using web.contents()), the csv.document() function captures the html file of the login screen instead of the xml file that is displayed shortly afterwards. 

 

 

KNP
Super User
Super User

Hi @lst,

 

It sounds like you need a couple of extra steps to get the data first. I would probably do that in a staging query and then reference that query to parse the xml.

 

Although it is part of the same problem you are trying to solve, it is kind of a different issue.

 

If the first part is solved, could I ask that you create a new post about getting the data from API after the login screen (or whatever topic name makes sense). Tag me in that question also if you like. You will likely get an answer to that part of the problem faster if it is a new post as more people will be looking at it.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

View solution in original post

lst
Frequent Visitor

Thanks, I will definitely try that!

KNP
Super User
Super User

Hi @lst,

 

This took quite a bit of trial and error but I think I at least have a working example that you can modify to suit.

See code below...(attached PBIX and sample file, zipped as can't upload xml directly)

let
  Source = Files,
  File = Source
    {[#"Folder Path" = "H:\My Drive\Power BI\Community Solutions\Files\", Name = "sample.xml"]}
    [Content],
  #"Imported CSV" = Csv.Document(
    File,
    [Delimiter = "`", Encoding = 1252, QuoteStyle = QuoteStyle.None]
  ),
  #"Changed Type" = Table.TransformColumnTypes(#"Imported CSV", {{"Column1", type text}}),
  #"Removed Top Rows" = Table.Skip(#"Changed Type", 1),
  #"Added Custom" = Table.AddColumn(#"Removed Top Rows", "Custom", each Text.Trim([Column1])),
  #"Removed Other Columns" = Table.SelectColumns(#"Added Custom", {"Custom"}),
  Custom = #"Removed Other Columns"[Custom],
  Custom1 = Text.Combine(Custom),
  #"Parsed XML" = Xml.Tables(Custom1)
in
  #"Parsed XML"

 

Basically, forcing the import of the xml file as csv, quote style in csv import matters.

Remove the top (blank) row.

Trim the text in new column.

Use this column with a Text.Combine.

Parse as XML.

 

I hope this helps.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

View solution in original post

Helpful resources

Announcements
UG GA Amplification 768x460.png

Launching new user group features

Learn how to create your own user groups today!

November Power BI Update 768x460.png

Check it Out!

Click here to read more about the November 2021 Updates!

M365 768x460.jpg

Microsoft 365 Collaboration Conference | December 7–9, 2021

Join us, in-person, December 7–9 in Las Vegas, for the largest gathering of the Microsoft community in the world.

Top Solution Authors
Top Kudoed Authors