Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Import XML with leading linebreak

Hello

 

I am trying to import a XML file I get from an API into Powerquery. However, I keep getting the following error message: 

 

DataFormat.Error: Xml processing failed. Either the input is invalid or it isn't supported. (Internal error: '>' is an unexpected token. The expected token is ''. Line 2, position 63.)

 

 

The problem seems to be that the received XML has a leading linebreak. When I download the data and save it as an XML file without the leading linebreak the import into Powerquery works fine. 

 

Is there any way to remove the linebreak before powerquery parses the XML without having to fix the API?  

 

thanks

2 ACCEPTED SOLUTIONS
KNP
Super User
Super User

Hi @Anonymous,

 

This took quite a bit of trial and error but I think I at least have a working example that you can modify to suit.

See code below...(attached PBIX and sample file, zipped as can't upload xml directly)

let
  Source = Files,
  File = Source
    {[#"Folder Path" = "H:\My Drive\Power BI\Community Solutions\Files\", Name = "sample.xml"]}
    [Content],
  #"Imported CSV" = Csv.Document(
    File,
    [Delimiter = "`", Encoding = 1252, QuoteStyle = QuoteStyle.None]
  ),
  #"Changed Type" = Table.TransformColumnTypes(#"Imported CSV", {{"Column1", type text}}),
  #"Removed Top Rows" = Table.Skip(#"Changed Type", 1),
  #"Added Custom" = Table.AddColumn(#"Removed Top Rows", "Custom", each Text.Trim([Column1])),
  #"Removed Other Columns" = Table.SelectColumns(#"Added Custom", {"Custom"}),
  Custom = #"Removed Other Columns"[Custom],
  Custom1 = Text.Combine(Custom),
  #"Parsed XML" = Xml.Tables(Custom1)
in
  #"Parsed XML"

 

Basically, forcing the import of the xml file as csv, quote style in csv import matters.

Remove the top (blank) row.

Trim the text in new column.

Use this column with a Text.Combine.

Parse as XML.

 

I hope this helps.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

View solution in original post

Hi @Anonymous,

 

It sounds like you need a couple of extra steps to get the data first. I would probably do that in a staging query and then reference that query to parse the xml.

 

Although it is part of the same problem you are trying to solve, it is kind of a different issue.

 

If the first part is solved, could I ask that you create a new post about getting the data from API after the login screen (or whatever topic name makes sense). Tag me in that question also if you like. You will likely get an answer to that part of the problem faster if it is a new post as more people will be looking at it.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

View solution in original post

4 REPLIES 4
Anonymous
Not applicable

Hi @KNP 

thank you so much for your solution! It works great as long as I am working with a xml file. If I try to switch the data source to the api (using web.contents()), the csv.document() function captures the html file of the login screen instead of the xml file that is displayed shortly afterwards. 

 

 

Hi @Anonymous,

 

It sounds like you need a couple of extra steps to get the data first. I would probably do that in a staging query and then reference that query to parse the xml.

 

Although it is part of the same problem you are trying to solve, it is kind of a different issue.

 

If the first part is solved, could I ask that you create a new post about getting the data from API after the login screen (or whatever topic name makes sense). Tag me in that question also if you like. You will likely get an answer to that part of the problem faster if it is a new post as more people will be looking at it.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x
Anonymous
Not applicable

Thanks, I will definitely try that!

KNP
Super User
Super User

Hi @Anonymous,

 

This took quite a bit of trial and error but I think I at least have a working example that you can modify to suit.

See code below...(attached PBIX and sample file, zipped as can't upload xml directly)

let
  Source = Files,
  File = Source
    {[#"Folder Path" = "H:\My Drive\Power BI\Community Solutions\Files\", Name = "sample.xml"]}
    [Content],
  #"Imported CSV" = Csv.Document(
    File,
    [Delimiter = "`", Encoding = 1252, QuoteStyle = QuoteStyle.None]
  ),
  #"Changed Type" = Table.TransformColumnTypes(#"Imported CSV", {{"Column1", type text}}),
  #"Removed Top Rows" = Table.Skip(#"Changed Type", 1),
  #"Added Custom" = Table.AddColumn(#"Removed Top Rows", "Custom", each Text.Trim([Column1])),
  #"Removed Other Columns" = Table.SelectColumns(#"Added Custom", {"Custom"}),
  Custom = #"Removed Other Columns"[Custom],
  Custom1 = Text.Combine(Custom),
  #"Parsed XML" = Xml.Tables(Custom1)
in
  #"Parsed XML"

 

Basically, forcing the import of the xml file as csv, quote style in csv import matters.

Remove the top (blank) row.

Trim the text in new column.

Use this column with a Text.Combine.

Parse as XML.

 

I hope this helps.

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
chrome-9xf-Zagzel-B

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
KIfp67uy-Sr
Proud to be a Super User!PBI-Super-User-Rank-30x30-1x

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors