Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
copelandd3
New Member

XML/HTML Column Headings

Hi all,

 

So I've taken up a new role as a Data manager - previously all my data came in CSV files or xlsx and I could manipulate them freely. However - all the extracts i now receive are in XML/HTML and whilst I can pull the data in, manipulating it is seeming more complex. My role is in a secondary school, I'm looking to analyse attendance. The tables come through with these headings:

copelandd3_0-1653042769045.png

Now, I'm essentially wanting them to come through as "column 1/column 2" etc so i can then rename them in bulk, add in a source file name as a date and - well, then i should be away. 

 

does anyone know how to import the files but with generic column names instead of those which already exist?

 

My script (edited for security)

let
Source = Folder.Files("\\XXXXXX\Downloads\Attendance Reporting"),
#"Combined Binaries" = Binary.Combine(Source[Content]),
#"Imported HTML" = Web.Page(#"Combined Binaries"),
#"Filtered Rows" = Table.SelectRows(#"Imported HTML", each ([Caption] = "Full Register Report"))
in
#"Filtered Rows"

 

 

5 REPLIES 5
lbendlin
Super User
Super User

If these are your columns then the very first thing you need to do is unpivot these files to bring them into a usable format.

 

Please provide sanitized sample data that fully covers your issue (Please don't reply with "my data is sensitive and I cannot show it"). If you paste the data into a table in your post or use one of the file services it will be easier to assist you. Avoid posting screenshots of your source data if possible.

Please show the expected outcome based on the sample data you provided. Screenshots of the expected outcome are ok.

https://community.powerbi.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447523

Hi - thanks for the response.  The html files are extracted in the below format (I apologise for the screenshot, I can't scrub the names from the data due to the file type)...

copelandd3_1-1653289151603.png

I am unable to amend the output of the files, so I am needing to amend the data after this point. It shows the date, and then the class attandance marks below this. 

 

When pulling this into power query, it amends the headers automatically to the below:

copelandd3_2-1653289370448.png

This then obviously prevents the next files headers matching up. My plan is to utlise the first column as the name of the file, creating a "week beginning" column. My ideal sheet would look like the below:

 

 

copelandd3_3-1653289563265.png

I'm not fussy on the exact column names, but obviously I want to collate each week's worth which is proving difficult. 

 

Thanks

"the very first thing you need to do is unpivot these files "

So... how would I do that? I've never played with these XML / HTML files before. 

Please provide sanitized sample data that fully covers your issue. If you paste the data into a table in your post or use one of the file services it will be easier to assist you. I cannot use screenshots of your source data.

Please show the expected outcome based on the sample data you provided. Screenshots of the expected outcome are ok.

https://community.powerbi.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447523

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors