Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
ripstaur
Helper III
Helper III

Missing data when loading files from a folder

I have used the "Folder" option in "Get Data" with a lot of success before. I am attempting to build a report now that combines several files, all in exactly the same format (I was very careful about this). I have tried using "Load" and "Transform and Load" options, but the same thing happens...the columns all come in, but some of the data is missing.

 

I have tried looking at each file before I load them all, and the data are all there when I look at them individually. However, when I load the entire folder, some of the data is missing again. 

 

Any ideas? One note - I cannot convert the individual worksheets to tables, because I need the dynamic headers to make the user inputs work correctly. 

 

For now, I am just going to append. It's not the best or most efficient way to go, though, so if anyone has run across this problem (and a solution), it would be great! Thanks in advance. 

1 ACCEPTED SOLUTION
PhilipTreacy
Super User
Super User

Hi @ripstaur 

Your query steps may be inadvertently excluding some rows of data.  The query editor only previews a max of 1000 rows.  If there's data beyond that it can get left out of your steps exclude it.  What rows are missing?  What's the data in them? Check your steps to see what they are doing and make sure they don't exclude anything.

 

For testing purposes, you can also try creating a separate query for each file (rather than using the From Folder connector), load the file and then check if all your expected data is there.  At least that may give you some pointers as to what rows are missing,and indicate why this is happening.

 

Not sure what you mean by I cannot convert the individual worksheets to tables, because I need the dynamic headers to make the user inputs work correctly  - it's better to work with Excel tables.  Maybe you could create the Excel tables from the ranges where this data is being input?

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


View solution in original post

5 REPLIES 5
ripstaur
Helper III
Helper III

Phil,

 

Thanks for giving me some things to check..Your second paragraph helped a lot. Here's what ended up somehow working: I went in and loaded each file separately in PBI. I learned a couple of things:

1. I had some spurious hidden data in one of them (this seems to have been what kept the file load from working). So I went back into each file and deleted 3 (blank) rows from beneath the array I wanted to use. 

2. If you have an array where some columns contain no data (only a header), PBI won't recognize your headers as headers.

 

After deleting those rows in each file, I went back to PBI and used the folder option. This time it loaded. I just had to click on "use first row as headers" to promote the header, then everything worked fine. 

BTW, what I meant about the table is this: The data files in the folder are results from a questionnairre, which is sent out to respondents as an Excel workbook. They provide their answers on one sheet, and then the answers are  compiled/simplified in another worksheet. The completed questionnaire workbooks are kept in the folder, and the compilation worksheets are what I select when I load the data from the folder. Each compilation sheet contains an array of 12 columns and 11 rows. This particular questionnaire is sent to up to 10 individuals. 

The headers in the compilation worksheet come from formulas which bring product model names in from another sheet. I tried inserting a table, but I get an error telling me that the headers have to be text and cannot be formulas. 

 

Best regards,

 

Rip

Hi @ripstaur 

Yes Excel table headers must be text.  But if you were using tables you'd have avoided this particular issue with hidden data/blank rows.

You can create the tables with generic column names and then after import to PBI, rename the columns.

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


PhilipTreacy
Super User
Super User

Hi @ripstaur 

Your query steps may be inadvertently excluding some rows of data.  The query editor only previews a max of 1000 rows.  If there's data beyond that it can get left out of your steps exclude it.  What rows are missing?  What's the data in them? Check your steps to see what they are doing and make sure they don't exclude anything.

 

For testing purposes, you can also try creating a separate query for each file (rather than using the From Folder connector), load the file and then check if all your expected data is there.  At least that may give you some pointers as to what rows are missing,and indicate why this is happening.

 

Not sure what you mean by I cannot convert the individual worksheets to tables, because I need the dynamic headers to make the user inputs work correctly  - it's better to work with Excel tables.  Maybe you could create the Excel tables from the ranges where this data is being input?

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


I recently experienced this when using the From Folder process in Excel and imagine the same would occur in Power BI when trying to consolidate Excel files where the first few rows contain non-column header information. In my case, the first two rows of data contained a report name and other ancillary information which sounds similar to your own. The third row contained column headers with data following thereafter. In my case, only the first column generated null values for all files other than the Sample File and the ancillary info in the first two rows was in the first column; this is a clue to the issue. 

 

If the files were saved as CSV, the Folder process worked with no issues. If the content from the first two rows were removed where the first available row with data is the column headings, that resolved the issue as well. However, I did not want to need to manually touch any of the files in order for the process to work. It occurred to me that every time I have used the from Folder process with Excel files, the files always had the first-row containing column headers. Again, another clue.  Here is what I discovered.

 

In the Transform Sample File associated with your query, check to see if it has a Promoted Headers step at the end as seen in the second pic. 

TransformFile.png

Transform2.png

 

My guess is you probably do which is what is creating this issue. 

 

Tranform3.png

 

It seems that Power Query is expecting the first row of data in Excel to contain column headings. This works most of the time, however, when the first few rows contains ancillary information it promotes something that is not a usable column header. For this reason in my case, it explains why the Sample File had values in the first column but none of the subsequent files did because to do so, they would need their first row to match the promoted header shown above. 

 

Therefore, remove the Promoted Headers step in the Transform Sample File. This will leave you with the first column being Column1 which can then align correctly with all other files Column1. 

 

When you remove Promoted Headers, your query may now generate an error like below. 

Transform4.png

Now, simply remove the Changed Type step that is looking for the previously promoted Column1 or remove the portion of the formula that is looking for the column that is no longer there. 

 

Transform5.png

 

SterlingTX_0-1713268659208.png

 

In retrospect, if there had been other non-header information on row 1 in columns 2, 3, etc., we could expect null values to be present in all those columns as well for all files except the sample file.

 

Hope this helps!

 

Sterling @  Ember Analytics 

 

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors