vpanchu
Helper I

Excel sheets from SharePoint

Hi Experts,

 

I have a scenario where I need to build a dashboard that fetches data from 50 to 400 Excel sheets residing in SharePoint folders. Each country has its own SharePoint folder, and inside each folder there are Excel sheets used for data entry. I have more than 40 countries with their respective folders, and somewhere between 10 and 20 Excel sheets in each folder.

 

Given this scenario, the normal way is to pull the data from each SharePoint folder, append it all into a single table, and then create the visuals from that. Is this the only way to do it?

 

Can anyone provide me with a better solution? For context, the Excel template contains 40 columns and fewer than 1,000 rows in each Excel file.

 

The challenges I am facing now are:

1. As I import data from more and more SharePoint folders, Power Query slows down, and it becomes nearly impossible to do any editing in Power Query once I am fetching many folders.

2. If someone edits a field or renames a field by mistake, it gets hard to find and correct the error.

 

I would really appreciate any suggestions on how to handle this situation.

 

1 ACCEPTED SOLUTION
Jimmy801
Community Champion

Hello @vpanchu 

 

I know that if you are reading from that many files and then want to add columns in your production system, it gets really painful, because every time the preview updates, Power Query has to start fetching all the data again. So it makes sense to reduce the amount of data while you are developing, for example by using a parameter, so you can work on a small subset of the data and then switch back to the full set once development is finished. This is how I have always worked when dealing with a lot of files.
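The parameter idea above can be sketched in Power Query M roughly as follows. This is a minimal illustration, not the poster's actual query: the site URL and the file-count limit are hypothetical placeholders.

```
let
    // Hypothetical site URL — replace with your real SharePoint site
    SiteUrl = "https://contoso.sharepoint.com/sites/CountryData",
    // Set to true while developing, false before the full refresh/publish
    DevMode = true,
    Source = SharePoint.Files(SiteUrl, [ApiVersion = 15]),
    ExcelFiles = Table.SelectRows(Source, each Text.EndsWith([Name], ".xlsx")),
    // While DevMode is on, keep only a handful of files so previews stay fast
    Limited = if DevMode then Table.FirstN(ExcelFiles, 5) else ExcelFiles
in
    Limited
```

In Power BI Desktop you would typically expose `DevMode` (or a numeric file limit) as a query parameter via Manage Parameters, so you can flip it without editing the M code.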

 

BR

 

Jimmy

View solution in original post

6 REPLIES
Jimmy801
Community Champion

Hello @vpanchu 

 

About challenge 1: do you just read the data and combine it, or are you also transforming it? Are you accessing each file once or multiple times? If you are accessing a file multiple times, you can consider using a buffer function to limit the number of accesses. Normally, just reading the files and combining them should not take too long. How long exactly does it take? How big is each file?
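The buffering suggestion can be sketched like this. The file path is a hypothetical stand-in; in a SharePoint query you would buffer the `[Content]` binary column returned by `SharePoint.Files` rather than calling `File.Contents`.

```
let
    // Hypothetical local path used for illustration only
    RawBinary = File.Contents("C:\Data\Germany\sales.xlsx"),
    // Binary.Buffer reads the file into memory once, so later steps that
    // touch the workbook repeatedly do not trigger repeated reads/downloads
    Buffered = Binary.Buffer(RawBinary),
    Workbook = Excel.Workbook(Buffered, true),
    // Take the first sheet's table
    FirstSheet = Workbook{0}[Data]
in
    FirstSheet
```

`Table.Buffer` plays the same role one level up, pinning an already-loaded table in memory when several downstream steps reference it.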

 

Challenge 2: you are talking about column names, right? And you are using a template, meaning every Excel file should be structured exactly the same. If so, you could first read the folder, then access every file's column headers and check whether they match your template (you could hardcode a list of your template's column headers and use that list to make the checks). That way you get a list of files with an indicator showing whether each file's column names are OK.
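That header check could look roughly like this in M. The site URL and the template header names are hypothetical placeholders; you would substitute your real 40 column names.

```
let
    // Hypothetical template headers — replace with your real column names
    TemplateHeaders = {"Country", "Year", "Product", "Amount"},
    Source = SharePoint.Files("https://contoso.sharepoint.com/sites/CountryData", [ApiVersion = 15]),
    ExcelFiles = Table.SelectRows(Source, each Text.EndsWith([Name], ".xlsx")),
    // Read each file's first sheet and capture its column names
    WithHeaders = Table.AddColumn(ExcelFiles, "Headers",
        each Table.ColumnNames(Excel.Workbook([Content], true){0}[Data])),
    // A file is OK when no template header is missing from it
    WithCheck = Table.AddColumn(WithHeaders, "HeadersOK",
        each List.IsEmpty(List.Difference(TemplateHeaders, [Headers])))
in
    Table.SelectColumns(WithCheck, {"Name", "HeadersOK"})
```

Filtering this table on `HeadersOK = false` gives you exactly the files someone has edited or renamed by mistake.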

 

If this post helps or solves your problem, please mark it as the solution (to help other users find useful content and to acknowledge the work of the users who helped you).
Kudos are nice too.

Have fun

Jimmy

Hi @Jimmy801 

 

Thank you for the response.

I have answered your questions below

 

About challenge 1: do you just read the data and combine it, or are you also transforming it? Are you accessing each file once or multiple times? If you are accessing a file multiple times, you can consider using a buffer function to limit the number of accesses. Normally, just reading the files and combining them should not take too long. How long exactly does it take? How big is each file?

I read the Excel sheets and do some transformations (removing and renaming rows and columns), but nothing heavy, as everything becomes very slow once I load more than 10 Excel files from different SharePoint folders and append them all into one. I then uncheck the "Enable load" option on the separate file queries and keep only the appended query in Power Query for the visuals.
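The append step described here is what `Table.Combine` does in M. A minimal sketch, with hypothetical per-country query names standing in for the real ones:

```
let
    // Hypothetical queries — one per country folder, already cleaned so
    // their column names match; Table.Combine appends them into one table
    Combined = Table.Combine({Germany, France, Japan})
in
    Combined
```

Disabling load on the per-country queries, as described above, keeps them as staging queries that feed `Combined` without each being loaded into the data model separately.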

Data Source details :

SharePoint folders: every country has a folder containing 5 to 20 Excel sheets, with 40 columns per sheet and no more than 1,000 to 2,000 rows per sheet as of today; each sheet may gain another 2,000 or so rows every year.

 

Challenge 2: you are talking about column names, right? And you are using a template, meaning every Excel file should be structured exactly the same. If so, you could first read the folder, then access every file's column headers and check whether they match your template (you could hardcode a list of your template's column headers and use that list to make the checks). That way you get a list of files with an indicator showing whether each file's column names are OK.

 

Note that I do remove some columns and promote headers etc. to set the fields correctly in Power BI, since each Excel file has a lot of unwanted rows above the actual data. Can I apply the above technique in this case?

 

Any more suggestions for me?

 

Regards

VP

Jimmy801
Community Champion

Hello @vpanchu 

 

Reading a lot of files will take some time, that's for sure, but it should not be hours — more like 15 to 20 minutes. You didn't tell me how long it actually takes, so I can't judge whether you should be concerned about performance.

 

About your second question: didn't you say that all files are the same? If yes, then you just have to remove the top empty rows of all files before promoting headers. If no, you need some logic to work around the issue: some characteristic you can query and then branch on accordingly. You can apply my technique independently of whatever transformation you have to do beforehand. Say you cut two empty rows and then promote headers; you then get a table whose column names you can compare against your template to check whether something is wrong or not.
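Put together, the "cut two rows, promote headers, then compare" flow might look like this per file. The sample path, the row count to skip, and the template header names are all hypothetical; you would adapt them to your actual template.

```
let
    // Hypothetical sample file — the same steps apply per file in a folder query
    Raw = Excel.Workbook(File.Contents("C:\Data\Germany\sales.xlsx"), false){0}[Data],
    // Cut the empty rows above the real header row (assumed to be two here)
    Skipped = Table.Skip(Raw, 2),
    Promoted = Table.PromoteHeaders(Skipped, [PromoteAllScalars = true]),
    // Compare the promoted column names against the template list
    TemplateHeaders = {"Country", "Year", "Product", "Amount"},
    HeadersOK = List.IsEmpty(List.Difference(TemplateHeaders, Table.ColumnNames(Promoted)))
in
    [Data = Promoted, HeadersOK = HeadersOK]
```

Wrapping this in a custom function and invoking it per row of the folder listing gives you both the cleaned data and the per-file OK flag in one pass.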

 

If this post helps or solves your problem, please mark it as the solution (to help other users find useful content and to acknowledge the work of the users who helped you).
Kudos are nice too.

Have fun

Jimmy

Hi @Jimmy801, basically this particular app takes at least 15 to 25 minutes just to open when I open the PBIX file — that's the first issue — and refreshing takes almost 20 minutes. Adding any additional query transformation lags and takes ages to reflect; for example, if I add a conditional column or delete a column, it takes 5 to 10 minutes for the change to show up, which slows down development. I just wanted to understand whether this is because I am getting data from 40 SharePoint folders?

I shall try the second option you suggested as well. Thank you for the help.

 

Regards

VP

Jimmy801
Community Champion

Hello @vpanchu 

 

I know that if you are reading from that many files and then want to add columns in your production system, it gets really painful, because every time the preview updates, Power Query has to start fetching all the data again. So it makes sense to reduce the amount of data while you are developing, for example by using a parameter, so you can work on a small subset of the data and then switch back to the full set once development is finished. This is how I have always worked when dealing with a lot of files.

 

BR

 

Jimmy

Thank you Jimmy, I will try it out.

regards

VP
