Hi,
I have a folder with 300+ CSV files. It's a mass download that I receive on a regular basis mirroring our EMR's database, so each file is its own table with a different schema. I'd like to import all of them into Power BI. I need them imported as separate tables, so each table is listed on the side as with any other import of multiple tables (e.g. several tables from an Excel file or database). I don't see any option that automatically reads in all the tables in that folder, nor do I see a way to select multiple CSV files. How can I get all those tables without having to click "Get Data" for each and every one of them?
Thanks,
Yisroel Len
Hi @caninetiger
Currently Power Query doesn't support creating additional queries (tables) automatically or dynamically from a table, so this is not possible at present. I read your discussion in another thread about this; you may decide to use a Python script to write the files into an Excel workbook or a database. That seems to be the best alternative currently.
But I'd like to point out that if these files have many rows, you may hit the data size limit, since you will have 300+ tables in one semantic model. You may refer to Power BI limit for number of tables - Microsoft Fabric Community. There is no documentation stating a limit on the number of tables in Power BI, but there is a limit on data size.
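A minimal sketch of the database route mentioned above, using pandas plus the standard-library sqlite3 module (the folder path, database file name, and table-naming scheme here are all illustrative, not from the original thread):

```python
import glob
import sqlite3
from pathlib import Path

import pandas as pd

# Illustrative glob pattern -- adjust to your environment
folder_path = r"C:\data\emr_export\*.csv"

# One SQLite database file holding one table per CSV
with sqlite3.connect("emr_export.db") as conn:
    for file_path in glob.glob(folder_path):
        # Name each table after its source file (minus the .csv extension)
        table_name = Path(file_path).stem
        df = pd.read_csv(file_path)
        # Each CSV becomes its own table in the database
        df.to_sql(table_name, conn, if_exists="replace", index=False)
```

Power BI Desktop has no built-in SQLite connector, so you would then connect through an ODBC driver; writing to a server database such as SQL Server instead would follow the same pattern via SQLAlchemy.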
Best Regards,
Jing
@Greg_Deckler unfortunately they all have different schemas... It's the only mass download option the software gives us.
So, I've been praying.
@caninetiger OK, so what do you want to end up with? A single table with all of the rows from all of the files or 300+ tables, one for each file or something else?
300+ tables, one for each file. It's a mass export from our EHR. They've basically mirrored much of the underlying data model, with tables for patients, tables for employees, tables for encounters, etc. The correct way to go about this is to import all of it and properly build out the relationships between the tables in Power BI. That way I can easily create graphs and reports.
Maybe I'm wrong, but I believe that's the best way to go about it. I just can't figure out how to import them all at the same time.
@caninetiger OK, found a similar issue and maybe the technique will work for you.
dynamically import files from a folder WITHOUT com... - Microsoft Fabric Community
@Greg_Deckler, please excuse my inexperience, but I searched through it and I'm not sure where the solution is.
I see your recommendation at the bottom, and I've previously looked into writing a specialized function that could perhaps separate all the files in the folder into separate tables, but I'm not familiar enough with the M language to do that. I've gotten only as far as listing all the files as separate tables in Power Query (see pic below), but I wouldn't know what to do next:
Here's that Python script if anyone wants to use it:
import glob
from pathlib import Path

import pandas as pd

# Path to your folder (a glob pattern matching every CSV in it)
folder_path = r"C:\Users\YOUR_USER\Downloads\FOLDER_NAME\*.csv"

# Create an Excel file (writing .xlsx requires the openpyxl package)
with pd.ExcelWriter("EXCEL_FILE_NAME.xlsx") as writer:
    # Get the path of every CSV file in the folder
    for file_path in glob.glob(folder_path):
        # Get the file name without its extension
        file_name = Path(file_path).stem
        # Read the file into pandas
        df = pd.read_csv(file_path)
        # Write it into the Excel file, each CSV becoming its own sheet
        df.to_excel(writer, sheet_name=file_name, index=False)
Just a heads-up about the above code: if any of your CSV file names are longer than 31 characters, Excel will reject the name as a sheet name and the code will break.
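One way to guard against that is a small helper that truncates each name to Excel's 31-character sheet-name limit (a sketch; the helper name and the `~N` de-duplication suffix are my own, not from the original script):

```python
def safe_sheet_name(name, used, limit=31):
    """Truncate a name to Excel's 31-char sheet-name limit, de-duplicating.

    `used` is a set of sheet names already taken; truncated names that
    collide get a `~1`, `~2`, ... suffix so every sheet name stays unique.
    """
    sheet = name[:limit]
    counter = 1
    while sheet in used:
        suffix = f"~{counter}"
        sheet = name[: limit - len(suffix)] + suffix
        counter += 1
    used.add(sheet)
    return sheet
```

To use it with the script above, initialize `used = set()` before the loop and pass `sheet_name=safe_sheet_name(file_name, used)` to `df.to_excel(...)`.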
Best,
Yisroel
@caninetiger I would recommend reaching out to the user that originally posted the issue and/or @ImkeF
@caninetiger Let me think about this one.
@caninetiger If they all have the same schema, then you can use a Folder query. If they all have different schemas may God have mercy on your soul...
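For anyone who lands here with the same-schema case, the Folder-query approach Greg describes amounts to this in pandas terms (a sketch; the folder path is illustrative):

```python
import glob

import pandas as pd

# Illustrative path; this only makes sense when every CSV shares one schema
files = glob.glob(r"C:\data\export\*.csv")

if files:
    # Stack all the files into a single table, like Power BI's Combine Files
    combined = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)
```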