The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now
Hi !
I am a beginner in R and I'm trying to use it in Query Editor to combine multiple text files, but the output is a empty table. I tried the code in R Studio and a get a positive result. I checked the class and the output is a data frame. I do not know what I am doing wrong.
Any help is appreciated.
I used the code bellow with a loop to read all the files in a folder and put them in a data frame. I know there is many other ways to do that in R, but I choose this example because it looks like something I would do in VBA.
Code using Loop
library(readr)
library(stringr)
Path<- setwd("C:/Users/mrX/Desktop/Temp")
files <- list.files(
path = Path
, pattern = "(.*)csv$"
, ignore.case = TRUE
)
master_data<- data.frame()
counter = 0
for (file in files){
file_data<-read.csv2(file)
if (counter==0){
master_data<-file_data
} else {
master_data<-rbind(master_data, file_data)
}
counter = counter+1
}
master_data
File 1
Name Value
a 1
b 2
c 3
File 2
Name Value
d 4
e 5
f 6
Thank you
Solved! Go to Solution.
Hi @Anonymous,
Ok, got it, could you please mark the useful/helpful reply as answer? So more people will benefit from here. I really appreciate it.
Best Regards,
Angelia
Hi @Anonymous
You can combine multiple files without using R but just choosing the Folder option as a data source.
Thank you for the reply @Phil_Seamark !
Yes, query editor does it very well.
My actual model uses a lot of GUI transformations and some customized M code.
It is updated weekly with more than 60 k rows and the consolidate table is growing fast. But it is not a problem for Power Bi.
My issue is the evaluation chain, because I need to merge this table with others. Every time I need to change or update the model, it is taking hours.
I did some experimentation with a kind of “staging” process. Consolidating all the tables separately and only then load to the model. It made the update process much faster.
What I am trying to do now is to consolidate the files using R inside query editor and check how it affects evaluation chain and the speed. It is just an experimentation.
If it does not work, I will need to load the files to a database and create the final tables using joins, just to reduce the load of the model.
Thank you
Hi @Anonymous
I see your problem, but don't know enough about the implementation of R in the Query Editor to suggest how to fix via the R-Script. Personally I'd do it in a data base so long as you have easy access to one. Or use Azure DB (or data lake) to store and pre-process your data.
Thank you for the suggestion, but I hope I can find a workaround.
If I have to use a database, I need to request IT and it is not their main priority.
I also hope microsoft consider this problem very import, otherwise the concept of Power Bi as a complete self service Bi tool, will only be true for small set of data.
Power Query is the most wonderful tool I have ever used. It is powerful to use and much simpler then SSIS. Consolidate files is fast and easier than do a bulk insert in a database. But merging is disappointing.
Using a database to pre process data will make Power Query not relevant, since I can do all the work via T-Sql.
I used another R script and it worked well, even so I still do not have a clue why I got a empty table with the first script.
I will load the rest of the tables and see how performance goes.
Hi @Anonymous,
When you load one file data, if it returns the correct data as what described in this blog? Please load file1 and file2 separately and check if it works fine, then consolidate the two files together.
Best Regards,
Angelia
Thank you very much, @v-huizhn-msft !
I wish I had find this blog before. A lot of good info about R and Power Query.
Yes, I did "baby steps" with the consolidation in R. I load one file and it worked file. Then, I added one more and consolidate. Everything was fine. But when I tried to run a loop, because I have a lot of files to consolidate in my actual model, I got an empty table.
I changed my code to lapply + do.call +r.bind and now it is working very well.
My first problem with evaluation chain is solved. The model is running much faster with ~pre- consolidated~ files.
I might even try to go deeper in R and solve my second evaluation chain problem: merge.
I am very happy to have R as alternative, because is integrated in the same environment of the model. I could use VBA to consolidate the csv, but I it is much simpler to just click refresh in Power BI without any worries about doing a two-step process.
Best wishes,
M.Penner
Hi @Anonymous,
Ok, got it, could you please mark the useful/helpful reply as answer? So more people will benefit from here. I really appreciate it.
Best Regards,
Angelia
User | Count |
---|---|
160 | |
110 | |
96 | |
86 | |
75 |
User | Count |
---|---|
157 | |
137 | |
132 | |
81 | |
61 |