Hello Power BI community,
I am using Python code to extract information from a website that only holds the last 30 days of the records I need, and I append the results to a dataset in Power BI. My goal is to be able to refresh the process in Power BI, so that new content posted on the website is appended to my Power BI dataset.
So, for example, my source file (csv) contains records from 8/15-9/14, which I extracted on 9/14.
Now, in Power BI, I ran a Python script that (1) scraped all the information on the website today (8/29-9/28) and (2) appended the scraped records to my source file, so I now have a dataset in Power BI that contains records from 8/15-9/28. I call it the "Appended Dataset".
My question is: on a future date like 10/25, the website will only have the records from 9/26-10/25. If I click "Refresh All" in the Query Editor, will the Python code be applied to the source CSV, so that I get a dataset containing the records from 8/15-9/14 and 9/26-10/25, or will it be applied to the "Appended Dataset", so that I get a dataset containing the records from 8/15-10/25? If it runs on the original dataset, which misses the records from 9/15-9/25, what is the best way to prevent that from happening?
I know it sounds confusing so please let me know if there's anything I can do to clarify my question. Thank you!
Hey,
I have to admit that I do not fully understand what you have so far. If possible, share your query from the Advanced Editor in Power Query.
Basically, when I'm facing similar tasks I would do the following:
Use a Python script as a data source in Power BI, from within this script
Currently it's somewhat difficult to add data increments to existing tables; this may change with the general availability of dataflows.
At the moment I try to create the "complete" dataset outside of the Power BI data pipeline and then load the complete file into Power BI.
You can leverage R or Python based data sources using the on-premises data gateway in personal mode.
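To illustrate the idea of building the "complete" dataset outside of Power BI, here is a minimal sketch using only the Python standard library. It assumes the scraper returns a list of dictionaries that all share the same columns, and that some combination of columns identifies a record uniquely. The function name `merge_records` and the parameters `history_path`, `new_rows` and `key_fields` are hypothetical names chosen for this example, not part of any Power BI API.

```python
import csv
import os


def merge_records(history_path, new_rows, key_fields):
    """Append rows that are not yet in the history CSV and return all rows.

    history_path: CSV file holding every record collected so far (assumed name).
    new_rows:     list of dicts from the latest scrape (assumed shape).
    key_fields:   column names that together identify a record uniquely.
    """
    # Load what has already been collected, if the file exists.
    existing = []
    if os.path.exists(history_path):
        with open(history_path, newline="", encoding="utf-8") as f:
            existing = list(csv.DictReader(f))

    # Keys of records already stored, so overlapping days are not duplicated.
    seen = {tuple(row[k] for k in key_fields) for row in existing}

    merged = list(existing)
    for row in new_rows:
        key = tuple(str(row[k]) for k in key_fields)
        if key not in seen:
            seen.add(key)
            # Store values as strings, matching what csv.DictReader returns.
            merged.append({name: str(value) for name, value in row.items()})

    # Rewrite the history file so it always holds the complete dataset.
    if merged:
        with open(history_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(merged[0].keys()))
            writer.writeheader()
            writer.writerows(merged)
    return merged
```

If a script like this runs on a schedule at least once within every 30-day window, the history file never loses the older records, and Power BI only has to read that one file on refresh. The overlapping days between two scrapes are skipped because their keys are already present.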
Hopefully this provides you some ideas.
Regards,
Tom