cpkoywk
New Member

Data refresh and Python Script

Hello Power BI community,

 

I am trying to use Python code to extract information from a website that only holds the last 30 days of the records I need, and to append it to a dataset in Power BI. My goal is to be able to refresh the process in Power BI so that new content posted on the website is appended to my Power BI dataset.

 

So, for example, my source file (csv) contains records from 8/15-9/14, which I extracted on 9/14. 

 

And now in Power BI, I ran a Python script that (1) scraped all the information on the website today (8/29-9/28) and (2) appended the scraped records to my source file, so I now have a dataset in Power BI that contains records from 8/15-9/28. I call it the "Appended Dataset".

 

My question is: on a future date like 10/25, the website will only have the records from 9/26-10/25. If I click "Refresh All" in the Query Editor, will the Python code be applied to the source CSV, so that I get a dataset containing records from 8/15-9/14 and 9/26-10/25, or will it be applied to the "Appended Dataset", so that I get a dataset containing records from 8/15-10/25? If it runs on the original dataset, which would leave out the records from 9/15-9/25, what is the best way to prevent that from happening?

 

I know it sounds confusing, so please let me know if there's anything I can do to clarify my question. Thank you!

 

 

 

1 REPLY
TomMartens
Super User

Hey,

I have to admit that I do not fully understand what you have so far. If possible, share your query by copying it from the Advanced Editor in Power Query.

 

Basically, when I face similar tasks, I would do the following:

Use a Python script as a data source in Power BI. From within this script:

  • scrape the site
  • append the content to an existing file; if the file does not exist, create it
  • read the complete file and present the content as a table to Power BI
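
The three steps above could be sketched roughly as follows. This is only a minimal illustration under several assumptions: `scrape_site`, `append_new_rows`, `load_for_power_bi`, the file path, and the `date`/`value` columns are all hypothetical names, the real scraper is replaced by a stand-in, and each record is assumed to be identified by its date. Because the script always reads the accumulated file rather than the original extract, older days stay in the result even after they drop off the website.

```python
import csv
import os

HISTORY_PATH = "history.csv"   # hypothetical file; an absolute path is safer in Power BI
FIELDS = ["date", "value"]     # hypothetical columns; "date" is assumed to identify a record


def scrape_site():
    """Stand-in for the real scraper; returns the rows currently on the site."""
    return [{"date": "9/27", "value": "example"}, {"date": "9/28", "value": "example"}]


def append_new_rows(path, scraped, key="date"):
    """Append scraped rows not yet in the file (creating it if needed); return all rows."""
    file_is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    existing = []
    if not file_is_new:
        with open(path, newline="", encoding="utf-8") as f:
            existing = list(csv.DictReader(f))
    seen = {row[key] for row in existing}
    new_rows = []
    for row in scraped:
        if row[key] not in seen:   # skip records stored by an earlier refresh
            seen.add(row[key])
            new_rows.append(row)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if file_is_new:
            writer.writeheader()
        writer.writerows(new_rows)
    return existing + new_rows


def load_for_power_bi(path):
    """Read the complete file as a pandas DataFrame, which Power BI can load as a table."""
    import pandas as pd  # imported here so the helpers above also work without pandas
    return pd.read_csv(path)


# In the Power BI script, these two lines would run on every refresh:
# append_new_rows(HISTORY_PATH, scrape_site())
# dataset = load_for_power_bi(HISTORY_PATH)
```

Skipping rows whose date is already stored means the overlapping days between two scrapes are not duplicated. If the site can post several records per day, a different key would be needed.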

Currently, it is somewhat difficult to add data increments to existing tables. This may change with the general availability of dataflows.

 

For now, I try to create the "complete" dataset outside of the Power BI data pipeline and then load the complete file into Power BI.

 

To refresh R- or Python-based data sources in the Power BI service, you can use the on-premises data gateway in personal mode.

 

Hopefully this gives you some ideas.

 

Regards,

Tom

Did I answer your question? Mark my post as a solution, this will help others!

Proud to be a Super User!
I accept Kudos 😉
Hamburg, Germany
