Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

v-henryk-mstf

Add date to the new records without changing previous date

Background:
The source data does not contain date information, so you need to add date information after importing the data. The data is refreshed at a fixed time every day. When the data is refreshed early, the time of the day is written to the input information. And the date of the previous data does not change.


An example:
The original data has two columns, product and sales, after importing the data, you need to add a column named upload date.
On July 19, 2022, the data table has the following columns.

product

sales

upload date

sku001

24

7/18/2022

sku002

20

7/19/2022

 

On July 21, 2022, the data should be refreshed to look like this.

product

sales

upload date

sku001

24

7/18/2022

sku002

20

7/19/2022

sku002

29

7/20/2022

sku001

27

7/21/2022

 

The principle of implementation:
Use pandas, a package in python, to add a date to a file for saving, and then import the date data from the file into the power query editor when it is refreshed next time. And use the m function of power query to get the refresh date at that time, and then update it to the date file.

vhenrykmstf_1-1659434205603.png

 

 

How to implement:
Step 1. Import data and change the type of all fields into text format. Here I use excel as the data source.

vhenrykmstf_2-1659434243403.png


Step 2. Prepare the python environment. The runtime environment must have the packages pandas and matplotlib installed, otherwise it won't work. Please run the following code in PowerShell or command.

pip install pandas
pip install matplotlib

vhenrykmstf_3-1659434273664.png

 

Step 3. Read the date file. Use the following code.

 

 

 

try:
    df = pandas.read_csv('D:\\DesktopFile\\\documents\\Test_Original_Data.csv')['upload date']
except:
    pandas.DataFrame([''],columns = ['upload date']).to_csv('D:\\DesktopFile\\\documents\\\Test_Original_Data.csv')
    df = pandas.read_csv('D:\\DesktopFile\\\documents\\\Test_Original_Data.csv')['upload date']
mergedataset= pandas.concat([dataset,df],axis = 1)

 

 

 

 

vhenrykmstf_4-1659434320283.png

(the red parameters are the date file path)

 

Step 4. Create a new custom column. The name of this custom column needs to be the same as the date file column name, case sensitive. Use the following m function to create a custom column.

 

 

 

=if 
Text.Length([upload date last])<8
Then DateTime.Date(DateTime.LocalNow())
else [upload date last]

 

 

 

 

vhenrykmstf_5-1659434372517.png

 

Step 5. Add the new result to the date file. Before doing so, please remember to convert the date column format into text format.

 

 

 

dataset['upload date'].to_csv('D:\\DesktopFile\\\documents\\\Test_Original_Data.csv')

 

 

 

vhenrykmstf_6-1659434414347.png

 

Final result

vhenrykmstf_7-1659434438390.png

 

Summary
The above script can automatically record the date when data is added to the dataset. We may get the same result if we use the power bi service, however, it is much more advisable to install the standard gateway than a personal gateway.
Example pbix file in the end which you can refer.

 

vhenrykmstf_8-1659434494267.png

 

Author: Chenwu Zhu

Reviewer: Kerry Wang & Ula Huang