Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Context: my goal is to have two types of Excel files/CSVs:
I got this to happen once by doing the following:
I know I got it to happen because my Ledger originally only had one Test row, and it now additionally has every Current row.
HOWEVER, it will not happen fully again. Here is what's up:
Here is my python code for reference, with some extra notes:
import pandas as pd import numpy as np import datetime log = dataset
def un_id(series):
Returns the Unique ID of a row return ((str(series["Therapist"]) + "_" + str(series["Agency"])+ "_" + str(series["Provider Office"])).replace(" - ", "-").replace(" ", "_"))
Causes the log (Ledger) table and the prov (Current combined) table to both have IDs log.loc[:, "ID"] = log.apply(un_id, 1) prov.loc[:, "ID"] = prov.apply(un_id, 1)
Dates were taken in as strings, changes them to timestamps log.loc[:, "Date"] = log["Date"].map(pd.to_datetime) prov.loc[:, "Date"] = log["Date"].map(pd.to_datetime)
Checks if a row [in prov/Current combined] is in Ledger. Returns the series if it is def not_in_log(series): #If series is not in log, returns series. #Else, returns nothing.
Finds rows in Ledger with the same Unique ID as in prov/Current combined matches = log[log["ID"] == series["ID"]]
If none of these rows has the same date as the prov/Current combined Date, returns that row if series["Date"] not in list(matches["Date"]): return series
Goes through all prov/Current combined rows. Returns the ones that are not in Ledger new_rows = prov.apply(not_in_log, 1) new_rows = new_rows.dropna(how="all") new_rows = pd.DataFrame(new_rows)
Adds new rows to the Ledger log = log.append(new_rows, ignore_index = True)
Makes sure the Ledger has the right column names log = log.loc[:, list(prov.columns)]
Thanks in advance for the help!
Hi @Anonymous
It seems the first goal is achieved with your code.
The second one means to store historic data in files, these historic files should not override.
If so, please refer to a brilliant article: Storing and using information from a dynamic data source using PBI desktop.
Thanks for the pointer!
I could certainly run my python in a script that I time using task scheduler; in this scenario, I wouldn't even need to go through PowerBI for the updates.
However, because of the data hosting/security context I'm in, it is preferable that everything stays in PowerBI (and OneDrive, where the files are stored).
I'm confused because I was able to get the process to happen once (the initial ledger "fill") but it won't happen a second time to udpate.
I've also gotten this message which I don't know how to fix, and I think it has something to do with security (filepath removed for privacy):
DataSource.Error: ADO.NET: Python script error. Traceback (most recent call last): File "PythonScriptWrapper.PY", line 17, in <module> dataset.to_csv(path, mode = 'w', index = None, header = True, quoting = csv.QUOTE_ALL, chunksize = None, decimal= '.') File "C:\Users\kwilliams\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\generic.py", line 3228, in to_csv formatter.save() File "C:\Users\kwilliams\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\formats\csvs.py", line 183, in save compression=self.compression, File "C:\Users\kwilliams\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\common.py", line 399, in _get_handle f = open(path_or_buf, mode, encoding=encoding, newline="") PermissionError: [Errno 13] Permission denied: 'FILEPATH/Provider_Log.csv' Details: DataSourceKind=Python DataSourcePath=Python Message=Python script error. Traceback (most recent call last): File "PythonScriptWrapper.PY", line 17, in <module> dataset.to_csv(path, mode = 'w', index = None, header = True, quoting = csv.QUOTE_ALL, chunksize = None, decimal= '.') File "C:\Users\kwilliams\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\core\generic.py", line 3228, in to_csv formatter.save() File "C:\Users\kwilliams\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\formats\csvs.py", line 183, in save compression=self.compression, File "C:\Users\kwilliams\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\common.py", line 399, in _get_handle f = open(path_or_buf, mode, encoding=encoding, newline="") PermissionError: [Errno 13] Permission denied: FILEPATH/Provider_Log.csv' ErrorCode=-2147467259 ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
99 | |
83 | |
70 | |
60 |
User | Count |
---|---|
150 | |
115 | |
104 | |
89 | |
65 |