Refresh in service fails when changing columns in ...

rrita · ‎02-08-2023

I'm reading a csv file which is saved in a Sharepoint folder.
The csv file is made of a variable number of columns. 9 columns always exist and have always the same names. In addition, depending on the data I want to analyze, a certain number of custom columns (at least 1, often 10-20) exist in the file. Their number and headers vary depending on the dataset.

I have implemented a solution that works smoothly in Power BI desktop. I am able to read all columns of the csv, promote headers, and reference the custom columns in a dynamic way without explicitly writing their names in all the steps I need to do next (change type, reorder, additional columns).

Here is the code for the file reading:

Source = SharePoint.Files("https://cellply.sharepoint.com/", [ApiVersion = 15]),
Filter = Table.SelectRows(Source, each([Folder Path] = "myFolderPath")),
SelectCSV = Table.First(Table.SelectRows(Filter, each([Extension]=".csv"))), //get first file with extension .csv in "myFolderPath"
GetData = SelectCSV[Content],
CSV = Csv.Document(GetData, [Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]), //I don't specify the number of columns to read, so I read all the columns existing in the file
#"Promoted Headers" = Table.PromoteHeaders(CSV, [PromoteAllScalars=true])

This code imports all the columns of my csv file, regardless of how many they are and how they are named. This query works with no problems on Power BI Desktop, but when I publish the report to the service, the refresh fails every time I change the dataset with one with a different number of columns or with the same number but at least one different name in the columns. The refresh error I get is: The 'Custom column 1' column does not exist in the rowset. Table: Table1.

I have already tried several workaround that I list below, but none worked.

1) I specify to always read 9 columns and collect the extracolumns in a 10th column, that I then split using the comma delimiter since I know columns are comma-separated:

CSV = Csv.Document(GetData, 9, ",", ExtraValues.List, 1252),
#"Extracted Values" = Table.TransformColumns(CSV, {"Column10", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
dyn_names_count = List.Count(Text.Split(Table.Column(#"Extracted Values", "Column10"){0}, ",")),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Column10", Splitter.SplitTextByDelimiter(",", QuoteStyle.None), dyn_names_count)

2) Following the workaround #1 I read the csv again from the folder after I compute the number of custom columns:

CSV2 = Csv.Document(GetData, 9 + dyn_names_count, ",", ExtraValues.Ignore, 1252)

3) Finally, since I can also know dynamically the costom columns names, I used the explicit names to read the csv again.

dyn_names = Text.Split(Table.Column(#"Extracted Values", "Column10"){0}, ","),
static_names = List.FirstN(Table.ColumnNames(Table.PromoteHeaders(CSV, [PromoteAllScalars=true])),9),
names_complete = List.Combine({static_names, dyn_names}),
CSV2 = Csv.Document(GetData, [Columns = names_complete, Delimiter = ",", Encoding = 1252, QuoteStyle=QuoteStyle.None])

All these 3 workaround work correctly on Desktop but give the refresh error on service.

Some forum I read suggested to clear the cache on the desktop app, I did it and republished the report, but nothing changed. My Power BI service uses a capacity that has cache disabled by default.

Any help is appreciated, thanks!

rrita · ‎02-13-2023

The problem I have is very similar to what described here: https://community.powerbi.com/t5/Community-Blog/Refresh-error-The-xxxx-column-does-not-exist-in-the-...

I am wondering if there is a way to update the column structure (number of columns and their names) without republishing the report from the desktop to the service.
As I described in the posts above, I have no static references to the columns, and I am able to read the csv for exactly the columns it features every time, so I don't understand why the data structure is sticked to what I published to the service the first time.

rrita · ‎02-09-2023

The refresh also fails even if I do not promote headers and leave column names as "Column1", "Column2", ...

If I use a dataset with N columns ("Column1", ..., "ColumnN") in Power BI desktop and then publish the report to the service, then if I refresh using a dataset with M columns ("Column1", ..., "ColumnM") it fails because it doesn't find a column called "ColumnX" (X depend on the difference between N and M).

GilbertQ · ‎02-08-2023

Hi @rrita

I personally have not worked with variable columns in Power Query. What I do know is that often what happens is that it runs a top 1000 before brining the data in when using Power BI desktop. But in the Power BI Service it does not do this, so that is why in the PBI Service it is expecting a certain column name and it cannot find it so it fails.

I would suggest trying to find a way to always have the same columns or another option is to demote the column names at the first step, so that the column name is not hard-coded.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Power BI Blog

Refresh in service fails when changing columns in the data source, but works on desktop

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024