Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi everyone,
I'm having an issue where I'm trying combine some files through using the SharePoint Folder as the source.
In Power BI Power Query, I can locate the files and filter down to the folder they sit in but when clicking combine the "Evaluating Query" window appears, then disappears, and nothing happens.
If I try to combine files in a folder on my locals drives, everything combines fine and I can start transforming the data.
The reason I want to combine the files on from a SharePoint folder is because I may not be the only one in my team adding further files in the future, so the location needs to be accessible to everyone in my team.
I have attempted this with both CSV files and and XLSX files with the same result. The columns are all the same in the each file and the naming conventions of each file are also the same - the only difference being the date in the file name.
Any help would be appreciated.
If you’re encountering issues with File Combining, it could be related to the file type or the file name. Here’s a step-by-step solution that has worked for me in similar situation:
Hopefully following these steps should resolve the problem.
I have the same problem, when getting the files from a hard drive folder, or a synced sharepoint folder (which also routes to a local path) it loads and combines the files just fine, but when I get it from the 'real' sharepoint URL, I get the combine files error. The files are all *.xlsx files with identical layout.
Hello - please give this a try...
Things like that sometimes happen when the files in the folder have different extensions (and files are not always visible). These steps will help you to retrieve the desired files dynamically so that they can be combined successfully.
Add a custom function to retrieve the files from the Sharepoint folder. To do this, add a new blank query and name it fnGetSharepointFilesFromFolder. Go to the Advanced Editor, replace the contents with this script.
let
// ****************************************************************************************************************/
// fnGetSharepointFilesFromFolder
// ****************************************************************************************************************/
// ****************************************************************************************************************/
// START: Documentation
// ----------------------------------------------------------------------------------------------------------------/
// PURPOSE
// - Filter the source file directory to return the applicable folder of files.
//
// REQUIREMENTS
// - The source table must be result of SharePoint.Files().
//
// ----------------------------------------------------------------------------------------------------------------/
// INPUTS
//
// SharepointDocsDirectory (text, case insensitive)
// - There are two options for a Sharepoint sites directory: "Documents" and "Shared Documents"
// - If an invalid value is entered a default of Shared Documents will be applied.
//
// SharepointParentFolders (text, case insensitive)
// - Subfolders between the Sharepoint directory and the file name.
// - Do not include a forward slash character at the beginning or end of the string.
// - Example: MyFolder/MySubfolder
//
// EarliestFileModificationDate (datetime)
// - Only files with a Date Modified that is on or after this date will be included in the evaluation.
// - Example: #datetime(2021, 1, 1, 0, 0, 0)
//
// LatestFileModificationDate (datetime)
// - Only files with a Date Modified that is on or before this date will be included in the evaluation.
// - Example: #datetime(2021, 3, 31, 0, 0, 0)
//
// FirstNFiles (integer, optional)
// - If this is populated, only the first n files that meet the date criteria will be evaluated.
// - Example: 10
// ----------------------------------------------------------------------------------------------------------------/
//
// DEVELOPER
// - Jenn Ratten
// - last revised: 8/17/2021
// ----------------------------------------------------------------------------------------------------------------/
// END: Documentation
// ****************************************************************************************************************/
fn = (
Table as table,
SharepointDocsDirectory as text,
SharepointParentFolders as text,
EarliestFileModificationDate as datetime,
LatestFileModificationDate as datetime,
optional FirstNFiles as number
) =>
let
// ****************************************************************************************************************/
// START: Declare variables
//
varDocsDirectory = if Text.Lower ( SharepointDocsDirectory ) = "documents" then "Documents" else "Shared Documents",
// If the latest date is before the earliest date, use the earliest date.
varMaxDateForBinaries = if LatestFileModificationDate < EarliestFileModificationDate then EarliestFileModificationDate else LatestFileModificationDate,
// Remove the specified character from the start of the string.
ParentFolders_TrimCharFromStart =
if Text.StartsWith ( SharepointParentFolders, "/" )
then Text.TrimStart ( SharepointParentFolders, 1 )
else SharepointParentFolders,
// Remove the specified character from the end of the string.
ParentFolders_TrimCharFromEnd =
if Text.EndsWith ( ParentFolders_TrimCharFromStart, "/" )
then Text.TrimEnd ( ParentFolders_TrimCharFromStart, 1 )
else ParentFolders_TrimCharFromStart,
//Final variable
varParentFolders = ParentFolders_TrimCharFromEnd,
//
// ----------------------------------------------------------------------------------------------------------------
// END: Declare Variables
// ****************************************************************************************************************/
//
Source = Table,
// Add columns for hierarchies in a parent-child manner so the folders may be filtered without risk of the
// results including nested folders of the same name.
add_child_site = Table.AddColumn(Source, "Child Site", each Text.BetweenDelimiters([Folder Path], "/sites/", "/", 0, 0), type text),
// Sharepoint sites can have two directories; Documents and Shared Documents
add_dir = Table.AddColumn(add_child_site, "Directory", each Text.BetweenDelimiters([Folder Path], [Child Site] & "/", "/", 0, 0), type text),
// Count the number path separators that exist between the directory and the file name.
add_folder_delimiter_count = Table.AddColumn(
add_dir,
"Delimiters Btwn Dir and End",
each List.Count ( Text.PositionOfAny ( [Folder Path], {","}, Occurrence.All ) ) + 1,
Int64.Type
),
// Extract the remaining subfolders in the path.
add_parent_folders = Table.AddColumn (
add_folder_delimiter_count,
"Parent Folders",
each Text.BetweenDelimiters(
[Folder Path],
[Directory] & "/",
"/",
0,
[Delimiters Btwn Dir and End]
), type text
),
remove_folder_path = Table.RemoveColumns(add_parent_folders, {"Folder Path"}),
// Keep only files that are in the specified directory with the specified parent folders.
// When selecting rows in a Sharepoint directory in dataflows, the filter must begin
// after the directory in order to avoid the dynamic datasource error message.
filter_files = Table.SelectRows(
remove_folder_path,
each [Attributes]?[Hidden]? <> true
and Text.Contains ( [Directory], varDocsDirectory, Comparer.OrdinalIgnoreCase )
and Text.Contains ( [Parent Folders], varParentFolders, Comparer.OrdinalIgnoreCase )
),
// Filter the files by the Date Modified.
select_files_by_date_modified = Table.SelectRows (
filter_files,
each [Date modified] >= EarliestFileModificationDate and [Date modified] <= varMaxDateForBinaries
),
// Optionally limit the results to the first N files.
//first_n_files = if FirstNFiles + 0 = 0 then filter_files else Table.FirstN ( filter_files, FirstNFiles ),
first_n_files = if FirstNFiles = null then select_files_by_date_modified else Table.FirstN ( select_files_by_date_modified, FirstNFiles ),
// Add a unique ID (integer) for each file.
add_id = Table.AddIndexColumn(first_n_files, "FileId", 1, 1, Int64.Type)
in
add_id
in
fn
Add a new query that retrieves the files from the Sharepoint root site. This returns all files for the parimary site regardless of location within subsites or subfolders. In the sample below, I have stored the root URL in a parameter. This query is named: source_SharepointRootContent.
= SharePoint.Files(paramSharepointRootUrl, [ApiVersion = 15])
Create a new blank query to invoke the custom function to get the files from the folder.
let
Source = fnGetFilesFromSharepointFolder(source_SharepointRootContent, paramSharepointDocsDirectory, paramSharepointParentFolders, paramMinDateTime, paramMaxDateTime)
in
Source
Instead of creating a new blank query, you could also use the custom functions UI to invoke the function.
Now, the intent is to combine the files that have been returned. For example, let's assume all of the files that should be combined are Excel files. Future-proof the query be filtering the list of files to only include those with the desired file extension, but make it dynamic so that different versions of Excel and differences in upper/lowercase will not be excluded. Even though the folder may only contain Excel files at present, if a file of another type is added in the future, the query will fail.
In this list, there are variations and I want to make sure they are all included.
I clicked on the filter button in the Extension column and I am going to filter the values for those that contain .xls (as opposed to equals .xlsx).
This is the script that is generated in the filter step. 11 files are returned.
Table.SelectRows(Source, each Text.Contains([Extension], ".xls"))
The Power Query documentation for Text.Contains shows that the syntax for this function contains an additonal, optional arugment for Comparer.
Text.Contains(text as nullable text, substring as text, optional comparer as nullable function) as nullable logical
I'm going to add that to tell Power Query to compare the text with case insensitivity. Now I have 13 files returned. Note, I could have converted the Extension column to lower case, but doing it this way saves a step.
Table.SelectRows(Source, each Text.Contains([Extension], ".xls", Comparer.OrdinalIgnoreCase))
Use this same process for CSV files.
Table.SelectRows(Source, each Text.Contains([Extension], ".csv", Comparer.OrdinalIgnoreCase))
When combining Excel files, you will also need to make sure the worksheet name is the same in each file.
This is how you can add a column that returns true/false based on whether or not the file contains a sheet of a certain name.
FilterExt = Table.SelectRows(Source, each Text.Contains([Extension], "xls", Comparer.OrdinalIgnoreCase)),
varSheetName = "MyWorksheet",
SelectColumns = Table.SelectColumns(FilterExt,{"Content"}),
TestSheetNames = Table.AddColumn(
SelectColumns,
"SheetNameTestResult",
each
let
FileContent = Excel.Workbook ( [Content] ),
FilteredFileContent = Table.SelectRows (
FileContent,
each [Hidden]? <> true and [Kind] = "Sheet" and [Item]? = varSheetName
)
in not Table.IsEmpty ( FilteredFileContent ),
type logical
)
@Anonymous
Check the post and the tutorial video.
Solved: Combine Files From SharePoint Folder - Microsoft Power BI Community
How to combine data from multiple files from SharePoint folder in Power BI - YouTube
Paul Zheng _ Community Support Team
If this post helps, please Accept it as the solution to help the other members find it more quickly.
Thanks, but this tutorial and video is what I was already doing.
Regardless, I followed this tutorial and when I click the combine button, nothing happens still.
Hello @Anonymous ,
Would you be able to provide the Power Query steps from Advance Editor?
Also, if possible,can you please share the screenshot of what you are trying to do?
This is what is in the Advanced Editor before I click Combine
These are the files I'm trying to combine.
This is intended to to create a dataset which can be refreshed when new files of the same format, same naming convention, same columns within the file, are put into this folder. I will not always be the one dropping the files in here, but also others in my team.