Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
CiceroBC
Advocate I
Advocate I

Add row numbers to each text file from a folder

I'm looking for a way to inject a row number when using the folder source optoin in power query (in Excel). I have a large amount of text files with unique and different file extensions based on the data in each (even though they are actually all text files). These different files have value that are specific to the row number in each file type based on a value in the first row of each file.

 

So for example, if file 123.xyz has the word "MONTH" in the first row of text, then I know that the key value is in row 46. If the first row has the word "YEAR" in the first row, then I know that the key value is in row 4. However, if the file is 123.abc, I need to look for the word in row 12, which sends me to different rows for key values, etc...

 

To build out the logic, I need to know the row number for every file. Essentially, I am looking for an index counter that is nested to start over every time new filename is in the combined output.

 

I've seen multiple solutions using DAX, but as this is Power Query in Excel (before you ask...yes, it has to be Excel) I really need a solution in M.

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @CiceroBC ,

 

this part is reasonably simple:

1. Import files from a folder (a case may be from an online folder):

PBI.png

2. Filter, if necessary, and then click on the button in the "Content" header:

PBI.png

3. Confirm Ok on the next step. PBI will generate some code to import your data in both the current query and in a separate group. The group will look something like this:

PBI.png

4. Select the Transform Sample File query and add an index column:

 

PBI.png

Once it is added in the template query, PBI will update a function linked to this query and once you come back to your original query you should see something like this:

 

PBI.png

 

Hope this helps :).

 

This is the code generated by PBI:

Main query:

let
    Source = Folder.Files("D:\"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Name] = "123.abc" or [Name] = "123.xyz")),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (2)", each #"Transform File (2)"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (2)"})
in #"Removed Other Columns1"

 

And the transform function called from the main query (in my case the function is called Transform File (2)😞

let
    Source = (Parameter2 as binary) => let
        Source = Table.FromColumns({Lines.FromBinary(Parameter2, null, null, 1252)}),
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1)
    in
        #"Added Index"
in
    Source

 

Kind regards,

JB

 

 

 

View solution in original post

6 REPLIES 6
v-frfei-msft
Community Support
Community Support

Hi @CiceroBC ,

 

Group by should help you.

grou.PNG

 

Then we can duplicate the table and merge.

 

22.PNG

 

If it doesn't meet your requirement, kindly share your sample data and excepted result to me if you don't have any Confidential Information. Please upload your files to One Drive and share the link here.

 

Community Support Team _ Frank
If this post helps, then please consider Accept it as the solution to help the others find it more quickly.

Hi @v-frfei-msft,

 

Please accept my apologies for not providing enough specifics. Getting a row count might be helpful in some future steps, but my objective right now is to have a running index of each row in each file - which I'll call a "File Row Index". I'm afraid I can't provide any source files, as it is company proprietary information, but I've tried to approximate what I have in mind below:

  

Source.NameColumn1 (file content)File Row Index
123.xyzThis is a MONTH report (I'll be using the first line to identify the file)1
123.xyzextra report information from the program that makes these reports2
123.xyzother report information3
123.xyzData I need from this row of the file4
123.xyzstill some other report information5
123.abcThis is a YEAR report (FYI there are dozens of these variations)1
123.abcsome other report information I don't need2
123.abcData I need from this row of this file3
123.abcSometimes I need data from multiple rows depending on the first row4

 

Does that help clarify what I am looking for?

Thanks!

Anonymous
Not applicable

Hi @CiceroBC ,

 

this part is reasonably simple:

1. Import files from a folder (a case may be from an online folder):

PBI.png

2. Filter, if necessary, and then click on the button in the "Content" header:

PBI.png

3. Confirm Ok on the next step. PBI will generate some code to import your data in both the current query and in a separate group. The group will look something like this:

PBI.png

4. Select the Transform Sample File query and add an index column:

 

PBI.png

Once it is added in the template query, PBI will update a function linked to this query and once you come back to your original query you should see something like this:

 

PBI.png

 

Hope this helps :).

 

This is the code generated by PBI:

Main query:

let
    Source = Folder.Files("D:\"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Name] = "123.abc" or [Name] = "123.xyz")),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (2)", each #"Transform File (2)"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (2)"})
in #"Removed Other Columns1"

 

And the transform function called from the main query (in my case the function is called Transform File (2)😞

let
    Source = (Parameter2 as binary) => let
        Source = Table.FromColumns({Lines.FromBinary(Parameter2, null, null, 1252)}),
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1)
    in
        #"Added Index"
in
    Source

 

Kind regards,

JB

 

 

 

Hi @Anonymous ,

 

That's exactly what I was looking for!  It hadn't occured to me to add the index in the sample file level.

 

Thanks very much!

Anonymous
Not applicable

Hi @CiceroBC ,

 

do you have a reference table similar to:

FileNameSearchInRowValueKeyInRow
123.xyz1MONTH46
123.xyz
1YEAR4
123.abc
12??????

 

And obviously the text files can be parsed using CSV.Document to rows in a way that makes sense?

 

Thanks,

JB

Hi @Anonymous,

 

I haven't built out a reference table for the logic as of yet. This issue is a step towards that end.

 

Concerning the CSV.Document command, I'm using the folder as a source because this tool will be to extract values from a few thousand reports in a folder.  To better clarify, here is the M I'm using so far to view the file contents (please note that "Search Folder" is a parameter so this can be tested with a smaller folder of files first).

 

let
Source = Folder.Files(SearchFolder),
#"Filtered Rows" = Table.SelectRows(Source, each ([Extension] <> ".xlsm" and [Extension] <> ".xlsx")),
#"Filtered Hidden Files1" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Report File", each #"Report File"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Report File"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Report File", Table.ColumnNames(#"Report File"(#"Sample File"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type text}})
in
#"Changed Type"

 

The end result is two columns of "Source.Name" (the file name) and "Column1" the content of the file, stacked on top of each other. My objective is to establish an row index that numbers each row, and starts over every time there is a new file. Once that is done, I can begin my reference table based on the logic that I've documented so far.

 

Thanks!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors