cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
New Member

Power Query to import text file having two rows for each field

Hello,

 

I want to import a legacy text file which is having data split into two rows for each field. The headers are repeated for each page.

 

Screenshot of Sample DataSample.jpg

 

 Thank you

2 ACCEPTED SOLUTIONS

Accepted Solutions
Highlighted
Community Champion
Community Champion

Re: Power Query to import text file having two rows for each field

That's quite a challenge.

My suggestion would be to first create a table with the data of each first row and another table with the data of each second row.

Then combine both tables.

 

I created the following example file with part of your data (at estimated positions):

 

 Example text file.png

Assuming you need no data from the headers and assuming input is at fixed positions, I created the following code.
I renamed each step and otherwise I adjusted generated code (or created code using the advance editor) for the steps:
Source (adjusted in order to get only 1 column)

SplitByPosition1 and -2

RemovedTopRows1 and - 2 (remove based on condition)

CombinedAsRecords

 

let
    Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"),
// First the first rows
    SplitByPosition1 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})),
    Trimmed1 = Table.TransformColumns(SplitByPosition1,{},Text.Trim),
    RemovedTopRows1 = Table.Skip(Trimmed1,each _[Column1.1]<>"A/C NO"),
    PromotedHeaders1 = Table.PromoteHeaders(RemovedTopRows1),
    Typed1 = Table.TransformColumnTypes(PromotedHeaders1,{{"A/C NO", type number}, {"Princ Bal", type number}, {"Int Accr", type number}}, "en-US"),
    RemovedErrors1 = Table.RemoveRowsWithErrors(Typed1),
    ResultPart1 = Table.SelectRows(RemovedErrors1, each ([#"A/C NO"] <> null)),
//  Now parse the second rows
    SplitByPosition2 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{18,15},{50,10}})),
    Trimmed2 = Table.TransformColumns(SplitByPosition2,{},Text.Trim),
    RemovedTopRows2 = Table.Skip(Trimmed2,each _[Column1.1]<>"Interest Rat"),
    PromotedHeaders2 = Table.PromoteHeaders(RemovedTopRows2),
    Typed2 = Table.TransformColumnTypes(PromotedHeaders2,{{"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"),
    RemovedErrors2 = Table.RemoveRowsWithErrors(Typed2),
    ResultPart2 = Table.SelectRows(RemovedErrors2, each ([Interest Rat] <> null)),
// Now combine the results
    CombinedAsRecords = Table.FromColumns({Table.ToRecords(ResultPart1),Table.ToRecords(ResultPart2)}),
    Expanded1 = Table.ExpandRecordColumn(CombinedAsRecords, "Column1", {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}, {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}),
    Result = Table.ExpandRecordColumn(Expanded1, "Column2", {"Interest Rat", "Fees Recv"}, {"Interest Rat", "Fees Recv"})
in
    Result

 

This should help you to create the code for your file. If you need further help then please provide a representative sample file.

 

This is the result from my example file:

Example result table.png

 

You may want to add some finishing steps, like reordering columns.

 

Specializing in Power Query Formula Language (M)

View solution in original post

Highlighted
Community Champion
Community Champion

Re: Power Query to import text file having two rows for each field

An alternative solution which is probably much faster is the following in which the source is duplicated in 2 table columns, with the 2nd column shifted 1 row.

 

let
    Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"),
// Add column shifted 1 row
    AddedShiftedColumn = Table.FromColumns({Source[Column1],List.Skip(Source[Column1],1)}),
    SplitByPosition1 = Table.SplitColumn(AddedShiftedColumn,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})),
    SplitByPosition2 = Table.SplitColumn(SplitByPosition1,"Column2",Splitter.SplitTextByRanges({{18,15},{50,10}})),
    Trimmed = Table.TransformColumns(SplitByPosition2,{},Text.Trim),
    RemovedTopRows = Table.Skip(Trimmed,each _[Column1.1] <> "A/C NO"),
    PromotedHeaders = Table.PromoteHeaders(RemovedTopRows),
    Typed = Table.TransformColumnTypes(PromotedHeaders,{{"A/C NO", Int64.Type}, {"Princ Bal", type number}, {"Int Accr", type number}, {"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"),
    RemovedErrors = Table.RemoveRowsWithErrors(Typed),
    Filtered = Table.SelectRows(RemovedErrors, each ([#"A/C NO"] <> null))
in
    Filtered

 

Specializing in Power Query Formula Language (M)

View solution in original post

4 REPLIES 4
Highlighted
Community Champion
Community Champion

Re: Power Query to import text file having two rows for each field

That's quite a challenge.

My suggestion would be to first create a table with the data of each first row and another table with the data of each second row.

Then combine both tables.

 

I created the following example file with part of your data (at estimated positions):

 

 Example text file.png

Assuming you need no data from the headers and assuming input is at fixed positions, I created the following code.
I renamed each step and otherwise I adjusted generated code (or created code using the advance editor) for the steps:
Source (adjusted in order to get only 1 column)

SplitByPosition1 and -2

RemovedTopRows1 and - 2 (remove based on condition)

CombinedAsRecords

 

let
    Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"),
// First the first rows
    SplitByPosition1 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})),
    Trimmed1 = Table.TransformColumns(SplitByPosition1,{},Text.Trim),
    RemovedTopRows1 = Table.Skip(Trimmed1,each _[Column1.1]<>"A/C NO"),
    PromotedHeaders1 = Table.PromoteHeaders(RemovedTopRows1),
    Typed1 = Table.TransformColumnTypes(PromotedHeaders1,{{"A/C NO", type number}, {"Princ Bal", type number}, {"Int Accr", type number}}, "en-US"),
    RemovedErrors1 = Table.RemoveRowsWithErrors(Typed1),
    ResultPart1 = Table.SelectRows(RemovedErrors1, each ([#"A/C NO"] <> null)),
//  Now parse the second rows
    SplitByPosition2 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{18,15},{50,10}})),
    Trimmed2 = Table.TransformColumns(SplitByPosition2,{},Text.Trim),
    RemovedTopRows2 = Table.Skip(Trimmed2,each _[Column1.1]<>"Interest Rat"),
    PromotedHeaders2 = Table.PromoteHeaders(RemovedTopRows2),
    Typed2 = Table.TransformColumnTypes(PromotedHeaders2,{{"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"),
    RemovedErrors2 = Table.RemoveRowsWithErrors(Typed2),
    ResultPart2 = Table.SelectRows(RemovedErrors2, each ([Interest Rat] <> null)),
// Now combine the results
    CombinedAsRecords = Table.FromColumns({Table.ToRecords(ResultPart1),Table.ToRecords(ResultPart2)}),
    Expanded1 = Table.ExpandRecordColumn(CombinedAsRecords, "Column1", {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}, {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}),
    Result = Table.ExpandRecordColumn(Expanded1, "Column2", {"Interest Rat", "Fees Recv"}, {"Interest Rat", "Fees Recv"})
in
    Result

 

This should help you to create the code for your file. If you need further help then please provide a representative sample file.

 

This is the result from my example file:

Example result table.png

 

You may want to add some finishing steps, like reordering columns.

 

Specializing in Power Query Formula Language (M)

View solution in original post

Highlighted
Community Champion
Community Champion

Re: Power Query to import text file having two rows for each field

Note this topic is cross posted over here.

Specializing in Power Query Formula Language (M)
Highlighted
Community Champion
Community Champion

Re: Power Query to import text file having two rows for each field

An alternative solution which is probably much faster is the following in which the source is duplicated in 2 table columns, with the 2nd column shifted 1 row.

 

let
    Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"),
// Add column shifted 1 row
    AddedShiftedColumn = Table.FromColumns({Source[Column1],List.Skip(Source[Column1],1)}),
    SplitByPosition1 = Table.SplitColumn(AddedShiftedColumn,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})),
    SplitByPosition2 = Table.SplitColumn(SplitByPosition1,"Column2",Splitter.SplitTextByRanges({{18,15},{50,10}})),
    Trimmed = Table.TransformColumns(SplitByPosition2,{},Text.Trim),
    RemovedTopRows = Table.Skip(Trimmed,each _[Column1.1] <> "A/C NO"),
    PromotedHeaders = Table.PromoteHeaders(RemovedTopRows),
    Typed = Table.TransformColumnTypes(PromotedHeaders,{{"A/C NO", Int64.Type}, {"Princ Bal", type number}, {"Int Accr", type number}, {"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"),
    RemovedErrors = Table.RemoveRowsWithErrors(Typed),
    Filtered = Table.SelectRows(RemovedErrors, each ([#"A/C NO"] <> null))
in
    Filtered

 

Specializing in Power Query Formula Language (M)

View solution in original post

Highlighted
New Member

Re: Power Query to import text file having two rows for each field

Thank you so much Sir.

Helpful resources

Announcements
Meet the 2020 Season 2 Power BI Super Users!

Meet the 2020 Season 2 Power BI Super Users!

Find out who's part of the program this season, and welcome the new Super Users.

August 2020 Community Challenge: Can You Solve These?

August 2020 Community Challenge: Can You Solve These?

We're excited to announce our first cross-community 'Can You Solve These?' challenge!

July 2020 Community Highlights

July 2020 Community Highlights

Learn about the exciting things that happened in July.

Featured Data Story of The Month

Featured Data Story of The Month

All Data Stories Gallery contributions are reviewed for each month. We select a contribution and feature the community member the following month.

Power BI Dev Camp - Developing with .NET Core

Power BI Dev Camp - Developing with .NET Core

Learn how to develop custom web applications for Power BI using .NET Core 3.1 and .NET 5.

Top Solution Authors
Top Kudoed Authors