Solved: Import of TXT files > columns

Willborn · ‎01-29-2016

Hi there!

I'm currently testing Power BI and struggling with the following task. Did not found a solution in the web so far...:

We have an interface delivering .txt files with blank spaces according to the origin field lenght. When I try to import this file (.txt or auto, UTF-8), Power BI Desktop removes the empty text and due to the mixed content I'm not able to get it transformed into the right columns.

An Example:

Text 1 000122 122.22.00 Restic Shipping (Rotterdam) Ltd 122.50 Max Muster

Text 2 000134 122.89.00 Norabosk Gas Terminal NV 50.20 John Doe

Text 3 000458 122.88.01 Tenkay Handelsgesellschaft m.b.H. 1250.25 Jane Smith

Ends up in PBI as:

Text 1 000122 122.22.00 Restic Shipping (Rotterdam) Ltd 122.50 Max Muster

Text 2 000134 122.89.00 Norabosk Gas Terminal NV 50.20 John Doe

Text 3 000458 122.88.01 Tenkay Handelsgesellschaft m.b.H. 1250.25 Jane Smith

With above import, I' cannot split the tex using number of characters, what actually would be the best way as I know the lenght of the exported fields, and the conten may vary.

Does somone can help me? Thanks!!!

Greg_Deckler · ‎01-29-2016

@Willborn - I couldn't get my first thought of columns in source to work for some weird reason. However, I was able to split the column on 33 and make it look right:

let
    Source = Csv.Document(File.Contents("C:\temp\powerbi\Willborn\text2.txt"),null,{0,5,11,22,36,79},null,1252),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}),
    #"Split Column by Position" = Table.SplitColumn(#"Changed Type","Column5",Splitter.SplitTextByRepeatedLengths(33),{"Column5.1", "Column5.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Position",{{"Column5.1", type text}, {"Column5.2", type text}})
in
    #"Changed Type1"

So I split it on Position 33 essentially after the initial source step. So the UI way to do it is to edit the query, go to Transform tab, choose column, choose Split Column from Text Column area and in drop down choose "By Position". Then you can enter the number of characters, 33 and you should be good to go.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

Greg_Deckler · ‎01-29-2016

@Willborn - Are you using "Text" as the data source versus CSV? I took your data and edited it to make sure that in the txt file that everything lined up in columns. Results are below using Text as the data source type.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Willborn · ‎01-29-2016

Thanks - that shall work... I'll try...

BTW: Just got the information from our developer, that the currently used tool is importing the text file and splits the columns according to the fixed amount of characters (11;11;14;33;10;20...).

Willborn · ‎01-29-2016

Smoupre, thanks for that, finally got it!

Have a nice weekend!

Greg_Deckler · ‎01-29-2016

@Willborn - I couldn't get my first thought of columns in source to work for some weird reason. However, I was able to split the column on 33 and make it look right:

let
    Source = Csv.Document(File.Contents("C:\temp\powerbi\Willborn\text2.txt"),null,{0,5,11,22,36,79},null,1252),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}),
    #"Split Column by Position" = Table.SplitColumn(#"Changed Type","Column5",Splitter.SplitTextByRepeatedLengths(33),{"Column5.1", "Column5.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Position",{{"Column5.1", type text}, {"Column5.2", type text}})
in
    #"Changed Type1"

So I split it on Position 33 essentially after the initial source step. So the UI way to do it is to edit the query, go to Transform tab, choose column, choose Split Column from Text Column area and in drop down choose "By Position". Then you can enter the number of characters, 33 and you should be good to go.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Willborn · ‎01-29-2016

Hi Smoupre

Thanks! Guess I found the cause, but still have no solution. The above file was en extract of the first lines. I've loaded this and got the same results as you have, but:

The original file has some lines where the 33-characters original field was fully filled, so there is no space between this column and the next one. In the sample text file, the amount column isn't separated.

Sorry - can't give the original file as this is confidential.

Willborn · ‎01-29-2016

Herer the sample file as code:

Text 1     000122     122.22.00     Restic Shipping (Rotterdam) Ltd  122.50    Max Muster
Text 2     000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 3     000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 4     000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 5     000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 6     000458     122.88.01     Tenkay Handelsgesellschaft m.b.H.1250.25   Jane Smith
Text 7     000458     122.88.01     Tenkay Handelsgesellschaft m.b.H.1250.25   Jane Smith
Text 8     000458     122.88.01     Tenkay Handelsgesellschaft m.b.H.1250.25   Jane Smith
Text 9     000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 10    000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 11    000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 12    000134     122.89.00     Norabosk Gas Terminal NV         50.20     John Doe
Text 13    000122     122.22.00     Restic Shipping (Rotterdam) Ltd  122.50    Max Muster
Text 14    000122     122.22.00     Restic Shipping (Rotterdam) Ltd  122.50    Max Muster
Text 15    000122     122.22.00     Restic Shipping (Rotterdam) Ltd  122.50    Max Muster
Text 16    000122     122.22.00     Restic Shipping (Rotterdam) Ltd  122.50    Max Muster

Greg_Deckler · ‎01-29-2016

@Willborn - I've got your back on this one! Edit your query and go to View | Advanced Editor, you should have a query that looks like:

let
    Source = Csv.Document(File.Contents("C:\temp\powerbi\Willborn\text2.txt"),null,{0,5,11,22,36,79},null,1252),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}})
in
    #"Changed Type"

In the Source line, you can edit the column breaks.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

jagostinhoCT · ‎02-28-2018

I am having a similar problem.

@Greg_Deckler How can I tell in the code where to change the column breaks?

Thanks

J

Greg_Deckler · ‎02-28-2018

Hopefully this answers the question, I have highlighted where you would make the change:

let
    Source = Csv.Document(File.Contents("C:\temp\powerbi\Willborn\text2.txt"),null,{0,5,11,22,36,79},null,1252),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}, {"Column3", Int64.Type}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}),
    #"Split Column by Position" = Table.SplitColumn(#"Changed Type","Column5",Splitter.SplitTextByRepeatedLengths(33),{"Column5.1", "Column5.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Position",{{"Column5.1", type text}, {"Column5.2", type text}})
in
    #"Changed Type1"

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

jagostinhoCT · ‎03-01-2018

How to split it at a character (backlash, space, colon...)?

Greg_Deckler · ‎03-01-2018

@jagostinhoCT- Are you trying to split a single column or while you are importing text?

For the latter see red bold below

let
    Source = Csv.Document(File.Contents("C:\temp\powerbi\test.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Column1", type text}, {"Column2", Int64.Type}})
in
    #"Changed Type"

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

jagostinhoCT · ‎03-01-2018

thanks @Greg_Deckler

I think the way I included the file contents does not allow me to split it that way, actually.

I am adding the file's content via the use of a new column so that I can keep both files metadata (I want to be able to use the Files Modified Date as data)

Here is the code I have on my query

let
    Source = SharePoint.Files("https://corporateSPsite", [ApiVersion = 15]),
    #"Filtered Rows - AuditsLog Folder" = Table.SelectRows(Source, each Text.StartsWith([Folder Path], "https://https://corporateSPsite/Data Sources/Audits/")),
    #"Filtered Rows - ModelMetrics2-4" = Table.SelectRows(#"Filtered Rows - AuditsLog Folder", each ([Name] = "ModelMetrics2-4.txt")),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows - ModelMetrics2-4", "TXTContent", each Csv.Document([Content])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Content"}),
    #"Expanded TXTContent" = Table.ExpandTableColumn(#"Removed Columns", "TXTContent", {"Column1"}, {"Column1"})
in
    #"Expanded TXTContent"

Greg_Deckler · ‎03-01-2018

OK, if you have a column that you want to split, then that would be either:

    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "LEVEL2OWNER", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"LEVEL2OWNER.1", "LEVEL2OWNER.2"}),

or

    #"Split Column by Position" = Table.SplitColumn(#"Changed Type", "LEVEL2OWNER", Splitter.SplitTextByRepeatedLengths(10), {"LEVEL2OWNER.1", "LEVEL2OWNER.2"}),

or

    #"Split Column by Position" = Table.SplitColumn(#"Changed Type", "LEVEL2OWNER", Splitter.SplitTextByPositions({0, 10}, false), {"LEVEL2OWNER.1", "LEVEL2OWNER.2"}),

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Import of TXT files > columns

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024