Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
krvel
Frequent Visitor

ETL from a NO Standard Report to standard Table

Hi...

Well... I have one Report of Sales, and this report have Headers on each page, and footers on the end of the report...

The schema of the report is basicaly this:

 

============== Headers in the top of Page =====================

=====================================================

=====================================================

ColumName1  ColumName2  ColumName3   ColumName4 ColumName5......

SalesMan Name

DataofReport separated columsn

DataofReport separated columsn

DataofReport separated columsn

DataofReport separated columsn

DataofReport separated columsn

Subtotal Line

Rest of Document

 

 

I need to extract only the BOLD data from this report and normalize in one table like that:

SalesMan Name   ColumName1  ColumName2  ColumName3   ColumName4 ColumName5......

nameofguy             value                value                  value                  value               value

nameofguy             value                value                  value                  value               value

nameofguy2           value                value                  value                  value               value

nameofguy2           value                value                  value                  value               value

nameofguy2           value                value                  value                  value               value

nameofguy3           value                value                  value                  value               value

nameofguy3           value                value                  value                  value               value

nameofguy4           value                value                  value                  value               value

 

I have some dificulties, because in the report some Salesman, have low products items, then in some pages I have 3 or 4 Salesman with your product data, values, and subtotal on end... And in some pages one Salesman have 2 or more Pages of Items Sales, and in the middle of product data, break of page with headers...

Below its a little demo of part of the report in excel. The Red Box is the part of the report that I need to extract to one table, or to another sheet...

 

The main ReportThe main Report

Below one preview of the output that I need from the main Report above...

 

How i Need the Extraction of DataHow i Need the Extraction of Data

Some Ideas of how I make transformation or extraction using Power BI Desktop ???

Thanks

Claudinei Mendes

 

1 ACCEPTED SOLUTION

With any luck, this will be pretty straightforward:

1) Add a column with this formula:

if Text.Contains([Column1], "-") = true then [Column1] else null

2) Check this column -> FillDown: This should return a column with all Vendornames.

 

3) Check column4 -> Transform Format to Decimal number -> remove errors -> Filter on figures >0

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

7 REPLIES 7
krvel
Frequent Visitor

Hi @ImkeF .. I tried your suggestion... and i can get the VendorNames on a Column...

I removed the null fields, and the Company name header that have "-" separating the names of the subsidiaries and get this:

 

000015-AMANDA
000020-GUSTAVO
000028-ADM SOLUTION
100006-ADRIANA
100016-RIBEIRO
100017-LEAL JUNIOR
100025-DAYANA
100031-GORETH
100062-ELIZABETE
100096-JUCILENE
100107-LOURIVAL
100148-JOAO VITOR
100187-GABRIELA
100211-ARNALDO
100226-LEIANE
100244-FELISMINO
100269-ALIADNA
100274-ERICA
100276-WILLAMES
200176-MARA
300265-VIANA
400038-PEDRO
400041-ERIVAN
500007-AGAMENON
500055-DANIEL
500057-JALLYSON
500062-ANDRE
500068-ISABEL
500075-EDGAR
500077-ENALDO

 

The drill Down result is here... But now... how I will make to join the VendorName... with the products that they are sell ??

 

Really its a strange data input... I think for people who will analyze the printed report, make some sense... Because when the papers in hand, the guy see:

Vendorname
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AnotherVendorName
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

I think that in this way, for some guy that are analyzing the paper, make some sense, because the data come with a subtotal of items of each Vendorname...
But for PowerBI use, its to easy if the system developer had done something like this:

VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

Like this, in PowerBI, my life it's too more easy.. Lol.

If anyone else can take a look on the complete file, to see if have other approach, I will share in a link the file, with 48 hours expiring...

https://expirebox.com/download/965c3359b1eeecd05041230ccdb60411.html

Thanks again

This query gives me the correct results:

 

let
    Source = Excel.CurrentWorkbook(){[Name="Tabelle1"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Custom", each if Text.Contains([Spalte1], "-") = true then [Spalte1] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Filtered Rows1" = Table.SelectRows(#"Filled Down", each ([Spalte4] <> null)),
    #"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows1", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Faturamento", type number}}),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type", {"Faturamento"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Errors", each not Text.Contains([#"Pos."], "Total"))
in
    #"Filtered Rows"

 

Replace the "Spalte" by "Column" or however this is called in your local language.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Please let me know if I shall upload the file with ths solution somewhere.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

krvel
Frequent Visitor


@ImkeF wrote:

Please let me know if I shall upload the file with ths solution somewhere.


Hi @ImkeF... I have tryed your answer. However, did not look well as I would like. But following the tips and information you gave, in the past weekend I have "broke my mind" trying to come up with a solution, and finally I managed to get what I wanted.

 

Below its a print of screen, showing the final table... And below the image, I put the commands that are generated by Advanced Editor .

 

Query after applied commandsQuery after applied commands

let
    Origem = Excel.Workbook(File.Contents("C:\BI\power\Relatorio.xls"), null, true),
    Página1 = Origem{[Name="Página1"]}[Data],
    #"Tipo Alterado" = Table.TransformColumnTypes(Página1,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}, {"Column14", type text}, {"Column15", type text}, {"Column16", type text}, {"Column17", Int64.Type}, {"Column18", type text}, {"Column19", type text}, {"Column20", type text}, {"Column21", type text}}),
    #"Linhas Filtradas" = Table.SelectRows(#"Tipo Alterado", each [Column1] <> "MARKO SOLUTION-KIOTEC-CON"),
    #"Linhas Filtradas1" = Table.SelectRows(#"Linhas Filtradas", each [Column1] <> "Filtro:"),
    #"Linhas Filtradas2" = Table.SelectRows(#"Linhas Filtradas1", each [Column1] <> null),
    #"Linhas Filtradas3" = Table.SelectRows(#"Linhas Filtradas2", each [Column1] <> "Vendedor"),
    #"Linhas Filtradas4" = Table.SelectRows(#"Linhas Filtradas3", each [Column1] <> "Milênio 11.0 Build 13795"),
    #"Cabeçalhos Promovidos" = Table.PromoteHeaders(#"Linhas Filtradas4", [PromoteAllScalars=true]),
    #"Linhas Filtradas5" = Table.SelectRows(#"Cabeçalhos Promovidos", each [#"Pos."] <> "Pos."),
    #"Personalizado Adicionado" = Table.AddColumn(#"Linhas Filtradas5", "Vendedor", each if Text.Contains([Column1], "-") = true then [Column1] else null),
    #"Colunas Removidas" = Table.RemoveColumns(#"Personalizado Adicionado",{"Vendedor"}),
    #"Colunas com Nome Mudado" = Table.RenameColumns(#"Colunas Removidas",{{"Pos.", "Column1"}}),
    #"Personalizado Adicionado1" = Table.AddColumn(#"Colunas com Nome Mudado", "Vendedor", each if Text.Contains([Column1], "-") = true then [Column1] else null),
    #"Preenchido para Baixo" = Table.FillDown(#"Personalizado Adicionado1",{"Vendedor"}),
    #"Linhas Filtradas6" = Table.SelectRows(#"Preenchido para Baixo", each [Agrupamento] <> null),
    #"Colunas Removidas1" = Table.RemoveColumns(#"Linhas Filtradas6",{"Column1", "Column2", "Column5", "Column7", "Comissão", "Total", "Acum.", "Column15", "Column17", "Acum._1", "Column19", "Par. tot.", "Column21"})
in
    #"Colunas Removidas1"

The Columns names and tags are in Brazilian Portuguese, but its just translate... and see how the procedure are make...

 

Well, my english it's a little "suffering" Smiley LOL... But I think that you and others can understand.

 



I want to leave here my thanks to you @ImkeF, for the tips, so that I could arrive at final result...

Claudinei Mendes

krvel
Frequent Visitor

Hi... I trying to reply my another post about this weird report... but when I post.. the message disappear... Stay only my post and 2 replies... I have trying to post on my another post since yesterday, but without sucess...

Then my only alternative its to create a new post, to see if I get some help... Sorry for the double post... but I need some help with this report...

Well... below its the reply message that I'm trying to post but without sucess...

 

 

 

 

 

Hi @ImkeF .. I tried your suggestion... and i can get the VendorNames on a Column...

I removed the null fields, and the Company name header that have "-" separating the names of the subsidiaries and get this:

 

000015-AMANDA
000020-GUSTAVO
000028-ADM SOLUTION
100006-ADRIANA
100016-RIBEIRO
100017-LEAL JUNIOR
100025-DAYANA
100031-GORETH
100062-ELIZABETE
100096-JUCILENE
100107-LOURIVAL
100148-JOAO VITOR
100187-GABRIELA
100211-ARNALDO
100226-LEIANE
100244-FELISMINO
100269-ALIADNA
100274-ERICA
100276-WILLAMES
200176-MARA
300265-VIANA
400038-PEDRO
400041-ERIVAN
500007-AGAMENON
500055-DANIEL
500057-JALLYSON
500062-ANDRE
500068-ISABEL
500075-EDGAR
500077-ENALDO

 

The drill Down result is here... But now... how I will make to join the VendorName... with the products that they are sell ??

 

Really its a strange data input... I think for people who will analyze the printed report, make some sense... Because when the papers in hand, the guy see:

Vendorname
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AnotherVendorName
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

I think that in this way, for some guy that are analyzing the paper, make some sense, because the data come with a subtotal of items of each Vendorname...
But for PowerBI use, its to easy if the system developer had done something like this:

VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VendorName xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
AnotherVendor yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy

Like this, in PowerBI, my life it's too more easy.. Lol.

If anyone else can take a look on the complete file, to see if have other approach, I will share in a link the file, with 48 hours expiring...

https://expirebox.com/download/965c3359b1eeecd05041230ccdb60411.html

Thanks again

Greg_Deckler
Super User
Super User

That's a crazy input data format. Seems like you could potentially check column A to see if it begins with a number and if so, keep it. You could then just manually add in (rename) your column headers. If anyone can get you there, it would be @ImkeF I have seen her do some amazing things in M but, wow...


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

With any luck, this will be pretty straightforward:

1) Add a column with this formula:

if Text.Contains([Column1], "-") = true then [Column1] else null

2) Check this column -> FillDown: This should return a column with all Vendornames.

 

3) Check column4 -> Transform Format to Decimal number -> remove errors -> Filter on figures >0

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.