cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
SCMfan
Regular Visitor

Multiple duplicating column in one step

Hello everyone!

I want to optimalize big report in my company. In this report is more than 700,000 rows, so time for refreshing is very long. In 5 steps PowerQuery creates five duplicate columns(for creating some keys).

 

For Example,

700,000 rows - I am adding new duplicate column, the file takes a very long time to recalculate. After that I am adding the second duplicate column and after that the file is going to recalculate again.

 

I want to add five duplicate columns in one step. How to do it?

1 ACCEPTED SOLUTION

let
    
    col0 = List.Repeat({List.Random(700000,111)}, 75),
    col1 = List.Random(700000,1),
    col2= List.Random(700000,2),
    col3= List.Random(700000,3),
    col4=List.Random(700000,4),
    col5=List.Random(700000,5),
    tfc= Table.Buffer(Table.FromColumns(col0&{col1,col2,col3,col4,col5})),

    #"Duplicata colonna" = Table.DuplicateColumn(tfc, "Column1", "Column1 - Copia"),
    #"Duplicata colonna1" = Table.DuplicateColumn(#"Duplicata colonna", "Column2", "Column2 - Copia"),
    #"Duplicata colonna2" = Table.DuplicateColumn(#"Duplicata colonna1", "Column3", "Column3 - Copia"),
    #"Duplicata colonna3" = Table.DuplicateColumn(#"Duplicata colonna2", "Column4", "Column4 - Copia")

in
#"Duplicata colonna3"

 

 

Try this. be aware of using table.Buffer before you start adding duplicates. I have seen that it adds about 1k lines / sec. So in 700sec (I locked after a few seconds, though) it should complete the lap.

View solution in original post

4 REPLIES 4
SCMfan
Regular Visitor

Hi,

Thank your for the answer. There is short example:

 

let
Źródło = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
#"Zduplikowano kolumnę" = Table.DuplicateColumn(Źródło, "Value", "Value — kopia"),
#"Zduplikowano kolumnę1" = Table.DuplicateColumn(#"Zduplikowano kolumnę", "Country", "Country — kopia"),
#"Zduplikowano kolumnę2" = Table.DuplicateColumn(#"Zduplikowano kolumnę1", "Brand Manager", "Brand Manager — kopia"),
#"Zduplikowano kolumnę3" = Table.DuplicateColumn(#"Zduplikowano kolumnę2", "Day", "Day — kopia"),
#"Zduplikowano kolumnę4" = Table.DuplicateColumn(#"Zduplikowano kolumnę3", "Month", "Month — kopia")
in
#"Zduplikowano kolumnę4"

 

I want to merge steps(#"Zduplikowano kolumnę",#"Zduplikowano kolumnę1",#"Zduplikowano kolumnę2",#"Zduplikowano kolumnę3",#"Zduplikowano kolumnę4") for one step.

 

35 sec is impossible time. I have 700,000 rows and more than 80 columns in this file.

let
    
    col0 = List.Repeat({List.Random(700000,111)}, 75),
    col1 = List.Random(700000,1),
    col2= List.Random(700000,2),
    col3= List.Random(700000,3),
    col4=List.Random(700000,4),
    col5=List.Random(700000,5),
    tfc= Table.Buffer(Table.FromColumns(col0&{col1,col2,col3,col4,col5})),

    #"Duplicata colonna" = Table.DuplicateColumn(tfc, "Column1", "Column1 - Copia"),
    #"Duplicata colonna1" = Table.DuplicateColumn(#"Duplicata colonna", "Column2", "Column2 - Copia"),
    #"Duplicata colonna2" = Table.DuplicateColumn(#"Duplicata colonna1", "Column3", "Column3 - Copia"),
    #"Duplicata colonna3" = Table.DuplicateColumn(#"Duplicata colonna2", "Column4", "Column4 - Copia")

in
#"Duplicata colonna3"

 

 

Try this. be aware of using table.Buffer before you start adding duplicates. I have seen that it adds about 1k lines / sec. So in 700sec (I locked after a few seconds, though) it should complete the lap.

View solution in original post

You should keep the 5 columns to duplicate in a separate table.

The duplication og these takes less than 1 minute.

Test with the script I posted and measure the time.

If you carry the burden of a 700kX80 table with you then yes it takes a long time ...

Rocco_sprmnt21
Super User II
Super User II

let
    col1 = List.Random(700000,1),
    col2= List.Random(700000,2),
    col3= List.Random(700000,3),
    col4=List.Random(700000,4),

    tfc= Table.FromColumns({col1,col2,col3,col4}),
    #"Duplicata colonna" = Table.DuplicateColumn(tfc, "Column1", "Column1 - Copia"),
    #"Duplicata colonna1" = Table.DuplicateColumn(#"Duplicata colonna", "Column2", "Column2 - Copia"),
    #"Duplicata colonna2" = Table.DuplicateColumn(#"Duplicata colonna1", "Column3", "Column3 - Copia"),
    #"Duplicata colonna3" = Table.DuplicateColumn(#"Duplicata colonna2", "Column4", "Column4 - Copia")

in
#"Duplicata colonna3"

this query produces the final table of 700000 lines and 4 duplicate columns at the rate of 20k lines per sec. So 700k / 20k / sec = about 35sec

 

this makes me think that the problem lies elsewhere.

 

Try to explain in more detail the situation you start from, what you do and what happens ...

 

Helpful resources

Announcements
PBI User Groups

Welcome to the User Group Public Preview

Check out new user group experience and if you are a leader please create your group!

MBAS Attendee Badge

Claim Your Badge & Digital Swag!

Check out how to claim yours today!

secondImage

Are You Ready?

Test your skills now with the Cloud Skills Challenge.

Top Solution Authors