Hello everyone!
I want to optimize a big report in my company. The report has more than 700,000 rows, so refreshing takes a very long time. Power Query creates five duplicate columns in five separate steps (used to build some keys).
For example, with 700,000 rows: I add a new duplicate column and the file takes a very long time to recalculate. Then I add a second duplicate column and the file recalculates again.
I want to add all five duplicate columns in one step. How can I do that?
let
    col0 = List.Repeat({List.Random(700000, 111)}, 75),
    col1 = List.Random(700000, 1),
    col2 = List.Random(700000, 2),
    col3 = List.Random(700000, 3),
    col4 = List.Random(700000, 4),
    col5 = List.Random(700000, 5),
    tfc = Table.Buffer(Table.FromColumns(col0 & {col1, col2, col3, col4, col5})),
    #"Duplicata colonna" = Table.DuplicateColumn(tfc, "Column1", "Column1 - Copia"),
    #"Duplicata colonna1" = Table.DuplicateColumn(#"Duplicata colonna", "Column2", "Column2 - Copia"),
    #"Duplicata colonna2" = Table.DuplicateColumn(#"Duplicata colonna1", "Column3", "Column3 - Copia"),
    #"Duplicata colonna3" = Table.DuplicateColumn(#"Duplicata colonna2", "Column4", "Column4 - Copia")
in
    #"Duplicata colonna3"
Try this, and note the use of Table.Buffer before you start adding duplicates. I have seen it process about 1k rows per second, so it should finish the run in about 700 seconds (I only watched for a few seconds, though).
Hi,
Thank you for the answer. Here is a short example:
let
    Źródło = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
    #"Zduplikowano kolumnę" = Table.DuplicateColumn(Źródło, "Value", "Value — kopia"),
    #"Zduplikowano kolumnę1" = Table.DuplicateColumn(#"Zduplikowano kolumnę", "Country", "Country — kopia"),
    #"Zduplikowano kolumnę2" = Table.DuplicateColumn(#"Zduplikowano kolumnę1", "Brand Manager", "Brand Manager — kopia"),
    #"Zduplikowano kolumnę3" = Table.DuplicateColumn(#"Zduplikowano kolumnę2", "Day", "Day — kopia"),
    #"Zduplikowano kolumnę4" = Table.DuplicateColumn(#"Zduplikowano kolumnę3", "Month", "Month — kopia")
in
    #"Zduplikowano kolumnę4"
I want to merge the steps (#"Zduplikowano kolumnę" through #"Zduplikowano kolumnę4") into one step.
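For reference, a run of Table.DuplicateColumn steps like the ones above can be folded into a single step with List.Accumulate. A sketch, reusing the column names from the example (note that having fewer steps does not by itself reduce the work the engine does, since consecutive steps are evaluated lazily):

```m
let
    Źródło = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
    // one step: fold over the list of column names, duplicating each in turn
    Duplicated = List.Accumulate(
        {"Value", "Country", "Brand Manager", "Day", "Month"},
        Źródło,
        (state, col) => Table.DuplicateColumn(state, col, col & " — kopia")
    )
in
    Duplicated
```

The accumulator starts from the source table and applies one Table.DuplicateColumn per name, so adding or removing a column to duplicate only means editing the list.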
35 seconds is not achievable in my case. I have 700,000 rows and more than 80 columns in this file.
You should keep the 5 columns to duplicate in a separate table.
Duplicating these takes less than 1 minute.
Test with the script I posted and measure the time.
If you carry the burden of a 700k × 80 table with you, then yes, it takes a long time ...
let
    col1 = List.Random(700000, 1),
    col2 = List.Random(700000, 2),
    col3 = List.Random(700000, 3),
    col4 = List.Random(700000, 4),
    tfc = Table.FromColumns({col1, col2, col3, col4}),
    #"Duplicata colonna" = Table.DuplicateColumn(tfc, "Column1", "Column1 - Copia"),
    #"Duplicata colonna1" = Table.DuplicateColumn(#"Duplicata colonna", "Column2", "Column2 - Copia"),
    #"Duplicata colonna2" = Table.DuplicateColumn(#"Duplicata colonna1", "Column3", "Column3 - Copia"),
    #"Duplicata colonna3" = Table.DuplicateColumn(#"Duplicata colonna2", "Column4", "Column4 - Copia")
in
    #"Duplicata colonna3"
This query produces the final table of 700,000 rows with 4 duplicated columns at a rate of about 20k rows per second, so 700k / 20k ≈ 35 seconds.
This makes me think the problem lies elsewhere.
Try to explain your situation in more detail: what you start from, what you do, and what happens ...
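The "keep the five columns to duplicate in a separate table" suggestion above can be sketched like this. It is a hypothetical approach, assuming the column names from the earlier example and adding an index column so the small table can be rejoined to the wide one later:

```m
let
    Źródło = Excel.CurrentWorkbook(){[Name="Tabela1"]}[Content],
    // add an index so the duplicated columns can be merged back onto the wide table
    Indexed = Table.AddIndexColumn(Źródło, "Index", 0, 1),
    // carry only the five columns that need duplicating, not all 80
    Small = Table.SelectColumns(Indexed, {"Index", "Value", "Country", "Brand Manager", "Day", "Month"}),
    // duplicate all five columns in one step
    Duplicated = List.Accumulate(
        {"Value", "Country", "Brand Manager", "Day", "Month"},
        Small,
        (state, col) => Table.DuplicateColumn(state, col, col & " — kopia")
    )
in
    Duplicated
```

Working on the narrow table avoids dragging the other 75 or so columns through every duplication step; the result can then be merged back on "Index" if the copies are needed alongside the original table.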