Sam12
Regular Visitor

Adding rows in Power Query (and a performance issue)

Hi PBI and PQ community,

I learn a lot from all the postings.

 

But for the following situation I did not find anything yet. Maybe you can help.

 

I want to improve the data that is loaded into PBI using Power Query.

On the left side: the data as loaded from the system, but with missing values.

On the right side: the data as I need it: additional rows are added, carrying the previous value (PQ's Fill Down might help here).

 

I already have calendar tables that list all the available dates, but I do not know how to get those missing dates into this list.
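Roughly, I imagine something like the following (a sketch only; Prices and Calendar are invented query names, and I am not sure this is the right approach):

let
    // Invented names: Prices is the loaded data, Calendar lists every date
    Prices   = Excel.CurrentWorkbook(){[Name="Prices"]}[Content],
    Calendar = Excel.CurrentWorkbook(){[Name="Calendar"]}[Content],
    // Left-outer join: every calendar date gets a row, missing prices are null
    Merged   = Table.NestedJoin(Calendar, {"date"}, Prices, {"date"}, "P", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "P", {"price"}),
    Sorted   = Table.Sort(Expanded, {{"date", Order.Ascending}}),
    // Fill Down carries the previous known price into the gap rows
    Filled   = Table.FillDown(Sorted, {"price"})
in
    Filled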

Any thoughts?

Sam.

 

[Attachment: PBI-concept.PNG]

 

Update, 21-05-2018:

I posted this message earlier, and Interkoubess replied with helpful insights.

 

But I have a new challenge. When using the proposed method, performance is poor: the table grows to over 20 GB and takes hours to process.

I added an additional sample below.

[Attachment: pbi-question.PNG]

The challenge now is that I have 1,000 products and multi-year data.

 

This really slows down the preparation of the data in PBI.

 

I found quite a few insights on single-product solutions (like product A alone in the sample above), but I cannot find one for a list with all products.

 

Any suggestions to solve this in M or PBI?

thanks,

Sam12

 

Below is the sample data:

 

date        product    price
1-2-2017    A          120
2-2-2017    A          130
3-2-2017    A          140
6-2-2017    A          150
7-2-2017    A          160
8-2-2017    A          170
9-2-2017    A          160
10-2-2017   A          150
13-2-2017   A          140
14-2-2017   A          150
15-2-2017   A          145
1-2-2017    B          60
2-2-2017    B          52
3-2-2017    B          54
6-2-2017    B          49
7-2-2017    B          51
8-2-2017    B          46
9-2-2017    B          49
10-2-2017   B          51
13-2-2017   B          50
14-2-2017   B          48
15-2-2017   B          46
1 ACCEPTED SOLUTION
Interkoubess
Solution Sage

Hi @Sam12,

 

Please try this code in the Advanced Editor after importing your data, adjusting the corresponding field names:

- First, I have the fields Date and Amount (from your raw data).

 

1) Make sure your Date is of type date.
2) Create an index.
3) Find the difference in days between consecutive dates.
4) Create a list of dates to fill each gap.
5) Expand the created list and remove the unnecessary fields (Index, Custom).

 

let
    // Step 1: load the raw data and set the column types
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Amount", Int64.Type}}),
    // Step 2: index starting at 1, so each row can reach the next row's date via the 0-based {…} lookup
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),
    // Step 3: days until the next row's date; the last row has no successor, hence try/otherwise 1
    #"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try #"Added Index"[Date]{[Index]}-[Date] otherwise 1),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Custom", Int64.Type}}),
    // Step 4: one date per day in the gap, starting at the current row's date
    #"Added Custom1" = Table.AddColumn(#"Changed Type1", "Custom.1", each List.Dates(Date.From([Date]),[Custom],#duration(1,0,0,0))),
    // Step 5: expand the lists and keep only the new Date plus Amount
    #"Expanded Custom.1" = Table.ExpandListColumn(#"Added Custom1", "Custom.1"),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded Custom.1",{"Custom.1", "Amount", "Date", "Index", "Custom"}),
    #"Removed Other Columns" = Table.SelectColumns(#"Reordered Columns",{"Custom.1", "Amount"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Custom.1", "Date"}})
in
    #"Renamed Columns"

[Attachment: CaptureRPQ.PNG]


7 REPLIES

Hi Interkoubess,

 

I have played around with your suggestion. My original CSV file has 65,000 rows; by itself, this file loads quickly.

 

I find that the following lines make the query explode in time at "Apply query changes":

#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1),

#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try #"Added Index"[Date]{[Index]}-[Date] otherwise 1),

 

 

With the original steps, the PQ apply process grows to 12 GB before throwing "OLE DB or ODBC errors. We cannot convert the value null to type number."

 

Have you seen this before?

 

I think it would be great if we could make the additions you propose but then save the result as hardcoded values, so that no calculation is needed anymore.

 

Any thoughts?
Sam12

Hi @Sam12,

 

I have never seen this message.

But what about checking the field (removing nulls or replacing them with a constant) and then proceeding?
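For example, as one extra step in the earlier query (a sketch; it assumes the nulls sit in the Amount column, so adjust to your field):

    // Replace null Amounts with 0 before the index and lookup steps
    #"Replaced Nulls" = Table.ReplaceValue(#"Changed Type", null, 0, Replacer.ReplaceValue, {"Amount"})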

Please let me know if it does not help, and we can figure out something different.

 

 

Ninter

Hi, I just posted the message.

I tried to check this (null), but scrolling through 65k rows is not that fast in PQ. Is there a way to extract the data from PQ, like copy-pasting into Excel with all rows and not only 999 rows? Let me know.

P.

Hi @Sam12,

 

Yes, you can extract all your data from PQ simply via Close & Load (in the top-left corner).

 

Please let me know if I missed something.

But what about replacing the null values in PQ? Did it not work?

 

Thank you.

 

Ninter

Hi, I did a replace, but it did not have any effect; still errors.

 

What about splitting the data set into smaller lists of approximately 10,000 rows (the original file is 65,000 rows: 500 products * 130 rows) and applying the proposed script? We could make sub-selections using another table that states which products to include (inner join?). See the sketch below.

Can we afterwards append the queries back into the total list of 65,000 rows? Does this reduce the load on the system?
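Something like this, perhaps (a sketch only; FillMissingDates is just an invented helper that packages the date-filling steps for one product, and the column names follow the sample data above):

let
    // Invented helper: the thread's date-filling steps for one product's rows
    FillMissingDates = (t as table) as table =>
        let
            Sorted   = Table.Sort(t, {{"date", Order.Ascending}}),
            Indexed  = Table.AddIndexColumn(Sorted, "Index", 1, 1),
            // Buffer once so the per-row lookup stays cheap
            Dates    = List.Buffer(Indexed[date]),
            Gap      = Table.AddColumn(Indexed, "Gap",
                           each try Duration.Days(Dates{[Index]} - [date]) otherwise 1),
            Lists    = Table.AddColumn(Gap, "AllDates",
                           each List.Dates([date], [Gap], #duration(1,0,0,0))),
            Expanded = Table.ExpandListColumn(Lists, "AllDates"),
            Kept     = Table.SelectColumns(Expanded, {"AllDates", "product", "price"}),
            Renamed  = Table.RenameColumns(Kept, {{"AllDates", "date"}})
        in
            Renamed,
    Source  = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Typed   = Table.TransformColumnTypes(Source, {{"date", type date}, {"price", Int64.Type}}),
    // One sub-table per product, filled independently, then appended back together
    Grouped = Table.Group(Typed, {"product"}, {{"Filled", each FillMissingDates(_), type table}}),
    Result  = Table.Combine(Grouped[Filled])
in
    Result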

Thanks for your ideas.

P.

Hi Interkoubess, this is indeed what I need. I played around a bit and it worked. Thanks!
