ziying35
Impactful Individual

Stack overflow error when the "List.Accumulate" function is used to process data over 2000 rows

My Source Table:

[image: Source Table]

Desired Table:

[image: Desired Table]

Based on the Source Table, I want to insert the missing "SN" rows with a Value of 0. My M code is below:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // buffer the rows so each step reads from memory
    rows = List.Buffer(Table.ToRows(Source)),
    // state s = {last SN seen, accumulated rows}
    acc = List.Accumulate(
              rows,
              {"", {}},
              (s,c)=>if s{0}="" then {c{0}, {c}}                // first row
                     else if s{0}+1=c{0} then {c{0},s{1}&{c}}   // consecutive SN: just append
                          else {c{0}, s{1}&List.Transform({s{0}+1..c{0}-1}, each {_,0})&{c}}  // fill the gap with {SN, 0}
          ),
    result = Table.FromRows(acc{1}, Table.ColumnNames(Source))
in
    result

 

When my data volume is less than 2000 rows, the code using the List.Accumulate function runs properly; when the data volume is more than 2000 rows, there is a stack overflow error. Using just a List.Accumulate solution, is there a way to get the code to work properly with more than 2,000 rows of data?

Thanks in advance

 

The link to my Excel file: Insert Missing SN


1 ACCEPTED SOLUTION
Anonymous
Not applicable

Curt Hagenlocher's argument seems interesting to me.

<<...

But broadly speaking, M is a lazy language and some of the things you compute actually result in promises instead of immediate values. If the combination function passed to List.Accumulate results in a promise, then the result of accumulating over your list is a promise on top of a promise on top of... etc. At some point, when the actual value is calculated, it's done recursively -- the chain of promises is too long and the stack flows over.

>>
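If that explanation is right, one possible mitigation (a sketch of my own, not something posted in this thread) is to force each intermediate state into memory with List.Buffer, so every step yields a concrete list instead of a promise stacked on all the previous ones. Note that this trades stack depth for extra copying, so it may still be slow on very large inputs:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    rows = List.Buffer(Table.ToRows(Source)),
    acc = List.Accumulate(
              rows,
              {"", {}},
              (s,c)=>if s{0}="" then {c{0}, {c}}
                     // List.Buffer forces evaluation now, keeping the promise chain shallow
                     else if s{0}+1=c{0} then {c{0}, List.Buffer(s{1}&{c})}
                          else {c{0}, List.Buffer(s{1}&List.Transform({s{0}+1..c{0}-1}, each {_,0})&{c})}
          ),
    result = Table.FromRows(acc{1}, Table.ColumnNames(Source))
in
    result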


19 REPLIES
Jimmy801
Community Champion

Hello @ziying35 

 

Try this approach. It should be faster:

let
    Source = #table
    (
        {"SN","Value"},
        {
            {"0","1"}, {"1","2"}, {"2","3"}, {"3","4"}, {"5","5"}, {"10","6"}, {"18","7"}, {"20","8"}, {"25","9"}, {"10000","10"}
        }
    ),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"SN", Int64.Type}}),
    // full SN sequence from min to max; count must be max - min + 1
    GetListSN = List.Numbers(List.Min(#"Changed Type"[SN]), List.Max(#"Changed Type"[SN])-List.Min(#"Changed Type"[SN])+1),
    #"Converted to Table" = Table.FromList(GetListSN, Splitter.SplitByNothing(), 1, null, ExtraValues.Error),
    #"Changed Type1" = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", Int64.Type}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type1",{{"Column1", "SN"}}),
    Join = Table.NestedJoin(#"Renamed Columns","SN",#"Changed Type","SN","Real SN"),
    #"Expanded Real SN" = Table.ExpandTableColumn(Join, "Real SN", {"Value"}, {"Value"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Real SN",null,"0",Replacer.ReplaceValue,{"Value"})
in
    #"Replaced Value"

 

Copy and paste this code into the Advanced Editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as the solution (to help other users find useful content and to acknowledge the work of users who helped you).
Kudos are nice too.

Have fun

Jimmy

ziying35
Impactful Individual

Hi, @Jimmy801 

Thank you for your solution, but my problem is not this. For the problem of inserting missing data, I can write several solutions myself. My main question today is how to deal with the stack overflow that occurs when List.Accumulate processes larger data.

Anonymous
Not applicable

It is not clear to me where and why the overflow problem with the List.Accumulate function occurs (apart from the case in which an infinite loop is created).
But for the original problem, a faster solution than the one using List.Accumulate might be to turn the initial table into a record (I called it sorgente) where the SN column supplies the field names and the Value column the values.
Then the following expression gets the final table:

    S = Table.FromRows(List.Transform(List.Transform({1..200000},each Text.From(_)),each {_,Record.FieldOrDefault(sorgente,_,"0")}),{"SN","Value"})
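(The record sorgente is not defined in the reply above; a minimal sketch of how it could be built, assuming the source table has an SN column and a Value column, both convertible to text:

    sorgente = Record.FromList(
                   List.Transform(Source[Value], Text.From),
                   List.Transform(Source[SN], Text.From)
               )

Record.FromList takes the field values first and the field names second, so each SN becomes a record field holding the corresponding Value.)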


ziying35
Impactful Individual

@Anonymous 

I think it is a problem deep in the engine. When the amount of data is less than 2000 rows, it is still very fast; when the amount of data is more than 2000 rows, it slows down and even errors out. I read the post linked above, but Microsoft hasn't fixed the problem. Thanks

Anonymous
Not applicable

I made a little experiment with these two basic expressions:

    sum = List.Accumulate(Sorgn,0,(s,c)=> s+c),
    combine = List.Accumulate(Sorgn,{},(s,c)=> List.Combine({s,{c}}))

The first one can calculate even millions of terms, while the second one overflows after about 64000-64500 elements.

This

    List.Accumulate(Sorgn,[a={0}],(s,c)=> s&[a=s[a]&{c}])

and this

    List.Accumulate(Sorgn,[a=0],(s,c)=> s&[a=s[a]+c])

go into overflow just after 7000 elements.

ziying35
Impactful Individual

Iterating to build up a record with the List.Accumulate function usually doesn't go wrong, but it does when it comes to combining lists or tables, etc.

In our PQ group, we usually only dare to use it on data under 2000 rows.

Anonymous
Not applicable

With Sorgn={1..1000000}, for example:

    comrec = List.Accumulate(Sorgn,[a=0],(s,c)=> s&[a=c]),

    comrec = List.Accumulate(Sorgn,[acc=0,cur=0],(s,c)=> s&[acc=[acc]+c,cur=c]),

    comrec = List.Accumulate(Sorgn,[acc=0,cur=0],(s,c)=> s&[acc=s[acc]+c,cur=c])

the first expression gives the result immediately;
the second quickly gives as a result a record with the field acc = an error(*) and the field cur = 1000000;
the third expression overflows.

(*) the correct form is the third expression.

artemus

Why is everyone making this so complicated?

Will this work for you?

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    SNRange = {0 .. List.Max(Source[SN])},
    AsTable = Table.FromColumns({SNRange}, {"SN"}),
    Join = Table.NestedJoin(AsTable, "SN", Source, "SN", "Value", JoinKind.LeftOuter),
    #"Expanded Value" = Table.ExpandTableColumn(Join, "Value", {"Value"}, {"Value"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Value",null,0,Replacer.ReplaceValue,{"Value"}),
    #"Sorted Rows" = Table.Sort(#"Replaced Value",{{"SN", Order.Ascending}})
in
    #"Sorted Rows"

 

ziying35
Impactful Individual
Impactful Individual

@artemus 

I've written code solutions like that before, but my problem isn't that.

ziying35
Impactful Individual

I couldn't find a solution, so I decided to close the topic.


artemus

Ok, what problem are you trying to solve then? You can iterate through a list and refer to previous values without using recursion. Just use List.Generate instead.
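A minimal sketch of what such a List.Generate rewrite could look like (an illustration under the thread's assumptions, not code actually posted here). The state is just a row index, so no deep chain of deferred concatenations builds up; each step emits the current row, preceded by gap-filling {SN, 0} rows when SNs are not consecutive:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    rows = List.Buffer(Table.ToRows(Source)),
    gen = List.Generate(
              () => 0,                        // state: index of the current row
              each _ < List.Count(rows),
              each _ + 1,
              each if _ = 0 then {rows{0}}    // first row: emit as-is
                   else if rows{_}{0} = rows{_-1}{0} + 1 then {rows{_}}
                   else List.Transform({rows{_-1}{0}+1 .. rows{_}{0}-1}, (n) => {n, 0}) & {rows{_}}
          ),
    result = Table.FromRows(List.Combine(gen), Table.ColumnNames(Source))
in
    result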

ziying35
Impactful Individual

@artemus 

I just want to point out that the amount of data being processed is not that large, yet it's easy to get stack overflows when using the List.Accumulate function to handle it.

I can also write a code solution using the List.Generate function, which can handle this kind of data in a second or two.

Anonymous
Not applicable

I don't think the problem is the size of the input table.

You should check whether you have some null value in the SN column.

It seems to me that your initial if constructs a sort of recursive loop that never ends.

ziying35
Impactful Individual

@Anonymous 

This is just one of my cases, and there are many others that have this problem. I see that you often use the List.Accumulate function to provide solutions for others; if you try increasing the data volume in those cases, I am sure the same problem will occur.

Anonymous
Not applicable

Hi @ziying35,

In fact, I have never experienced such a situation, and for the solutions I have proposed using constructs that create loops, I cannot absolutely exclude that something like this may happen.

Anonymous
Not applicable

In your case you could try this:

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // drop null SNs and buffer so indexed access S{c} is cheap
    S = Table.Buffer(Table.RemoveMatchingRows(Source, {[SN=null]}, "SN")),
    acc = List.Accumulate(
              {1..Table.RowCount(S)-1},
              {Record.FieldValues(S{0})},
              (s,c)=>if S{c}[SN]=S{c-1}[SN]+1 then s&{Record.FieldValues(S{c})}
                     else s&List.Transform({S{c-1}[SN]+1..S{c}[SN]-1}, each {_,0})&{Record.FieldValues(S{c})}
          ),
    result = Table.FromRows(acc,{"SN","Value"})
in
    result

 

or this

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    S = Table.Buffer(Table.RemoveMatchingRows(Source, {[SN=null]}, "SN")),
    acc = List.Accumulate(
              {1..Table.RowCount(S)-1},
              {S{0}},
              (s,c)=>if S{c}[SN]=S{c-1}[SN]+1 then s&{S{c}}
                     else s&List.Transform({S{c-1}[SN]+1..S{c}[SN]-1}, each [SN=_,Value=0])&{S{c}}
          ),
    result = Table.FromRecords(acc)
in
    result

But, I'm sure, you don't need any help on this aspect.

I proposed a solution only to participate in the discussion.


ziying35
Impactful Individual

@Anonymous 

The file I uploaded to the cloud disk is data I simulated. The data volume is more than 190,000 rows, with no null values. I asked a few friends to test your code on my file using higher-spec computers; although it does not error out, it never manages to load the data.
