cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
BjoernSchaefer Regular Visitor
Regular Visitor

Complex Custom Column in PowerQuery

Hello everybody,

 

for me, i'm havin trouble realising the following task.

 

I want to add the Column "Expected Outcome" in PowerQuery. I was able to add the column in PowerPivot, but therefor i have to load all the rows (ca. 3 Million). And for the case, where ZAECO=2 and i have to pick the BTRST from the earlier Date and times it by -1 i had to create a new calculated column.

 

If there is a way to add this column via PowerQuery, i can delete the columns BTRST, INDEX and ZAECO and uncheck the null-Values wich makes the Datamodel more efficient. I don't think that a simple IF-Statement would be sufficient.

 

Thanks a lot for every respond.

 

Regards

 

BjoernSchaefer

FAZIDZAECODATUMIndexBTRSTExpected OutcomeDescription of "Expected Outcome"
SYS0000009null11.10.1994310018null 
SYS0000009null11.10.1994410018null 
SYS0000009null11.10.1994510018null 
SYS0000009null25.10.1995610569null 
SYS0000009null29.01.1998711766null 
SYS0000009null30.04.1998811953null 
SYS0000009null20.10.1999912959null 
SYS0000009null03.12.1999101296012960<- if ZEACO = null then BTRST by FAZID with MAX DATUM
SYS0000210null04.05.200071582null 
SYS0000210null09.05.2001716158null 
SYS0000210null06.05.2002717246null 
SYS0000210null19.05.2003718364null 
SYS0000210null02.06.2004719444null 
SYS0000210null28.06.2005720632null 
SYS0000210null18.07.2006721706null 
SYS0000210null25.07.2007722844null 
SYS0000210null30.07.2008723928928<- if ZAECO=2 then BTRST where Index = Index-1
SYS0000210206.10.2009724136-136<- if ZAECO=2 then BTRST *-1
SYS0000210null28.10.2010725243null 
SYS0000210null14.07.2011726296null 
SYS0000210null09.07.2012727366null 
SYS0000210null18.09.2013728439null 
SYS0000210null26.09.2014729497null 
SYS0000210null07.10.2015730562null 
SYS0000210null28.09.2016731641null 
SYS0000210null20.07.2017732698null 
SYS0000210null19.07.2018733771771<- if ZEACO = null then BTRST by FAZID with MAX DATUM
SYS0000337null14.02.200110305277null 
SYS0000337null16.07.200110316231null 
SYS0000337null20.02.200210327485null 
SYS0000337111.09.2002103384778477<- if ZEACO = 1 then BTRST
SYS0000337null16.01.200310344376null 
SYS0000337null20.05.2003103545644564<- if ZEACO = null then BTRST by FAZID with MAX DATUM
5 REPLIES 5
Super User
Super User

Re: Complex Custom Column in PowerQuery

Seems like you want to use EARLIER. See my article on Mean Time Before Failure (MTBF) which uses EARLIER: http://community.powerbi.com/t5/Community-Blog/Mean-Time-Between-Failure-MTBF-and-Power-BI/ba-p/3395...


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!


Super User
Super User

Re: Complex Custom Column in PowerQuery

try the code below - it uses grouping to identify specific scenarios and multiple conditionals to get final result - not sure about performance with 3mln rows

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZVPaxsxEMW/ivCxJItmRn9Whxy2SQo5hEDtQps0l6QuDgQfSkLab1/NW4l6s0ZOaQ1GWOvfzOjNG+3NzWL5ZWnxSYujxfb58TEvRB3ZjlJy+YfohrXU59Usbo/egLi/R/ybEPYF0b8HID6kNpI6S4po5Ii0MYQmIrazriI9kOSlncWWwnRPv8TJtwuz0hFXhOzIBF2/Plsr98fm4bu5Ph9Or8yJUcY8bdZb8371cbkyd7/Mh+H64sy8PDxtzOXw2ZwNq0+Xu8kYMWsy11nfcd5WFUjl6/l1eVMiFYJAQGw/684UCQVhICo2u5nUE4RqFgGi8SW4dhbucqKMOCCqnnNthPuC6Lkj65Mg7eNTRqIiAQhEsO2zZHOOSASi8fsDhanTgMCcrCIk7l9ZYDg/vTrh3e6/bNY/1uZi+239M3sD6zHNk/DYk2xMHp0XGZMp4WCGd/vi/VETIbEX2aPNs/GYqunGcxJ05IC5bKup/gMCM3GEMw6YKfcsKQIzQUcnsyGcniUUBGZimCnFdmGxHB9mEn3iQ9tMXAuDmURFCI7aSHEGpijCrCG1x4+qYjCTQIRI/+0+EYnThnK9HciOKnCcKTeFQnF7gaADy0yICaRCcL1UMgQ7uN7vhxCZRrUrIRhE1LZHCdqRoVk51Wsqh8RLTuLMjrPK/Q6kJTuP2+1fOnL7Gw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [FAZID = _t, ZAECO = _t, DATUM = _t, Index = _t, BTRST = _t, #"Description of ""Expected Outcome""" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"FAZID", type text}, {"ZAECO", Int64.Type}, {"DATUM", type date}, {"Index", Int64.Type}, {"BTRST", Int64.Type}, {"Description of ""Expected Outcome""", type text}}),
    Buffer = Table.Buffer(#"Changed Type"),
    A_null_only = Table.SelectRows(Buffer, each ([ZAECO] = null)),
    A_null_only_grouped = Table.Group(A_null_only, {"FAZID", "ZAECO"}, {{"DATUM", each List.Max([DATUM]), type date}}),
    B_2_only = Table.SelectRows(Buffer, each ([ZAECO] = 2)),
    B_2_only_index = Table.SelectColumns(B_2_only,{"FAZID", "Index"}),
    B_2_only_new_index = Table.TransformColumns(B_2_only_index, {{"Index", each _ - 1, type number}}),
    Custom = Buffer,
    A_merged = Table.NestedJoin(Custom,{"FAZID", "ZAECO", "DATUM"},A_null_only_grouped,{"FAZID", "ZAECO", "DATUM"},"Custom1",JoinKind.LeftOuter),
    A_expanded = Table.ExpandTableColumn(A_merged, "Custom1", {"FAZID"}, {"A"}),
    B_merged = Table.NestedJoin(A_expanded,{"FAZID", "Index"},B_2_only_new_index,{"FAZID", "Index"},"Expanded Custom1",JoinKind.LeftOuter),
    B_expanded = Table.ExpandTableColumn(B_merged, "Expanded Custom1", {"FAZID"}, {"B"}),
    #"Merged Queries" = Table.NestedJoin(B_expanded,{"FAZID", "Index"},B_2_only_index,{"FAZID", "Index"},"B_expanded",JoinKind.LeftOuter),
    #"Expanded B_expanded" = Table.ExpandTableColumn(#"Merged Queries", "B_expanded", {"FAZID"}, {"B_minus"}),
    #"Added Custom" = Table.AddColumn(#"Expanded B_expanded", "Value", each if [FAZID]=[A] then [BTRST] else 
if [FAZID]=[B] then [BTRST] else
if [FAZID]=[B_minus] then [BTRST]*-1 else
if [ZAECO]=1 then [BTRST] else
null)
in
    #"Added Custom"
BjoernSchaefer Regular Visitor
Regular Visitor

Re: Complex Custom Column in PowerQuery

Hi @Greg_Deckler,

 

thanks a lot for your reply. I want to realise this in PowerQuery. When i add a calculated column in PowerPivot, the size of my file is almost as twice so big as without that column.

 

Regards

 

BjoernSchaefer

BjoernSchaefer Regular Visitor
Regular Visitor

Re: Complex Custom Column in PowerQuery

Hi @Stachu,

 

thank you very much. I'm still triying to figuere out how the code works. But the performance is extrmely slow. It's trying to finish the preview in PowerQuery for nearly 2 hours.

 

There is one phenomenon that doesn't make sense to me. As soon as i'm adding the Index-Column to the query, the size of my file increases by nearly 4 times. Before the Index-Column, the size of the file, wich includes 2.6 Million rows, was 13 MB. With the Index-Column, it rises to 53 MB! Do you have a clue why?

 

Regards

 

BjoernSchaefer

Highlighted
Super User
Super User

Re: Complex Custom Column in PowerQuery

index cannot be compressed as well as the original data, I'd suggest adding it and then removeing again after it's not needed anymore
https://powerpivotpro.com/2015/12/compression-with-power-pivot/