Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
BjoernSchaefer
Helper II
Helper II

Complex Custom Column in PowerQuery

Hello everybody,

 

for me, i'm havin trouble realising the following task.

 

I want to add the Column "Expected Outcome" in PowerQuery. I was able to add the column in PowerPivot, but therefor i have to load all the rows (ca. 3 Million). And for the case, where ZAECO=2 and i have to pick the BTRST from the earlier Date and times it by -1 i had to create a new calculated column.

 

If there is a way to add this column via PowerQuery, i can delete the columns BTRST, INDEX and ZAECO and uncheck the null-Values wich makes the Datamodel more efficient. I don't think that a simple IF-Statement would be sufficient.

 

Thanks a lot for every respond.

 

Regards

 

BjoernSchaefer

FAZIDZAECODATUMIndexBTRSTExpected OutcomeDescription of "Expected Outcome"
SYS0000009null11.10.1994310018null 
SYS0000009null11.10.1994410018null 
SYS0000009null11.10.1994510018null 
SYS0000009null25.10.1995610569null 
SYS0000009null29.01.1998711766null 
SYS0000009null30.04.1998811953null 
SYS0000009null20.10.1999912959null 
SYS0000009null03.12.1999101296012960<- if ZEACO = null then BTRST by FAZID with MAX DATUM
SYS0000210null04.05.200071582null 
SYS0000210null09.05.2001716158null 
SYS0000210null06.05.2002717246null 
SYS0000210null19.05.2003718364null 
SYS0000210null02.06.2004719444null 
SYS0000210null28.06.2005720632null 
SYS0000210null18.07.2006721706null 
SYS0000210null25.07.2007722844null 
SYS0000210null30.07.2008723928928<- if ZAECO=2 then BTRST where Index = Index-1
SYS0000210206.10.2009724136-136<- if ZAECO=2 then BTRST *-1
SYS0000210null28.10.2010725243null 
SYS0000210null14.07.2011726296null 
SYS0000210null09.07.2012727366null 
SYS0000210null18.09.2013728439null 
SYS0000210null26.09.2014729497null 
SYS0000210null07.10.2015730562null 
SYS0000210null28.09.2016731641null 
SYS0000210null20.07.2017732698null 
SYS0000210null19.07.2018733771771<- if ZEACO = null then BTRST by FAZID with MAX DATUM
SYS0000337null14.02.200110305277null 
SYS0000337null16.07.200110316231null 
SYS0000337null20.02.200210327485null 
SYS0000337111.09.2002103384778477<- if ZEACO = 1 then BTRST
SYS0000337null16.01.200310344376null 
SYS0000337null20.05.2003103545644564<- if ZEACO = null then BTRST by FAZID with MAX DATUM
5 REPLIES 5
Stachu
Community Champion
Community Champion

try the code below - it uses grouping to identify specific scenarios and multiple conditionals to get final result - not sure about performance with 3mln rows

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZVPaxsxEMW/ivCxJItmRn9Whxy2SQo5hEDtQps0l6QuDgQfSkLab1/NW4l6s0ZOaQ1GWOvfzOjNG+3NzWL5ZWnxSYujxfb58TEvRB3ZjlJy+YfohrXU59Usbo/egLi/R/ybEPYF0b8HID6kNpI6S4po5Ii0MYQmIrazriI9kOSlncWWwnRPv8TJtwuz0hFXhOzIBF2/Plsr98fm4bu5Ph9Or8yJUcY8bdZb8371cbkyd7/Mh+H64sy8PDxtzOXw2ZwNq0+Xu8kYMWsy11nfcd5WFUjl6/l1eVMiFYJAQGw/684UCQVhICo2u5nUE4RqFgGi8SW4dhbucqKMOCCqnnNthPuC6Lkj65Mg7eNTRqIiAQhEsO2zZHOOSASi8fsDhanTgMCcrCIk7l9ZYDg/vTrh3e6/bNY/1uZi+239M3sD6zHNk/DYk2xMHp0XGZMp4WCGd/vi/VETIbEX2aPNs/GYqunGcxJ05IC5bKup/gMCM3GEMw6YKfcsKQIzQUcnsyGcniUUBGZimCnFdmGxHB9mEn3iQ9tMXAuDmURFCI7aSHEGpijCrCG1x4+qYjCTQIRI/+0+EYnThnK9HciOKnCcKTeFQnF7gaADy0yICaRCcL1UMgQ7uN7vhxCZRrUrIRhE1LZHCdqRoVk51Wsqh8RLTuLMjrPK/Q6kJTuP2+1fOnL7Gw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [FAZID = _t, ZAECO = _t, DATUM = _t, Index = _t, BTRST = _t, #"Description of ""Expected Outcome""" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"FAZID", type text}, {"ZAECO", Int64.Type}, {"DATUM", type date}, {"Index", Int64.Type}, {"BTRST", Int64.Type}, {"Description of ""Expected Outcome""", type text}}),
    Buffer = Table.Buffer(#"Changed Type"),
    A_null_only = Table.SelectRows(Buffer, each ([ZAECO] = null)),
    A_null_only_grouped = Table.Group(A_null_only, {"FAZID", "ZAECO"}, {{"DATUM", each List.Max([DATUM]), type date}}),
    B_2_only = Table.SelectRows(Buffer, each ([ZAECO] = 2)),
    B_2_only_index = Table.SelectColumns(B_2_only,{"FAZID", "Index"}),
    B_2_only_new_index = Table.TransformColumns(B_2_only_index, {{"Index", each _ - 1, type number}}),
    Custom = Buffer,
    A_merged = Table.NestedJoin(Custom,{"FAZID", "ZAECO", "DATUM"},A_null_only_grouped,{"FAZID", "ZAECO", "DATUM"},"Custom1",JoinKind.LeftOuter),
    A_expanded = Table.ExpandTableColumn(A_merged, "Custom1", {"FAZID"}, {"A"}),
    B_merged = Table.NestedJoin(A_expanded,{"FAZID", "Index"},B_2_only_new_index,{"FAZID", "Index"},"Expanded Custom1",JoinKind.LeftOuter),
    B_expanded = Table.ExpandTableColumn(B_merged, "Expanded Custom1", {"FAZID"}, {"B"}),
    #"Merged Queries" = Table.NestedJoin(B_expanded,{"FAZID", "Index"},B_2_only_index,{"FAZID", "Index"},"B_expanded",JoinKind.LeftOuter),
    #"Expanded B_expanded" = Table.ExpandTableColumn(#"Merged Queries", "B_expanded", {"FAZID"}, {"B_minus"}),
    #"Added Custom" = Table.AddColumn(#"Expanded B_expanded", "Value", each if [FAZID]=[A] then [BTRST] else 
if [FAZID]=[B] then [BTRST] else
if [FAZID]=[B_minus] then [BTRST]*-1 else
if [ZAECO]=1 then [BTRST] else
null)
in
    #"Added Custom"


Did I answer your question? Mark my post as a solution!
Thank you for the kudos 🙂

Hi @Stachu,

 

thank you very much. I'm still triying to figuere out how the code works. But the performance is extrmely slow. It's trying to finish the preview in PowerQuery for nearly 2 hours.

 

There is one phenomenon that doesn't make sense to me. As soon as i'm adding the Index-Column to the query, the size of my file increases by nearly 4 times. Before the Index-Column, the size of the file, wich includes 2.6 Million rows, was 13 MB. With the Index-Column, it rises to 53 MB! Do you have a clue why?

 

Regards

 

BjoernSchaefer

index cannot be compressed as well as the original data, I'd suggest adding it and then removeing again after it's not needed anymore
https://powerpivotpro.com/2015/12/compression-with-power-pivot/



Did I answer your question? Mark my post as a solution!
Thank you for the kudos 🙂

Greg_Deckler
Super User
Super User

Seems like you want to use EARLIER. See my article on Mean Time Before Failure (MTBF) which uses EARLIER: http://community.powerbi.com/t5/Community-Blog/Mean-Time-Between-Failure-MTBF-and-Power-BI/ba-p/3395...


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Hi @Greg_Deckler,

 

thanks a lot for your reply. I want to realise this in PowerQuery. When i add a calculated column in PowerPivot, the size of my file is almost as twice so big as without that column.

 

Regards

 

BjoernSchaefer

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.