Hello everybody,
I'm having trouble with the following task.
I want to add the column "Expected Outcome" in Power Query. I was able to add the column in Power Pivot, but for that I have to load all the rows (ca. 3 million). And for the case where ZAECO = 2 and I have to pick the BTRST from the earlier date and multiply it by -1, I had to create an additional calculated column.
If there is a way to add this column via Power Query, I can delete the columns BTRST, Index and ZAECO and filter out the null values, which makes the data model more efficient. I don't think a simple IF statement would be sufficient.
Thanks a lot for every reply.
Regards
BjoernSchaefer
FAZID | ZAECO | DATUM | Index | BTRST | Expected Outcome | Description of "Expected Outcome" |
---|---|---|---|---|---|---|
SYS0000009 | null | 11.10.1994 | 3 | 10018 | null | |
SYS0000009 | null | 11.10.1994 | 4 | 10018 | null | |
SYS0000009 | null | 11.10.1994 | 5 | 10018 | null | |
SYS0000009 | null | 25.10.1995 | 6 | 10569 | null | |
SYS0000009 | null | 29.01.1998 | 7 | 11766 | null | |
SYS0000009 | null | 30.04.1998 | 8 | 11953 | null | |
SYS0000009 | null | 20.10.1999 | 9 | 12959 | null | |
SYS0000009 | null | 03.12.1999 | 10 | 12960 | 12960 | <- if ZAECO = null then BTRST by FAZID with MAX DATUM |
SYS0000210 | null | 04.05.2000 | 715 | 82 | null | |
SYS0000210 | null | 09.05.2001 | 716 | 158 | null | |
SYS0000210 | null | 06.05.2002 | 717 | 246 | null | |
SYS0000210 | null | 19.05.2003 | 718 | 364 | null | |
SYS0000210 | null | 02.06.2004 | 719 | 444 | null | |
SYS0000210 | null | 28.06.2005 | 720 | 632 | null | |
SYS0000210 | null | 18.07.2006 | 721 | 706 | null | |
SYS0000210 | null | 25.07.2007 | 722 | 844 | null | |
SYS0000210 | null | 30.07.2008 | 723 | 928 | 928 | <- if ZAECO=2 then BTRST where Index = Index-1 |
SYS0000210 | 2 | 06.10.2009 | 724 | 136 | -136 | <- if ZAECO=2 then BTRST *-1 |
SYS0000210 | null | 28.10.2010 | 725 | 243 | null | |
SYS0000210 | null | 14.07.2011 | 726 | 296 | null | |
SYS0000210 | null | 09.07.2012 | 727 | 366 | null | |
SYS0000210 | null | 18.09.2013 | 728 | 439 | null | |
SYS0000210 | null | 26.09.2014 | 729 | 497 | null | |
SYS0000210 | null | 07.10.2015 | 730 | 562 | null | |
SYS0000210 | null | 28.09.2016 | 731 | 641 | null | |
SYS0000210 | null | 20.07.2017 | 732 | 698 | null | |
SYS0000210 | null | 19.07.2018 | 733 | 771 | 771 | <- if ZAECO = null then BTRST by FAZID with MAX DATUM |
SYS0000337 | null | 14.02.2001 | 1030 | 5277 | null | |
SYS0000337 | null | 16.07.2001 | 1031 | 6231 | null | |
SYS0000337 | null | 20.02.2002 | 1032 | 7485 | null | |
SYS0000337 | 1 | 11.09.2002 | 1033 | 8477 | 8477 | <- if ZAECO = 1 then BTRST |
SYS0000337 | null | 16.01.2003 | 1034 | 4376 | null | |
SYS0000337 | null | 20.05.2003 | 1035 | 4564 | 4564 | <- if ZAECO = null then BTRST by FAZID with MAX DATUM |
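For reference, the four rules in the table above can be sketched in plain Python (a hypothetical helper, not actual Power Query code; column names are taken from the sample data):

```python
# Minimal sketch of the "Expected Outcome" rules from the sample table.
from datetime import date

def expected_outcome(rows):
    """rows: list of dicts with keys FAZID, ZAECO, DATUM, Index, BTRST.

    Returns one Expected Outcome value per row:
      - ZAECO = 1                                    -> BTRST
      - ZAECO = 2                                    -> BTRST * -1
      - the row directly before a ZAECO = 2 row
        (same FAZID, Index - 1)                      -> its own BTRST
      - ZAECO = null with the latest DATUM per FAZID -> BTRST
      - otherwise                                    -> None (null)
    """
    # latest DATUM per FAZID among the ZAECO = null rows
    latest = {}
    for r in rows:
        if r["ZAECO"] is None:
            f = r["FAZID"]
            if f not in latest or r["DATUM"] > latest[f]:
                latest[f] = r["DATUM"]

    # (FAZID, Index) pairs that directly precede a ZAECO = 2 row
    before_two = {(r["FAZID"], r["Index"] - 1) for r in rows if r["ZAECO"] == 2}

    result = []
    for r in rows:
        if r["ZAECO"] == 1:
            result.append(r["BTRST"])
        elif r["ZAECO"] == 2:
            result.append(-r["BTRST"])
        elif (r["FAZID"], r["Index"]) in before_two:
            result.append(r["BTRST"])
        elif r["ZAECO"] is None and r["DATUM"] == latest.get(r["FAZID"]):
            result.append(r["BTRST"])
        else:
            result.append(None)
    return result
```

For the SYS0000210 rows above, this yields 928 for Index 723, -136 for Index 724, and 771 for Index 733, matching the expected values.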
Try the code below. It uses grouping to identify the specific scenarios and multiple conditionals to get the final result. Not sure about performance with 3 million rows.
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZVPaxsxEMW/ivCxJItmRn9Whxy2SQo5hEDtQps0l6QuDgQfSkLab1/NW4l6s0ZOaQ1GWOvfzOjNG+3NzWL5ZWnxSYujxfb58TEvRB3ZjlJy+YfohrXU59Usbo/egLi/R/ybEPYF0b8HID6kNpI6S4po5Ii0MYQmIrazriI9kOSlncWWwnRPv8TJtwuz0hFXhOzIBF2/Plsr98fm4bu5Ph9Or8yJUcY8bdZb8371cbkyd7/Mh+H64sy8PDxtzOXw2ZwNq0+Xu8kYMWsy11nfcd5WFUjl6/l1eVMiFYJAQGw/684UCQVhICo2u5nUE4RqFgGi8SW4dhbucqKMOCCqnnNthPuC6Lkj65Mg7eNTRqIiAQhEsO2zZHOOSASi8fsDhanTgMCcrCIk7l9ZYDg/vTrh3e6/bNY/1uZi+239M3sD6zHNk/DYk2xMHp0XGZMp4WCGd/vi/VETIbEX2aPNs/GYqunGcxJ05IC5bKup/gMCM3GEMw6YKfcsKQIzQUcnsyGcniUUBGZimCnFdmGxHB9mEn3iQ9tMXAuDmURFCI7aSHEGpijCrCG1x4+qYjCTQIRI/+0+EYnThnK9HciOKnCcKTeFQnF7gaADy0yICaRCcL1UMgQ7uN7vhxCZRrUrIRhE1LZHCdqRoVk51Wsqh8RLTuLMjrPK/Q6kJTuP2+1fOnL7Gw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [FAZID = _t, ZAECO = _t, DATUM = _t, Index = _t, BTRST = _t, #"Description of ""Expected Outcome""" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"FAZID", type text}, {"ZAECO", Int64.Type}, {"DATUM", type date}, {"Index", Int64.Type}, {"BTRST", Int64.Type}, {"Description of ""Expected Outcome""", type text}}),
    Buffer = Table.Buffer(#"Changed Type"),
    // A: latest DATUM per FAZID among the ZAECO = null rows
    A_null_only = Table.SelectRows(Buffer, each ([ZAECO] = null)),
    A_null_only_grouped = Table.Group(A_null_only, {"FAZID", "ZAECO"}, {{"DATUM", each List.Max([DATUM]), type date}}),
    // B: Index - 1 of every ZAECO = 2 row
    B_2_only = Table.SelectRows(Buffer, each ([ZAECO] = 2)),
    B_2_only_index = Table.SelectColumns(B_2_only, {"FAZID", "Index"}),
    B_2_only_new_index = Table.TransformColumns(B_2_only_index, {{"Index", each _ - 1, type number}}),
    Custom = Buffer,
    A_merged = Table.NestedJoin(Custom, {"FAZID", "ZAECO", "DATUM"}, A_null_only_grouped, {"FAZID", "ZAECO", "DATUM"}, "Custom1", JoinKind.LeftOuter),
    A_expanded = Table.ExpandTableColumn(A_merged, "Custom1", {"FAZID"}, {"A"}),
    B_merged = Table.NestedJoin(A_expanded, {"FAZID", "Index"}, B_2_only_new_index, {"FAZID", "Index"}, "Expanded Custom1", JoinKind.LeftOuter),
    B_expanded = Table.ExpandTableColumn(B_merged, "Expanded Custom1", {"FAZID"}, {"B"}),
    #"Merged Queries" = Table.NestedJoin(B_expanded, {"FAZID", "Index"}, B_2_only_index, {"FAZID", "Index"}, "B_expanded", JoinKind.LeftOuter),
    #"Expanded B_expanded" = Table.ExpandTableColumn(#"Merged Queries", "B_expanded", {"FAZID"}, {"B_minus"}),
    #"Added Custom" = Table.AddColumn(#"Expanded B_expanded", "Value", each
        if [FAZID] = [A] then [BTRST]
        else if [FAZID] = [B] then [BTRST]
        else if [FAZID] = [B_minus] then [BTRST] * -1
        else if [ZAECO] = 1 then [BTRST]
        else null)
in
    #"Added Custom"
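The same grouping-and-merge strategy (build two small lookup tables, left-join them back, then apply conditionals) can be sketched with pandas for comparison. This is a hypothetical illustration, assuming pandas and the column names from the sample data, not the poster's actual code:

```python
import pandas as pd

def add_expected_outcome(df):
    # A: latest DATUM per FAZID among ZAECO-null rows (like Table.Group)
    a = (df[df["ZAECO"].isna()]
           .groupby("FAZID", as_index=False)["DATUM"].max()
           .assign(is_latest=True))
    # B: Index - 1 of every ZAECO = 2 row (like the shifted-index lookup)
    b = df.loc[df["ZAECO"] == 2, ["FAZID", "Index"]].copy()
    b["Index"] -= 1
    b["before_two"] = True

    # left-join both lookups back (like Table.NestedJoin + expand)
    out = (df.merge(a, on=["FAZID", "DATUM"], how="left")
             .merge(b, on=["FAZID", "Index"], how="left"))

    def rule(r):
        if r["ZAECO"] == 1:
            return r["BTRST"]
        if r["ZAECO"] == 2:
            return -r["BTRST"]
        if r["before_two"] == True:  # noqa: E712 -- NaN-safe for non-matches
            return r["BTRST"]
        if pd.isna(r["ZAECO"]) and r["is_latest"] == True:  # noqa: E712
            return r["BTRST"]
        return None

    out["Expected Outcome"] = out.apply(rule, axis=1)
    return out.drop(columns=["is_latest", "before_two"])
```

Unlike the M version, the flag columns here are booleans rather than expanded FAZID copies, but the join keys and the conditional cascade mirror the query step for step.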
Hi @Stachu,
thank you very much. I'm still trying to figure out how the code works, but the performance is extremely slow: it has been trying to finish the preview in Power Query for nearly 2 hours.
There is one phenomenon that doesn't make sense to me. As soon as I add the Index column to the query, the size of my file increases nearly fourfold. Without the Index column, the file, which contains 2.6 million rows, was 13 MB; with it, the size rises to 53 MB! Do you have a clue why?
Regards
BjoernSchaefer
The index cannot be compressed as well as the original data. I'd suggest adding it and then removing it again once it's no longer needed.
https://powerpivotpro.com/2015/12/compression-with-power-pivot/
Seems like you want to use EARLIER. See my article on Mean Time Before Failure (MTBF) which uses EARLIER: http://community.powerbi.com/t5/Community-Blog/Mean-Time-Between-Failure-MTBF-and-Power-BI/ba-p/3395...
Hi @Greg_Deckler,
thanks a lot for your reply. I want to do this in Power Query. When I add a calculated column in Power Pivot, the size of my file is almost twice as big as without that column.
Regards
BjoernSchaefer