Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
Hello Community,
I am new here and basically in Power Bi. First of all, I tried to find the solution to this problem in the community and Youtube but honestly I couldn't find it and that's why I am here now.
I am having trouble with the syntax to create a column. I want to get as an output the values ''1'' or ''null'' to later help me to filter the dataset in some visualisations. The dataset is a recording of actions, where an ID can have different actions during the time and there is also different Status.
To make it more easy to understand, let's say my dataset has 3 columns: ID, Sent Date and Status. What I want to do is create a new column where grouping by ID, checks the latest Sent Date and returns for that row a ''1'' , for the actions that happen earlier with the same ID, it returns ''null'. Here is the complex part, it can happen that there are duplicate rows (that I don't want to delete) and basically I just want to return a ''1'' to only one of them, and ''null'' to the other.
Im sure that this can be done writing the correct syntax. The reason why I want to do it with a column is to automate the process.
Here is an example of the dataset and the ideal result on the right with a column called CHECK :
ID | Sent Date | Status | CHECK |
1001 | 27/03/2020 14:59:00 | Withdrawn | null |
1001 | 27/03/2020 14:59:00 | Withdrawn | null |
1001 | 31/03/2020 12:50:00 | Awaiting Viewing | null |
1001 | 07/04/2020 14:01:00 | Agreed | null |
1001 | 07/04/2020 14:01:00 | Agreed | 1 |
1002 | 15/06/2020 12:30:00 | Viewed | null |
1002 | 03/07/2020 13:42:00 | Agreed | 1 |
1003 | 30/03/2020 08:50:00 | Withdrawn | null |
1003 | 30/03/2020 17:19:00 | Awaiting Viewing | null |
1003 | 31/03/2020 10:52:00 | Viewed | null |
1003 | 07/04/2020 14:12:00 | Agreed | null |
1003 | 07/04/2020 14:12:00 | Agreed | 1 |
If is not clear what I want to do, I will be very happy to clarify it. Thank you so much in advance to anyone who take their time to read this.
Manu.
Solved! Go to Solution.
Hi @ManuApo ,
Based on your description, you need to create an index column in power query editor first, refer my sample file in the below.
If you want to achieve it by DAX, [Index] column could be retained and the calculated column is like this:
CHECK =
IF (
[Index]
= CALCULATE (
MAX ( 'Table'[Index] ),
FILTER ( ALL ( 'Table' ), 'Table'[ID] = EARLIER ( 'Table'[ID] ) )
),
1
)
If you just want the CHECK column withtout index column, you can do it by power query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZDPDoIwDIdfZdmZZP2zOenNl9AD4UDCgrtwICR7fUHRiEwSPTVpvvbXr1WlEQB1oQkIDBvyCq24UgCm5iWO13ZoUq/r4n+SUSGJgwd5Sk0cY9+pcwxpqpsBa+6bARe+G0Jof6PoSR0Mujmcl/A5M4N5wwpZLOWW8ZsJqONLZKO8BtELlvvKvP4RiKPcmfzhjPtnfqHqGw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, #"Sent Date" = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Sent Date", type datetime}, {"Status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Count", each _, type table [ID=nullable number, Sent Date=nullable datetime, Status=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Count],"Index",1)),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each List.Max([Custom][Index])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom1", "Custom", {"Sent Date", "Status", "Index"}, {"Custom.Sent Date", "Custom.Status", "Custom.Index"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Count"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.Sent Date", "Sent Date"}, {"Custom.Status", "Status"}, {"Custom.Index", "Index"}, {"Custom.1", "Max"}}),
#"Added Custom2" = Table.AddColumn(#"Renamed Columns", "CHECK", each if
[Index] = [Max] then 1 else null),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"Index", "Max"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns1",{{"CHECK", Int64.Type}})
in
#"Changed Type1"
Attached a sample file in the below, hopes to help you.
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @ManuApo ,
Based on your description, you need to create an index column in power query editor first, refer my sample file in the below.
If you want to achieve it by DAX, [Index] column could be retained and the calculated column is like this:
CHECK =
IF (
[Index]
= CALCULATE (
MAX ( 'Table'[Index] ),
FILTER ( ALL ( 'Table' ), 'Table'[ID] = EARLIER ( 'Table'[ID] ) )
),
1
)
If you just want the CHECK column withtout index column, you can do it by power query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZDPDoIwDIdfZdmZZP2zOenNl9AD4UDCgrtwICR7fUHRiEwSPTVpvvbXr1WlEQB1oQkIDBvyCq24UgCm5iWO13ZoUq/r4n+SUSGJgwd5Sk0cY9+pcwxpqpsBa+6bARe+G0Jof6PoSR0Mujmcl/A5M4N5wwpZLOWW8ZsJqONLZKO8BtELlvvKvP4RiKPcmfzhjPtnfqHqGw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, #"Sent Date" = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Sent Date", type datetime}, {"Status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"Count", each _, type table [ID=nullable number, Sent Date=nullable datetime, Status=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Count],"Index",1)),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each List.Max([Custom][Index])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom1", "Custom", {"Sent Date", "Status", "Index"}, {"Custom.Sent Date", "Custom.Status", "Custom.Index"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Count"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.Sent Date", "Sent Date"}, {"Custom.Status", "Status"}, {"Custom.Index", "Index"}, {"Custom.1", "Max"}}),
#"Added Custom2" = Table.AddColumn(#"Renamed Columns", "CHECK", each if
[Index] = [Max] then 1 else null),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"Index", "Max"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns1",{{"CHECK", Int64.Type}})
in
#"Changed Type1"
Attached a sample file in the below, hopes to help you.
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @v-yingjl ,
Thanks for your answer. It doesn't look very clear for me. I tried to look at the sample file, but I don't know how you create the index column.
So you mentioned I can do it without an Index column, but I don't see the solution in that case.
If the only way to do it is by creating an index column, how do I create an Index column based in the ID column. What if my dataset is not sorted by the ID column ? Will the index column, based in the ID column, work?
Thanks for your help once again.
Hi @ManuApo ,
In my sample file, both of two workarounds in my sample use index column, one of them using power query method just not show it in the data view.
To create index column, you can check these two steps in power query:
If your dataset not sorted by the ID column, it will not affect becasuse the group by feature is based on each field in the column. You can refer this document: Grouping or summarizing rows
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @ManuApo ,
I think something like this as the formula for your calculated column might work.
CalculatedCHECK =
VAR UID = 'Table1'[ID]
VAR LastSentDate =
CALCULATE (
MAX ( 'Table1'[Sent Date] ),
FILTER ( 'Table1', 'Table1'[ID] = UID )
)
RETURN
IF ( 'Table1'[Sent Date] = LastSentDate, 1 )
@ManuApo sorry, just realized I overlooked your requirement to only show the check in one of the rows, even though the rows are not unique.
This makes the problem a bit more challenging... Is there a need to actually retain those duplicate rows in the dataset?
Hi @ebeery thanks for your quick response.
Unfortunately I cant remove the duplicates, in the original dataset there is one column that make them different, so they are relevant. Also in the future the data is going to be refreshing constantly, I dont want to be modelling the data constantly( removing duplicates) which I believe with a calculated column it won't happen.
The solution that you gave, it is very close to the solution I am looking for ( This is my output in the table below) . What if we create a second column CHECK2? Where checks by ID if there is more than 1 ''1'' and return nulls to the others
ID Sent Date Status CHECK CHECK2
1001 27/03/2020 14:59:00 Withdrawn null null
1001 27/03/2020 14:59:00 Withdrawn null null
1001 31/03/2020 12:50:00 Awaiting null null
1001 07/04/2020 14:01:00 Agreed 1 null
1001 07/04/2020 14:01:00 Agreed 1 1
1002 15/06/2020 12:30:00 Viewed null null
1002 03/07/2020 13:42:00 Agreed 1 1
1003 30/03/2020 08:50:00 Withdrawn null null
1003 30/03/2020 17:19:00 Awaiting null null
1003 31/03/2020 10:52:00 Viewed null null
1003 07/04/2020 14:12:00 Agreed 1 null
1003 07/04/2020 14:12:00 Agreed 1 1
Thanks again @ebeery
User | Count |
---|---|
128 | |
109 | |
100 | |
65 | |
62 |
User | Count |
---|---|
137 | |
115 | |
102 | |
71 | |
61 |