Hi there,
The dataset has columns for ID, start time, and end time, and contains tens of thousands of rows.
To count the number of simultaneous events per minute, I wrote the queries below, but their performance is poor.
Is there a better way?
#DataSet
#Query
SampleTime
let
    // Load the CSV and set column types
    Source = Csv.Document(File.Contents("C:\SampleTime.csv"), [Delimiter=",", Columns=3, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    Header = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    ConvertType = Table.TransformColumnTypes(Header, {{"Id", Int64.Type}, {"StartDate", type datetime}, {"EndDate", type datetime}}),
    // Round StartDate down and EndDate up to whole minutes (24*60 minutes per day)
    Custom = Table.AddColumn(ConvertType, "Round_StartDate", each DateTime.From(Number.RoundDown(Number.From([StartDate]) * (24*60)) / (24*60))),
    Custom1 = Table.AddColumn(Custom, "Round_EndDate", each DateTime.From(Number.RoundUp(Number.From([EndDate]) * (24*60)) / (24*60)))
in
    Custom1
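(For readers outside Power Query: the two rounding steps above multiply the datetime's serial value by 1440 minutes per day, round, and divide back. A minimal Python sketch of the same floor/ceil-to-minute logic, assuming naive datetimes; the function names are illustrative, not from the M code.)

```python
from datetime import datetime, timedelta

def floor_to_minute(dt: datetime) -> datetime:
    # Drop seconds and microseconds, analogous to Number.RoundDown on minutes
    return dt.replace(second=0, microsecond=0)

def ceil_to_minute(dt: datetime) -> datetime:
    # Round up to the next whole minute, analogous to Number.RoundUp;
    # a datetime already on a minute boundary is returned unchanged
    floored = dt.replace(second=0, microsecond=0)
    return floored if floored == dt else floored + timedelta(minutes=1)

start = floor_to_minute(datetime(2024, 1, 1, 9, 15, 42))  # 09:15:00
end = ceil_to_minute(datetime(2024, 1, 1, 9, 47, 5))      # 09:48:00
```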
Concurrent
let
    // Build one row per minute spanning the full date range of the data
    StartDateTime = DateTime.From(Date.From(List.Min(SampleTime[StartDate]))),
    EndDateTime = Date.AddDays(DateTime.From(Date.From(List.Max(SampleTime[EndDate]))), 1),
    Source = List.DateTimes(StartDateTime, Duration.TotalMinutes(EndDateTime - StartDateTime), #duration(0, 0, 1, 0)),
    ConvertToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    RenameColumn = Table.RenameColumns(ConvertToTable, {{"Column1", "DateTime"}}),
    ConvertType = Table.TransformColumnTypes(RenameColumn, {{"DateTime", type datetime}}),
    // Buffer the event table, then count the events active at each minute
    TableBuffer = Table.Buffer(SampleTime),
    Custom = Table.AddColumn(ConvertType, "Concurrent", each Table.RowCount(Table.SelectRows(TableBuffer, (x) => x[Round_StartDate] <= [DateTime] and x[Round_EndDate] >= [DateTime])))
in
    Custom
Your Custom step is iterating through the entire TableBuffer and filtering for each of the 100k+ rows of ConvertType. Not terribly efficient.
I propose the following redefinition of Concurrent:
let
    Source = SampleTime,
    // For each event, generate one timestamp per minute it covers
    #"Added Custom" =
        Table.AddColumn(Source, "Custom", each
            List.DateTimes(
                [Round_StartDate],
                Duration.TotalMinutes([Round_EndDate] - [Round_StartDate]),
                #duration(0, 0, 1, 0))
        ),
    #"Removed Columns" = Table.SelectColumns(#"Added Custom", {"Id", "Custom"}),
    // One row per event per minute; then count distinct events in each minute
    #"Expanded Custom" = Table.ExpandListColumn(#"Removed Columns", "Custom"),
    #"Grouped Rows" = Table.Group(#"Expanded Custom", {"Custom"}, {{"IdCount", each Table.RowCount(Table.Distinct(_)), Int64.Type}})
in
    #"Grouped Rows"
This evaluates much faster.
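For anyone who wants to sanity-check the approach outside Power Query, here is a small Python sketch of the same expand-and-group idea (illustration only, not the M code): each event emits one timestamp per minute it covers, and a Counter tallies how many events touch each minute. Like `List.DateTimes` above, it covers the half-open range from the rounded start up to (but not including) the rounded end.

```python
from collections import Counter
from datetime import datetime, timedelta

def concurrency_per_minute(events):
    """events: iterable of (start, end) datetimes already rounded to whole minutes.
    Returns a Counter mapping each minute to the number of events active in it."""
    counts = Counter()
    for start, end in events:
        t = start
        while t < end:  # half-open [start, end), mirroring List.DateTimes
            counts[t] += 1
            t += timedelta(minutes=1)
    return counts

events = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 2)),  # covers 09:00, 09:01
    (datetime(2024, 1, 1, 9, 1), datetime(2024, 1, 1, 9, 3)),  # covers 09:01, 09:02
]
counts = concurrency_per_minute(events)  # 09:01 is covered by both events
```

The key difference from the original query is the work done: expanding events into minutes costs roughly (number of events × average duration in minutes), whereas filtering the whole event table once per minute of the timeline costs (minutes × events), which is far larger for a long timeline.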
Thank you for your advice!
The queries are much faster now!