Hello everyone,
I have one big table like this:
| file_id | procedure_ref | label | value |
|---|---|---|---|
| 10001 | procedure_1 | Date | 20 septembre 2019 |
| 10001 | procedure_1 | Voiture | VW |
| 10001 | procedure_1 | Nb de passagers | 3 |
| 10001 | procedure_1 | Nombre d'enfants | 2 |
| 10002 | procedure_1 | Date | 20 septembre 2019 |
| 10002 | procedure_1 | Voiture | BMW |
| 10002 | procedure_1 | Nb de passagers | 5 |
| 10002 | procedure_1 | Nombre d'enfants | 4 |
| 10003 | procedure_2 | Nom | Brabrahuy |
| 10003 | procedure_2 | Sexe de l enfant | M |
| 10003 | procedure_2 | Nom du sport | Tennis de table |
| 10003 | procedure_2 | Nom du club | Club_Tenis2 |
| 10004 | procedure_2 | Nom | Huyhuybra |
| 10004 | procedure_2 | Sexe de l enfant | F |
| 10004 | procedure_2 | Nom du sport | Football |
| 10004 | procedure_2 | Nom du club | Club_Football |
| 10005 | procedure_2 | Nom | Bratratra |
| 10005 | procedure_2 | Sexe de l enfant | M |
| 10005 | procedure_2 | Nom du sport | Football |
| 10005 | procedure_2 | Nom du club | Club_Football |
There are more than 200 procedures, sometimes with 78 different labels.
I have to split this table like this (one table per procedure), and users can add new procedures, so it must be automatic:
Table Procedure_1
| file_id | procedure_ref | Date | Voiture | Nb de passagers | Nombre d'enfants |
|---|---|---|---|---|---|
| 10001 | procedure_1 | 20 septembre 2019 | VW | 3 | 2 |
| 10002 | procedure_1 | 20 septembre 2019 | BMW | 5 | 4 |
Table Procedure_2
| file_id | procedure_ref | Nom | Sexe de l enfant | Nom du sport | Nom du club |
|---|---|---|---|---|---|
| 10003 | procedure_2 | Brabrahuy | M | Tennis de table | Club_Tenis2 |
| 10004 | procedure_2 | Huyhuybra | F | Football | Club_Football |
| 10005 | procedure_2 | Bratratra | M | Football | Club_Football |
Is it possible to do this in Power BI? I'm trying to use the M language but I can't find a way to accomplish it.
Thanks a lot.
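The transformation being asked for is essentially a filter followed by `Table.Pivot` per procedure. A minimal sketch for one procedure, assuming the long table above is loaded as a query named `BigTable` (a hypothetical name):

```m
let
    // Hypothetical source query holding the long key/value table above
    Source = BigTable,
    // Keep only the rows of one procedure
    Filtered = Table.SelectRows(Source, each [procedure_ref] = "procedure_1"),
    // Turn each distinct label into its own column, filled with the matching value
    Pivoted = Table.Pivot(Filtered, List.Distinct(Filtered[label]), "label", "value")
in
    Pivoted
```

The open question in the thread is how to avoid repeating this manually for every procedure.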
Here is my M code:

```m
(ProcedureName as text) =>
let
    Source = PostgreSQL.Database("ServerName", "DBName"),
    public_form_field_large_dataset = Source{[Schema = "public", Item = "form_field_large_dataset"]}[Data],
    #"Grouped Rows" = Table.Group(
        public_form_field_large_dataset,
        {"procedure_name"},
        {{"Tables", each _, type table [procedure_name = text, label = text, value = text, dossier_id = text]}}
    ),
    demarche_1 = #"Grouped Rows"{[procedure_name = ProcedureName]}[Tables],
    #"Pivoted Column" = Table.Pivot(demarche_1, List.Distinct(demarche_1[label]), "label", "value"),

    // Detect the dominant data type of a column from its text values
    FnDetectType = (tbl as table, ColumnName as text) =>
        let
            ListColumn = Table.Column(tbl, ColumnName),
            Total = List.Count(ListColumn),
            NumberPercentage = List.Count(List.Select(ListColumn, each Value.Is(Value.FromText(_), type number))) / Total,
            // Whole numbers only (the original code computed the same ratio as NumberPercentage here)
            IntPercentage = List.Count(List.Select(ListColumn, each Value.Is(Value.FromText(_), type number) and Value.FromText(_) = Number.Round(Value.FromText(_)))) / Total,
            DatePercentage = List.Count(List.Select(ListColumn, each Value.Is(Value.FromText(_), type date))) / Total,
            DateTimePercentage = List.Count(List.Select(ListColumn, each Value.Is(Value.FromText(_), type datetime))) / Total,
            Max = List.Max({NumberPercentage, DatePercentage, DateTimePercentage, IntPercentage}),
            Result =
                if Max < 0.95 then {ColumnName, type text}
                else if DateTimePercentage = Max then {ColumnName, type datetime}
                else if DatePercentage = Max then {ColumnName, type date}
                else if IntPercentage = Max then {ColumnName, Int64.Type}
                else {ColumnName, type number}
        in
            Result,

    ConvertTypesList = List.Transform(Table.ColumnNames(#"Pivoted Column"), each FnDetectType(#"Pivoted Column", _)),
    ConvertedTable = Table.TransformColumnTypes(#"Pivoted Column", ConvertTypesList)
in
    ConvertedTable
```
FnDetectType is a function that detects the data type of a column.
With this solution I have to manually enter the name of the procedure that I want to analyse (only the first time, or at each change of structure). It's acceptable and will require less manipulation for the client.
If the client wants to do it himself (for a new procedure, or if the structure of a procedure changes), he will have to:
- open the pbix
- go to the Query Editor
- click on the function
- enter the name of the procedure he wants to analyse
- rename the new query with the name of the procedure
- click on "Close & Apply"
- build some visuals in the report view
- publish the report
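The "enter the name of the procedure" step corresponds to invoking the function above. Assuming it is saved as a query named `GetProcedureTable` (a hypothetical name), the body of the new per-procedure query is just:

```m
let
    // GetProcedureTable is the (ProcedureName as text) => ... function above,
    // saved as its own query; the name here is an assumption
    Procedure_1 = GetProcedureTable("procedure_1")
in
    Procedure_1
```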
I don't think it could be simpler. If you have any ideas, don't hesitate.
Hi @balios_ald ,
You can use Group By and then add each table as a new query.
You can refer to this video: Power Query - Split Master Table into Sub Tables by unique Column Keys.
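The Group By approach from the video can be sketched as follows: group the long table by `procedure_ref`, then drill into one group per sub-table. This is a minimal sketch; `BigTable` and the sample column names are assumptions based on the data shown above:

```m
let
    // Hypothetical source query holding the long key/value table
    Source = BigTable,
    // One nested table per procedure
    Grouped = Table.Group(Source, {"procedure_ref"}, {{"Data", each _, type table}}),
    // Drill into one group; this is the manual step repeated for each sub-table
    Procedure_1 = Grouped{[procedure_ref = "procedure_1"]}[Data]
in
    Procedure_1
```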
Best Regards,
Icey
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Hi @Icey! Thank you so much! Your suggestion is really good and is nearly what I want. It works fine, but it is not really automatic: I would like not to repeat the same operations for each procedure.
I'm looking for a way to do so, but if you have any suggestions, that would be great!
I don't think there is a method in M that does that, especially when the values are dynamic.
But... why? I see no possible use for such a manipulation. Everything can be filtered (dynamically, too) in Power BI, so why?
Hi @Anonymous , thanks for your response.
I didn't fully explain my problem. I have one big table (the first one) and I have to let people who know less about Power BI do their own analysis and visuals with it.
To do so, my first solution was to pivot the table on the label values and create a table with a lot of columns; because each label is unique to a procedure, there are a lot of empty columns, and I think that in one or two years (maybe more) my system will become too slow.
Another solution is to give each procedure its own table, which resolves the problem of empty cells. Moreover, ideally, I wouldn't have to create a new table myself when the client creates a new procedure.
And my question is: how do I achieve that?
Well, generally speaking, empty columns are NOT a memory issue in Power BI due to the way data is compressed in memory. You can have 200 columns for 1,000,000 rows with empty values and your memory footprint will be close to zero.
@Anonymous After some testing with 70 columns and 1,000 rows in only one table, it's very slow. I don't think it's a sustainable solution.
@Icey I'm still looking for a way to automatically generate the new tables. For the moment I created a function where the name of the procedure must be specified. It's a workaround, but not quite the right solution.
70 columns x 1,000 rows should be very easy for Power BI.
I suspect that the culprits are your measures.
Hi @balios_ald ,
I suggest you accept your own reply above as the solution to your question, so that people who have the same question can find the solution directly.
Best Regards,
Icey