Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
balios_ald
Frequent Visitor

Split table in multiple tables automitcally based on category

Hello everyone, 

I have one big table like this : 

 

file_idprocedure_reflabelvalue
10001procedure_1Date20 septembre 2019
10001procedure_1VoitureVW
10001procedure_1Nb de passagers3
10001procedure_1Nombre d'enfants2
10002procedure_1Date20 septembre 2019
10002procedure_1VoitureBMW
10002procedure_1Nb de passagers5
10002procedure_1Nombre d'enfants4
10003procedure_2NomBrabrahuy
10003procedure_2Sexe de l enfantM
10003procedure_2Nom du sportTennis de table
10003procedure_2Nom du clubClub_Tenis2
10004procedure_2NomHuyhuybra
10004procedure_2Sexe de l enfantF
10004procedure_2Nom du sportFootball
10004procedure_2Nom du clubClub_Football
10005procedure_2NomBratratra
10005procedure_2Sexe de l enfantM
10005procedure_2Nom du sportFootball
10005procedure_2Nom du club

Club_Football

 

There are more than 200 procedures with sometimes 78 differents labels.

I  have to split this table like this (one table by procedure)  and users can add new procedure, so, it must be automatic : 

Table Procedure_1

file_idprocedure_refDateVoitureNb de passagersNombre d'enfants
10001procedure_120 septembre 2019VW32
10002procedure_120 septembre 2019BMW54

 

Table Procedure_2

file_idprocedure_refNomSexe de l enfantNom du sportNom du club
10003procedure_2BrabrahuyMTennis de tableClub_Tenis2
10004procedure_2HuyhuybraFFootballClub_Football
10005procedure_2BratratraMFootballClub_Football

 

Is it possible to do this in PowerBI ? I'm trying to use m language but i dont find a way to accomplish that. 

 

Thanks a lot. 

1 ACCEPTED SOLUTION

Here is my m code : 

(ProcedureName as text) =>  

let
    Source = PostgreSQL.Database("ServerName", "DBName"),
    public_form_field_large_dataset = Source{[Schema="public",Item="form_field_large_dataset"]}[Data],
    #"Grouped Rows" = Table.Group(public_form_field_large_dataset, {"procedure_name"}, {{"Tables", each _, type table [procedure_name=text, label=text, value=text, dossier_id=text]}}),
    demarche_1 = #"Grouped Rows"{[procedure_name=ProcedureName]}[Tables],
    #"Pivoted Column" = Table.Pivot(demarche_1, List.Distinct(demarche_1[label]), "label", "value"),
    Custom1 = let
    
FnDetectType =
    (#"Pivoted Column", ColumnName)=>
    let
        ListColumn = Table.Column(#"Pivoted Column",ColumnName),
        NumberPercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type number))
            ) / List.Count(ListColumn),
        IntPercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type number))
            ) / List.Count(ListColumn),
        DatePercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type date))
            ) / List.Count(ListColumn),
        DateTimePercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type datetime))
            ) / List.Count(ListColumn),
        Max = List.Max(
            {NumberPercentage, DatePercentage, DateTimePercentage, IntPercentage}
        ),
        Result =
            if Max < .95 then
                {ColumnName, type text}
            else if DateTimePercentage = Max then
                {ColumnName, type datetime}
            else if DatePercentage = Max then
                {ColumnName, type date}
            else if IntPercentage = Max then
                {ColumnName, Int64.Type}
            else
                {ColumnName, type number}
    in
        Result,

    ConvertTypesList =
        List.Transform(
            Table.ColumnNames(#"Pivoted Column"),
            each FnDetectType(#"Pivoted Column", _)
        ),
    ConvertedTable = Table.TransformColumnTypes(#"Pivoted Column", ConvertTypesList)
in
    ConvertedTable
in
    Custom1

FnDetectType is a function to detect the data type of a column. 

With this solution I have to manually enter the name of the procedure that I want to analyse (only the first time or at each change of structure). It's acceptable and will requiere less manipulation for the client. 

If the client wants to do it himself he will have to (for a new procedure or if there is a change in the structure of the procedure) : 
- open the pbix
- go to the Query Editor 
- clic on the function 
- enter the name of the procedure that he wants to analyse
- rename the new query with the name of the procedure
- clic on "apply and close"

- do some graphs in the visual view
- publish the report


I don't think I could be simplier. If you have any idea don't hesitate. 

View solution in original post

9 REPLIES 9
Icey
Community Support
Community Support

Hi @balios_ald ,

 

You can use Group By and then add each table as a new query.

You can refer to this video: Power Query - Split Master Table into Sub Tables by unique Column Keys.

groupby.PNGgroupby2.jpg

Best Regards,
Icey

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi @Icey  ! Thank you a lot ! Your suggestion is really good and is nearly what I want. It works fine but it is not really automatic. I would like to not do the same operations for each procedure. 

I'm looking for a way to do so but if you have any suggestion it would be great ! 

Anonymous
Not applicable

I don't think there is a method in M that does that, especially when values are dynamic.

 

But...why? I see no possible use of such a manipulation. Everything can be filtered (and dynamically) in PowerBI, so why?

Hi @Anonymous , thanks for your response. 

I dont explain fully my problem. I have one big table (the first one) and I have to let people that know less in Power BI do their own analysis and visuals with Power BI. 

To do so, my first solution was to pivot the table on the label value a create a table with a lot of columns and because one label is unique for a procedure, there is a lot of empty columns and I think that in one or two years (maybe more) my system will become too slow. 

 

Another solution is that each procedure is one table to resolve this problem of empty cells. Moreover, in a ideal case, I wouldnt have to create myself a new table when the client create a new procedure. 

 

And my question is how to achieve that ? 

Anonymous
Not applicable

Well, generally speaking empty columns are NOT an issue with memory in powerbi due to the way data is compressed in memory. You can have 200 columns for 1.000.000 rows with emtpy values and your memory footprint will be close to zero.

 

@Anonymous  after few testing with 70 columns and 1 000 rows in only one table, it's very slow. I don't think it's a sustainable solution. 

@Icey  I'm still looking to automatically generate new tables. For the moment a create a function where we must specify the name of the procedure.. it's an alternative but not totally the good solution. 

Anonymous
Not applicable

70 columns x 1000 rows should be sooo easy for powerbi. 
I suspect that the culprits are metrics.

Here is my m code : 

(ProcedureName as text) =>  

let
    Source = PostgreSQL.Database("ServerName", "DBName"),
    public_form_field_large_dataset = Source{[Schema="public",Item="form_field_large_dataset"]}[Data],
    #"Grouped Rows" = Table.Group(public_form_field_large_dataset, {"procedure_name"}, {{"Tables", each _, type table [procedure_name=text, label=text, value=text, dossier_id=text]}}),
    demarche_1 = #"Grouped Rows"{[procedure_name=ProcedureName]}[Tables],
    #"Pivoted Column" = Table.Pivot(demarche_1, List.Distinct(demarche_1[label]), "label", "value"),
    Custom1 = let
    
FnDetectType =
    (#"Pivoted Column", ColumnName)=>
    let
        ListColumn = Table.Column(#"Pivoted Column",ColumnName),
        NumberPercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type number))
            ) / List.Count(ListColumn),
        IntPercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type number))
            ) / List.Count(ListColumn),
        DatePercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type date))
            ) / List.Count(ListColumn),
        DateTimePercentage =
            List.Count(
                List.Select(ListColumn , each Value.Is(Value.FromText( _ ), type datetime))
            ) / List.Count(ListColumn),
        Max = List.Max(
            {NumberPercentage, DatePercentage, DateTimePercentage, IntPercentage}
        ),
        Result =
            if Max < .95 then
                {ColumnName, type text}
            else if DateTimePercentage = Max then
                {ColumnName, type datetime}
            else if DatePercentage = Max then
                {ColumnName, type date}
            else if IntPercentage = Max then
                {ColumnName, Int64.Type}
            else
                {ColumnName, type number}
    in
        Result,

    ConvertTypesList =
        List.Transform(
            Table.ColumnNames(#"Pivoted Column"),
            each FnDetectType(#"Pivoted Column", _)
        ),
    ConvertedTable = Table.TransformColumnTypes(#"Pivoted Column", ConvertTypesList)
in
    ConvertedTable
in
    Custom1

FnDetectType is a function to detect the data type of a column. 

With this solution I have to manually enter the name of the procedure that I want to analyse (only the first time or at each change of structure). It's acceptable and will requiere less manipulation for the client. 

If the client wants to do it himself he will have to (for a new procedure or if there is a change in the structure of the procedure) : 
- open the pbix
- go to the Query Editor 
- clic on the function 
- enter the name of the procedure that he wants to analyse
- rename the new query with the name of the procedure
- clic on "apply and close"

- do some graphs in the visual view
- publish the report


I don't think I could be simplier. If you have any idea don't hesitate. 

Hi @balios_ald ,

I suggest you to accept your own reply above as solution to your question so that people who may have the same question can get the solution directly.

 

Best Regards,

Icey

 

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.