Solved: Generate data from existing data and appending to ...

freezeen3 · ‎07-10-2022

Hi guys, so I would like to generate data based on some aggregates of known data as well as combine these into a single table for easier visualization.

Current situation

The current data is structured like the following:

For each entry, it contains

Entry_id, Unit, Job, User, Position, Man-hour

Currently I have a visual matrix basically directly using one table (with some minor direct referencing on some tables like getting the Position of User but overall it's still using just one table)

The rows are as follows where more to the left, the higher up the hierarchy of the grouping.

Unit

Job

User

The column contains the Position (each User corresponds to only 1 Position), and the value contains the Man-hour.

Each Job can only be associated with one Unit, i.e. no two units working on the same job.

To give a better visualization, it currently looks like the following

			Position1	Position2	Position3
Unit1							Total Man-hours for Unit1 = 460
	Job1					Total Man-hours for job1 = 320
		User1	100
		User2		150
		User3	50
		User4			20
	Job2					Total for job2 = 140
		User1	50
		User7		20
		User8		70
Unit2							Total Man-hours for Unit2
	Job3					Total for job3 = 170
		User9	20
		User10		50
		User11			100
	...
...							Grand total man-hours G

There is no aggregation or group by in the input data table and such aggregation is automatically done by the matrix visual in Power BI.

Problem Statement

Now I would like to add UserX, such that this User has a special Position called PositionX. The man-hours of UserX do not come from real recording, but rather from a scaling according to the total man-hours of other Users. I want to append the allocation of such artificially generated man-hours of UserX by proportion to the man-hours of each job and that in the end these generated man-hours sum up to a certain constant predefined value, let's denote as P.

For example, let's say in job1 with total of 320 man-hours in total by Users doing the job, UserX would get allocated a value of

(predefined total sum for UserX) * (total man-hours for job 1 by everyone else)/(grand total man-hours of all Units) = P * (320/G) for job 1. Similarly, the allocated value would be P * (140/G) for job 2, and so on.

To better visualize: (important changes in bold, the 320+320P/G may not be as important, keeping both the original 320 and 320P/G can also be a way out)

			Pos1	Pos2	Pos3	PosX
Unit1								460+460P/G
	Job1						320+320P/G
		User1	100
		User2		150
		User3	50
		User4			20
		UserX				320P/G
	Job2
		User1	50
		User7		20
		User8		70
		UserX				140P/G

How can I achieve this?

My attempt:

Here are some thoughts I have.

Should I somehow obtain the aggregated data form (not only through the automatic visualization aggregates, but obtain them in stored data format that I can use conveniently instead of just for display) summing the Man-hours by Job first?

So that I can get usable data of the form: R1(Unit, Job, Summed_Manhour)

Then I can generate User Entries specifically for UserX and PosX of the form: R2(Entry_id, Unit, Job, User=UserX, Position=PosX, Man-hour) where the Man-hour would be calculated from the above Summed_Manhour. I also need to somehow store the grand total as a static constant for me to use in such calculations.

In SQL I can think of doing something like

SELECT Unit, Job, SUM(Man-hour)

FROM User_Entry

GROUP BY Unit, Job

to obtain the first usable data form R1.

Maybe I should do something like SUMMARIZE or ALLEXCEPT in DAX for the SQL equivalent of GROUP BY?
Then I am not sure how to proceed to retrieve information for division needed in geetting R2.

And in the end I need to do an append new of R1 with the original User_Entry table, then drag the corresponding table fields into the matrix visual.

I'm also not sure if this approach I'm using i.e. putting everything in a single table for displaying on the matrix visual is a good one.

Appreciate any help!

Please let me know if there is anything unclear.

Thank you.

v-jianboli-msft · ‎07-12-2022

Hi @freezeen3 ,

You can create a new table, which is based on the original table, to achieve this purpose.

First, create a measure for P:

P value = 100

The create the new table:

New Table =

VAR _s1 =

    SUMMARIZE (

        'Table',

        'Table'[Unit],

        [Job],

        [User],

        'Table'[Poisition],

        'Table'[Man-hour]

    )

VAR _s2 =

    ADDCOLUMNS (

        SUMMARIZE ( 'Table', [Unit], 'Table'[Job] ),

        "User", "UserX",

        "Poisirion", "PoisitionX",

        "Man-hour",

            [P value]

                * CALCULATE (

                    SUM ( 'Table'[Man-hour] ),

                    FILTER (

                        'Table',

                        [Unit] = EARLIER ( 'Table'[Unit] )

                            && 'Table'[Job] = EARLIER ( 'Table'[Job] )

                    )

                )

                / SUM ( 'Table'[Man-hour] )

    )

RETURN

    UNION ( _s1, _s2 )

Create hierarchy for the New Table:

Apply these values to a matrix

Final output:

Best Regards,

Jianbo Li

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

v-jianboli-msft · ‎07-12-2022