Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
ddumas
Frequent Visitor

power bi scripting (data load)

I am new to Power BI.  Is there a scripting component that allows me to write custom code to load all my data into a data model?

 

One example is that I may have 1 flat table with dimensions embedded in.  I wish to create a star schema "on-the-fly", by selecting distinct values from the flat table for each of the dimensions, and during this process, create integer (surrogate) keys to the dimensions.  Then I would load in the facts, with just keys and measures.  At the end, I would have:

 

On large but narrow fact table

Serveral dimensions

 

Thus creating a star schema.  I would then drop / not use the large wide original fact table .

 

I do not have the luxury of creating database tables in this scenario.  I can only read from the database or read fro a flat file.

 

Thanks,

 

Dave

2 ACCEPTED SOLUTIONS
MarcelBeug
Community Champion
Community Champion

Your "scripting component" is Power Query, accessible via the query editor.

 

Typically each query will result in 1 table, so you need (at least) as many queries as the number of resulting tables in your datamodel. You can also have so called "staging queries" with intermediate results that can be referenced by other queries for further transformations resulting in the actaul table in your datamodel. 

 

Notice that Power Query is more than just a "scripting component": it has a complete underlying programming language ("M").

Specializing in Power Query Formula Language (M)

View solution in original post

erik_tarnvik
Solution Specialist
Solution Specialist

This is fairly easy to do in Power Query as suggested by Marcel, let me offer a minimal example and you should be able to go from there. Assume we have a table called Products:

Product	Color
A	Blue
B	Red
C	Green
D	Blue

First, bring this table into Power Query. Remove column Product (and all other columns except the one you want to index in your real world table), select the remaining Color column and click Remove Rows - Remove Duplicates. You should now have a table with the unique colors. Create an index column for this table with Add Column - Index Column. Rename the index column to something appropriate such as ColorIndex. Rename the query you just created to Colors.

 

Now, bring in the Products table again using Recent Sources. Use the Merge Queries function and select the Color column from Products. Then select the Colors table and select the Color column from that table. Click OK. The result is a new column Colors with a table on each row. Expand the tables by clicking the icon in the upper right corner of the column header. Click OK in the dialog box. Remove the two remaining color text columns and leave only the numeric index column. You may want to rename that column to something sensible.

 

You are now left with a table Products with an index to the table Colors. Now rinse and repeat for all other dimensions in your fact table.

View solution in original post

2 REPLIES 2
erik_tarnvik
Solution Specialist
Solution Specialist

This is fairly easy to do in Power Query as suggested by Marcel, let me offer a minimal example and you should be able to go from there. Assume we have a table called Products:

Product	Color
A	Blue
B	Red
C	Green
D	Blue

First, bring this table into Power Query. Remove column Product (and all other columns except the one you want to index in your real world table), select the remaining Color column and click Remove Rows - Remove Duplicates. You should now have a table with the unique colors. Create an index column for this table with Add Column - Index Column. Rename the index column to something appropriate such as ColorIndex. Rename the query you just created to Colors.

 

Now, bring in the Products table again using Recent Sources. Use the Merge Queries function and select the Color column from Products. Then select the Colors table and select the Color column from that table. Click OK. The result is a new column Colors with a table on each row. Expand the tables by clicking the icon in the upper right corner of the column header. Click OK in the dialog box. Remove the two remaining color text columns and leave only the numeric index column. You may want to rename that column to something sensible.

 

You are now left with a table Products with an index to the table Colors. Now rinse and repeat for all other dimensions in your fact table.

MarcelBeug
Community Champion
Community Champion

Your "scripting component" is Power Query, accessible via the query editor.

 

Typically each query will result in 1 table, so you need (at least) as many queries as the number of resulting tables in your datamodel. You can also have so called "staging queries" with intermediate results that can be referenced by other queries for further transformations resulting in the actaul table in your datamodel. 

 

Notice that Power Query is more than just a "scripting component": it has a complete underlying programming language ("M").

Specializing in Power Query Formula Language (M)

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.