molegris
Advocate III

Best practice for getting data from csv file

Hi,

 

I just heard about Power BI's ability to connect to CSV files located on SharePoint Online and then automatically create a dataset, and I'm wondering if there would be some benefit to updating my current ETL strategy. I'm not concerned about refreshing the data because we get updated data only once a month, but I'd like to be sure I'm using Microsoft products for what they were intended for, with the optimal architecture.

 

Currently, I use 3 dataflows to import data into the Power BI service.

All are sourced from CSV files located in SharePoint Online,

with a few transformations (renamed headers, a few calculated columns and some value replacement).

 

  1. Fact table #1 : ~500k rows
  2. Fact table #2 : ~60k rows
  3. 10 dimension tables, 1 dimension has ~20k rows but the others are all under 1k.

Then I use 2 datasets.  In each dataset I created a star model with 1 fact table and its related dimensions.

Most transformations and data prep steps are done in the datasets.

Those datasets are used for many official reports and are available to analysts across the organisation.

 

Should I drop the dataflows and rethink the datasets to connect directly to the CSV files located in SharePoint Online, or is the way I designed it actually more robust?

 

Thank you

--mo

 

1 ACCEPTED SOLUTION
v-easonf-msft
Community Support

Hi, @molegris 

Although a dataset can get data directly from a data source, it is a best practice for a shared dataset to get its data from dataflows. This supports a multi-developer implementation of Power BI.

For more details, please refer to this document.

dataflow-vs-dataset 
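
To make the idea concrete, here is a rough sketch in plain Python (not Power Query, and not how Power BI implements it) of why a shared prep step helps: the CSV is cleaned once and then reused by more than one model. The sample data, the column names, and the functions `prepare_products` and `build_dataset` are all hypothetical and only stand in for a dataflow and the datasets that read from it.

```python
import csv
import io

# Hypothetical CSV content standing in for a file stored on SharePoint Online.
RAW_PRODUCTS = """prod_id,prod_nm,unit_cost,unit_price
1,Widget,2.00,5.00
2,Gadget,N/A,9.50
"""

# Hypothetical header renames (source name -> friendly name).
HEADER_MAP = {
    "prod_id": "ProductId",
    "prod_nm": "ProductName",
    "unit_cost": "UnitCost",
    "unit_price": "UnitPrice",
}


def prepare_products(raw_text):
    """Shared prep step (the 'dataflow' role): clean the CSV once."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        # 1. Rename headers.
        row = {HEADER_MAP[name]: value for name, value in record.items()}
        # 2. Replace placeholder values before converting types.
        if row["UnitCost"] == "N/A":
            row["UnitCost"] = "0"
        row["ProductId"] = int(row["ProductId"])
        row["UnitCost"] = float(row["UnitCost"])
        row["UnitPrice"] = float(row["UnitPrice"])
        # 3. Add a calculated column.
        row["Margin"] = round(row["UnitPrice"] - row["UnitCost"], 2)
        rows.append(row)
    return rows


def build_dataset(fact_rows, products):
    """Consumer (the 'dataset' role): attach the shared dimension to a fact table."""
    by_id = {p["ProductId"]: p for p in products}
    return [
        {**fact, "ProductName": by_id[fact["ProductId"]]["ProductName"]}
        for fact in fact_rows
    ]


# The dimension is prepared once and reused by two separate models.
products = prepare_products(RAW_PRODUCTS)
sales = build_dataset([{"ProductId": 1, "Qty": 3}], products)
returns = build_dataset([{"ProductId": 2, "Qty": 1}], products)
```

If the cleaning rules change, only `prepare_products` needs editing and both consumers pick up the change, which is the same reason a shared dataflow is easier to maintain when several developers build datasets on it.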

 

Best Regards,
Community Support Team _ Eason


2 REPLIES

Thank you, this was very useful.

I find that the Microsoft documentation is very good at explaining how to use their products, but it's very weak at explaining "Why would we use this product?", "What was it made to do?", "What is the product's place in a BI ecosystem?", etc.

 

Fortunately, there are some good bloggers and a great community!  🙂
