Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Files with 3 days of data in a folder

Hello All,

 

I have a folder with daily sales uploaded in .csv files.

The daily files are uploaded for previous 3 dates. When I pick this folder in Power BI ,there is repitition of data.

 

For eg: The file uploaded on 1st sep has data for: 29th aug,30th aug,31st aug

             The file uploaded on 2nd sep has data for: 30th aug,31st aug, 1st sep

            Hence, there is duplicate data for 30th aug & 31st aug in our database. How do we pick the lastest data for these 2 dates i.e the one uploaded on 2nd sep.

 

Need urgent help.

 

4 REPLIES 4
sturlaws
Resident Rockstar
Resident Rockstar

Hi,

 

you could do this in Power Query. I assume you have used Folder as data source. If you click at the first step in the Applied Steps box, you'll notice that there is a Date Created field. You can use this to find the last dataset for each date.

You will have to make use of the advanced editor. Locate the row which begins with #"Removed Other Columns1".
Inside the brackets add "Date created", and click Done.

 

Then go to the Transform tab of Power Query. On the left there is a button which says Group By. Click advanced. Group by the the date field of you input files. In the New column name box, type in something like LastCreatedDay, Set Operation to Max and Column to Date Created.

Then click on Add aggregation and add a name, like GroupedDate. Set Operation for this to All Rows, and click OK. Expand GroupedDate, and remove GroupedDate.Dato.

Create a new custom column, with the formula IsLastUpdate = if [LastCreatedDay]=[Date created] then 1 else 0, and click ok. Now filter this column on equal to 1, and you will have the last update for each date.

Anonymous
Not applicable

Hi @sturlaws 

 

Iam not getting the desired output.

I hope you have understand the requirement.

For 30th aug & 31st aug I'am getting duplicate data. Instead, I want the data only for the lastest .csv file.

How do i ensure that to pick the dates 30th aug & 31st aug from the latest file.

Take a look here:

power query

Or, if you can provide a sample set of anonymized files, I can send you back a pbix-file with the solution

Anonymous
Not applicable

Hi @sturlaws 

 

Where should I write "date created"?

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.