Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
cole_lehmkuhler
Helper II
Helper II

High number of rows imported

So I've imported some data, which should have around 100,000 rows or so. When I imported it it was fine and had about the same amount of rows that I expected when I imported it from a CSV file. I manipulated the data and added some other data from other sources into my report. I merged some columns into the query that I was saying should have around 100,000 rows and made a few other changes to the query. I made some visualizations using this query which looked correct when compared to the data that my company normally sees. However, power bi started running really slow. I was looking for reasons to why it might be running so slow and the amount of rows in the "100,000 row" query had jumped up to 173,000,000 rows. Is this counting by cells now instead of rows or is there something seriously wrong with my data. I've been using power bi for a little while now and I have never seen this. It appears that there are actually 173,000,000 rows. Is there anything in particular that anyone could help with as to what I did wrong here? Thanks

8 REPLIES 8
ankitpatira
Community Champion
Community Champion

@cole_lehmkuhler Did you do merge queries ? If yes then when you did Merge Queries which join type did you use. I would expect number to be same if you've used Inner Join but with other joins such as Left Outer you will have number of records increased.

I did use left outer, but I don't understand how that should increase the number of records. Could you explain why this would increase the number of records? My thinking is that it takes all records from the first query and then it takes matching rows from the second query and populates the fields adding that column. Is this correct? if not could you help explain? Thanks!

hi,

 

if you have multiple records on the right side that match the record on the left side that you would end up with more records that you have started with i.e.

LT

ID

1

2

3

4

 

RT

ID

1

1

3

4

4

 

If you left join LT with RT on LT.ID = RT.ID you end up with the following

LT.ID   RT.ID

1          1

1          1

2          NULL

3          3

4          4

4          4

 

the merge is the most likely reason why you would get so many "extra" rows in your query.

 

regards,

radpir

Thank you that makes a lot of sense so I should probably use and inner join?

an*

@cole_lehmkuhler

 

You can use "Group by" feature in Query Editor. Just use Left Outer Join first, then group by ID.

 

I assume you have two tables below:

 

Capture1.PNG

Capture2.PNG

After you merge queries, it looks like:

Capture3.PNG

 

Then you can group by ID column.

 

Capture4.PNG

The result looks like:

Capture5.PNG

 

Regards,

you're going to have the same problem using the inner join.

the only difference it's going to make is to remove the "2   NULL" row (3rd row in my sample output).

 

if you want to match LT.ID to the first matched RT.ID (similar to what VLOOKUP does in Excel as it stops searching for other matches when it finds the first one) you would need to remove all duplicate rows from the RT table that only then do the merge step.

cole_lehmkuhler
Helper II
Helper II

I stand corrected instead of it having around 100,000 rows it was supposed to have around 250,000 rows and it did have the correct amount of rows at that point

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.