Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Daniel_L
Frequent Visitor

Import complex CSV file

Hello,

 

Help needed... I have a quite messy complex CSV file that I'm trying to import to Power BI. I guess some pivoting/unpivoting and fill commands are needed, but I don't know how to get it done.

 

This is what the source CSV file looks like:

CategoryAccountTextCodeAmountUnitPrice per unitTotal
Financial account47787Org B    4107871
Main activity Head orgAC.771pcs2750033 
Resource Res 12 4pcs50200
Sub activity Annual costs 1year57941522749833
Sub activity:2 Cat ABC 2pcs433095792075
Resource Res 1 5pcs141415707075
Resource Res 44 1pcs8500085000
Sub activity:2 Cat DEF 2pcs482091498474
Resource Res 45 45kg6890310050
Resource Res 48 1pcs188424188424
Sub activity:2 Cat EFG 5pcs293423,482551
Resource Res AB 18m3686624
Resource Res R12 1pcs14521452
Resource Res R11 3pcs2250067500
Resource Res R03 155m456975
Sub activity:2 Tesla 1pcs454920454920
Resource Res R99 1pcs454920454920
Sub activity:2 Chevy 5pcs241108523758
Resource Res 3 2pcs261579523158
Resource Res 21 40m15600
Sub activity:2 Car 2 2pcs199027,5398055
Resource Res 98 1pcs170595170595
Resource Res T4 1pcs170595170595
Resource Res EE 5pcs1137356865
Main activity Main act 3SS.501pcs1357838 
Sub activity Subact RR2 3pcs113730342040
Resource Res 5 2pcs170595341190
Resource Res GG 1pcs850850
Sub activity:2 Annual cat B 1year9696096960
Resource Res ert 30weeks323296960
Sub activity Res cat G 1year10157981015798
Sub activity:2 Res cat H 1year5291141015798
Sub activity:3 Other res 6pcs1693001015798
Resource Res JK 2pcs3300066000
Resource Res RR 22pcs25000550000
Resource Res RW 14pcs28557399798
Financial account11474Org A    405910
Main activity Organization ACB.331pcs405910 
Sub activity Act OOK 1year405910405910
Sub activity:2 Subact AA 2pcs201982403964
Resource Resource 22 1pcs348116348116
Resource Resource 23 1pcs5584855848
Sub activity:2 Act EETRE 2pcs9731946
Resource Unit 556 3pcs5541662
Resource Unit 1323 2pcs142284

 

And this is the goal after import:

Financial accountCodeMain activitySub activitySub activity:2Sub activity:3ResourceAmountUnitPrice per unit
47787AC.77Head org   Res 124pcs50
47787AC.77Head orgAnnual costsCat ABC Res 15pcs141415
47787AC.77Head orgAnnual costsCat ABC Res 441pcs85000
47787AC.77Head orgAnnual costsCat DEF Res 4545kg6890
47787AC.77Head orgAnnual costsCat DEF Res 481pcs188424
47787AC.77Head orgAnnual costsCat EFG Res AB18m368
47787AC.77Head orgAnnual costsCat EFG Res R121pcs1452
47787AC.77Head orgAnnual costsCat EFG Res R113pcs22500
47787AC.77Head orgAnnual costsCat EFG Res R03155m45
47787AC.77Head orgAnnual costsTesla Res R991pcs454920
47787AC.77Head orgAnnual costsChevy Res 32pcs261579
47787AC.77Head orgAnnual costsChevy Res 2140m15
47787AC.77Head orgAnnual costsCar 2 Res 981pcs170595
47787AC.77Head orgAnnual costsCar 2 Res T41pcs170595
47787AC.77Head orgAnnual costsCar 2 Res EE5pcs11373
47787AC.77Main act 3Subact RR2Car 2 Res 52pcs170595
47787AC.77Main act 3Subact RR2Car 2 Res GG1pcs850
47787AC.77Main act 3Subact RR2Annual cat B Res ert30weeks3232
47787AC.77Main act 3Res cat GRes cat HOther resRes JK2pcs33000
47787AC.77Main act 3Res cat GRes cat HOther resRes RR22pcs25000
47787AC.77Main act 3Res cat GRes cat HOther resRes RW14pcs28557
11474CB.33Organization AAct OOKSubact AA Resource 221pcs348116
11474CB.33Organization AAct OOKSubact AA Resource 231pcs55848
11474CB.33Organization AAct OOKAct EETRE Unit 5563pcs554
11474CB.33Organization AAct OOKAct EETRE Unit 13232pcs142

 

Described in words: I wan't a table of all Resources with corresponding attributes which are fetched from lines above in the CSV.

 

Sample CSV file

 

Excel file with explainations, wanted result, hierarchy explanation

1 ACCEPTED SOLUTION
dufoq3
Super User
Super User

Hi @Daniel_L, different approach here.

 

In your expected result - there is a mistake in my opinion. This one should be correct:

Result

dufoq3_0-1713970849458.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVZbb5swFP4rKM9R5dvxZW8kI6k2TZHSTnuo+sAy1KJ2ZCKkU/frd+zYQGuMqgh8IHznO3f77m6xqZuyOdTlc1YeDsdz0y2WC6GUVrju2odshWv27hKU4Ad0cb+8W3wr6wahXf1Sd6/+/+uq/JUd2wcU8/WVsqooXn8OJ7wzBYRw7j61CvbV6XhuD5XH4mNGWSDqUUAslBAHuTn/fE+ZN80ZfTgcT93Jv7KUr1XZWrQyggJz7MJoZH+v5lOgXJddlq/W/on1BgjOiQEUlGFEQdJ0L0MPpAJ/DkjUDFCIkdkXpMZAkX6dtfhzsYkt1owYqw09FkokiSEE2wpPNmlSG8vLKSFAkjgdGUy1FkwMwqzJxWYbxYoZLhhfWhWaAdAUdx6qklojfltbpZWkZEk/931VjQwWrijckoSFlPLBTAYuMdKWchJIeOAD8Fa6EEvjq2AyNLfV6bmMDBUgsOwGIcVpzIeg00l5rF5CO41SIiglNrbAuAKdYg6+DuXHJMW2uwBpGshCeAXxQXLNIucrvs1YREiNIUwtLZobTSDZamaichUB191eSCBv4yb9ILIo4sFAubJhA6klJGdpeOcifHNz5QbhiJ6D0vziUGI04iuL3+9DyPhbE1yrC0ZEsqzChGCx1xzLwySB29Dib4aavyfTG2Y5DonVCO9nuZFGkn5NEFdtF5y13/6tqidLzhlnb6AT4bJwS72NqSmxJa1HUtKHoOU61gLMUCrmlIRm2nWPVZu1VdjR5BB/abgbQWMdE3H48jVKHe5jl+Elyczw2gfcqKH9dgR2TSN/BI+HzZtpAOXa0gRbp44dGBVlQfbYkS8mjh1Yc5QkOwVhZVP/K7v62Dj8enXlzhmjWXjRMNMsOXbKbheiNspaDx1ZMZl532558GAUQEKNZk4FNzK1UbnHjMWbFReaUjkIs3AewQG027Iva7r50PaiuN0XkfnGTStqxCT196buMgAZzRgAV+u4NSdhlLN4/6DCndc0xun+Pw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, Account = _t, Text = _t, Code = _t, Amount = _t, Unit = _t, #"Price per unit" = _t, Total = _t]),
    ReplaceBlankToNull = Table.TransformColumns(Source, {}, each if _ is text and Text.Trim(_) = "" then null else _),
    FilledDown = Table.FillDown(ReplaceBlankToNull,{"Account", "Code"}),
    FilteredRows = Table.SelectRows(FilledDown, each not List.Contains({null, "Financial account"}, [Category])),
    Categories = List.Buffer(List.Distinct(List.Select(FilteredRows[Category], each _ <> "Resource"))),
    Ad_Columns = List.Accumulate(
        Categories,
        FilteredRows,
        (s,c)=> Table.AddColumn(s, c, each if List.Contains({c, "Main activity"}, [Category]) then [Text] else null, type text) ),
    FilledDown2 = Table.FillDown(Ad_Columns, Categories),
    FilteredRows2 = Table.SelectRows(FilledDown2, each ([Category] = "Resource")),
    ReplaceValues = Table.ReplaceValue(FilteredRows2,
        each [Main activity],
        each null,
        (x,y,z)=> if x = y then z else x,
        List.Skip(Categories) ),
    RestoreTypes = Value.ReplaceType(ReplaceValues, Value.Type(FilteredRows2)),
    RemovedColumns = Table.RemoveColumns(RestoreTypes,{"Category", "Total"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"Account", "Financial account"}, {"Text", "Resource"}}),
    ReorderedColumnsDynamic = Table.ReorderColumns(RenamedColumns,{"Financial account", "Code"} & Categories & {"Resource", "Amount", "Unit", "Price per unit"}),
    ChangedType = Table.TransformColumnTypes(ReorderedColumnsDynamic,{{"Financial account", Int64.Type}, {"Code", type text}, {"Resource", type text}, {"Amount", type number}, {"Unit", type text}, {"Price per unit", Currency.Type}})
in
    ChangedType

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

5 REPLIES 5
dufoq3
Super User
Super User

Hi @Daniel_L, different approach here.

 

In your expected result - there is a mistake in my opinion. This one should be correct:

Result

dufoq3_0-1713970849458.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVZbb5swFP4rKM9R5dvxZW8kI6k2TZHSTnuo+sAy1KJ2ZCKkU/frd+zYQGuMqgh8IHznO3f77m6xqZuyOdTlc1YeDsdz0y2WC6GUVrju2odshWv27hKU4Ad0cb+8W3wr6wahXf1Sd6/+/+uq/JUd2wcU8/WVsqooXn8OJ7wzBYRw7j61CvbV6XhuD5XH4mNGWSDqUUAslBAHuTn/fE+ZN80ZfTgcT93Jv7KUr1XZWrQyggJz7MJoZH+v5lOgXJddlq/W/on1BgjOiQEUlGFEQdJ0L0MPpAJ/DkjUDFCIkdkXpMZAkX6dtfhzsYkt1owYqw09FkokiSEE2wpPNmlSG8vLKSFAkjgdGUy1FkwMwqzJxWYbxYoZLhhfWhWaAdAUdx6qklojfltbpZWkZEk/931VjQwWrijckoSFlPLBTAYuMdKWchJIeOAD8Fa6EEvjq2AyNLfV6bmMDBUgsOwGIcVpzIeg00l5rF5CO41SIiglNrbAuAKdYg6+DuXHJMW2uwBpGshCeAXxQXLNIucrvs1YREiNIUwtLZobTSDZamaichUB191eSCBv4yb9ILIo4sFAubJhA6klJGdpeOcifHNz5QbhiJ6D0vziUGI04iuL3+9DyPhbE1yrC0ZEsqzChGCx1xzLwySB29Dib4aavyfTG2Y5DonVCO9nuZFGkn5NEFdtF5y13/6tqidLzhlnb6AT4bJwS72NqSmxJa1HUtKHoOU61gLMUCrmlIRm2nWPVZu1VdjR5BB/abgbQWMdE3H48jVKHe5jl+Elyczw2gfcqKH9dgR2TSN/BI+HzZtpAOXa0gRbp44dGBVlQfbYkS8mjh1Yc5QkOwVhZVP/K7v62Dj8enXlzhmjWXjRMNMsOXbKbheiNspaDx1ZMZl532558GAUQEKNZk4FNzK1UbnHjMWbFReaUjkIs3AewQG027Iva7r50PaiuN0XkfnGTStqxCT196buMgAZzRgAV+u4NSdhlLN4/6DCndc0xun+Pw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, Account = _t, Text = _t, Code = _t, Amount = _t, Unit = _t, #"Price per unit" = _t, Total = _t]),
    ReplaceBlankToNull = Table.TransformColumns(Source, {}, each if _ is text and Text.Trim(_) = "" then null else _),
    FilledDown = Table.FillDown(ReplaceBlankToNull,{"Account", "Code"}),
    FilteredRows = Table.SelectRows(FilledDown, each not List.Contains({null, "Financial account"}, [Category])),
    Categories = List.Buffer(List.Distinct(List.Select(FilteredRows[Category], each _ <> "Resource"))),
    Ad_Columns = List.Accumulate(
        Categories,
        FilteredRows,
        (s,c)=> Table.AddColumn(s, c, each if List.Contains({c, "Main activity"}, [Category]) then [Text] else null, type text) ),
    FilledDown2 = Table.FillDown(Ad_Columns, Categories),
    FilteredRows2 = Table.SelectRows(FilledDown2, each ([Category] = "Resource")),
    ReplaceValues = Table.ReplaceValue(FilteredRows2,
        each [Main activity],
        each null,
        (x,y,z)=> if x = y then z else x,
        List.Skip(Categories) ),
    RestoreTypes = Value.ReplaceType(ReplaceValues, Value.Type(FilteredRows2)),
    RemovedColumns = Table.RemoveColumns(RestoreTypes,{"Category", "Total"}),
    RenamedColumns = Table.RenameColumns(RemovedColumns,{{"Account", "Financial account"}, {"Text", "Resource"}}),
    ReorderedColumnsDynamic = Table.ReorderColumns(RenamedColumns,{"Financial account", "Code"} & Categories & {"Resource", "Amount", "Unit", "Price per unit"}),
    ChangedType = Table.TransformColumnTypes(ReorderedColumnsDynamic,{{"Financial account", Int64.Type}, {"Code", type text}, {"Resource", type text}, {"Amount", type number}, {"Unit", type text}, {"Price per unit", Currency.Type}})
in
    ChangedType

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Daniel_L
Frequent Visitor

After viewing the provided solution a bit more I also see that SubActivity:2 is missing SubActivity. I.e. "Cat DEF" should have SubActivity = "Annual costs".

I have changed the Code, please check it again.

Daniel_L
Frequent Visitor

Wow, thank you very much for providing this solution! Just one minor detail missing, the column for the resource name. I guess I'll figure that out.

wdx223_Daniel
Super User
Super User

 

 

let
    Source = Csv.Document(File.Contents("Your_Csv_File_Address"),[Delimiter=";", Columns=10,QuoteStyle=QuoteStyle.None]),
    Custom1 = let
                 t=Table.PromoteHeaders(Source)
              in
                 Table.FromPartitions(
                                      "Financial Account",
                                      Table.ToRows(
                                                   Table.Group(
                                                               t,
                                                               "Account",
                                                               {
                                                                "c1",
                                                                each Table.Combine(
                                                                                   Table.Group(
                                                                                               Table.Skip(_),
                                                                                               "Category",
                                                                                               {
                                                                                                "c2",
                                                                                                each let
                                                                                                        a=Table.Group(
                                                                                                                      Table.Skip(_),
                                                                                                                      "Category",
                                                                                                                      {"c3",each _},
                                                                                                                      0,
                                                                                                                      (x,y)=>Byte.From((Text.StartsWith(x,"Sub activity") and Text.StartsWith(y,"Resource")) or (Text.StartsWith(y,"Sub activity") and Text.StartsWith(x,"Resource")))
                                                                                                                     ),
                                                                                                        b=[Main activity=[Text]{0},Code=[Code]{0}]
                                                                                                     in
                                                                                                        List.Accumulate(
                                                                                                                        Table.ToRows(a),
                                                                                                                        {#table(0,{}),b},
                                                                                                                        (x,y)=>if Text.StartsWith(y{0},"Resource")
                                                                                                                               then {x{0}&Table.FromRecords(Table.TransformRows(y{1},each x{1}&[Resource=[Text]]&[[Amount],[Unit],[Price per unit],[Total]])),x{1}}
                                                                                                                               else {x{0},x{1}&Record.FromTable(Table.FromColumns({y{1}[Category],y{1}[Text]},{"Name","Value"}))}
                                                                                                                       ){0}
                                                                                               },
                                                                                               0,
                                                                                               (x,y)=>Byte.From(Text.StartsWith(y,"Main activity"))
                                                                                              )[c2]
                                                                                  )
                                                               },
                                                               0,
                                                               (x,y)=>Byte.From(y<>"" and y<>null)
                                                              )
                                                  )
                                     )
in
    Custom1

 

 

wdx223_Daniel_0-1713931402927.png

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors