Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello,
I have multiple csv files that I want to load at once and combined with each other. These tables aren't typically formatted. I need to extract the date from the first second row as a additional column (to distinguish which csv I have) and afterwards delete empty rows and then put the columns into the head. There are sevaral empty rows until I reach the column line:
Date | Timestamps | ||||||||||
01/01/2022 | |||||||||||
Column A | Column B | Column C | Column D | Column E | Column F | ||||||
xxx | xxx | xxx | xxx | xxx | xxx | ||||||
The problem I encounter now is that I have multiple csv files, where the column header sometimes has 4 rows that i need to delete to reach the top and sometimes more and less. Is there some command like delete only empty rows or until row =Column A?
Thank you very much in advance.
Best.
Solved! Go to Solution.
Hi @Applicable88,
So, for your second issue, here's a function I use regularly to combine multiple CSV files where the number of unwanted rows varies.
// fProcessFiles
let
fProcessFiles = (myFile as binary) =>
let
CSV = Csv.Document(myFile, [Encoding=1252]),
ConditionalBlankIndex = Table.AddColumn(CSV, "Custom", each if [Column2] = "" then 0 else 1),
HeaderPosition = List.PositionOf(ConditionalBlankIndex[Custom], 1),
RemoveRows = Table.Skip(CSV, HeaderPosition),
PromotedHeaders = Table.PromoteHeaders(RemoveRows, [PromoteAllScalars = true])
in
PromotedHeaders
in
fProcessFiles
Change the column reference '[Column2]' to a column that will be all blank until you hit a header.
Here's the other function I used in the main code.
// fNoHeader
let
fProcessFiles = (myFile as binary) =>
let
CSV = Csv.Document(myFile, [Encoding=1252])
in
CSV
in
fProcessFiles
Main code:
let
Source = Folder.Files("H:\My Drive\Power BI\Community Solutions\Files"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Name] = "headers_filereference.csv")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"Content"}),
#"Invoked Custom Function" = Table.AddColumn(
#"Removed Other Columns",
"Query1",
each fProcessFiles([Content])
),
#"Invoked Custom Function1" = Table.AddColumn(
#"Invoked Custom Function",
"NoHeader",
each fNoHeader([Content])
),
#"Added Custom" = Table.AddColumn(
#"Invoked Custom Function1",
"Custom",
each [NoHeader]{1}[Column3]
),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom", {"Query1", "Custom"}),
#"Expanded Query1" = Table.ExpandTableColumn(
#"Removed Other Columns1",
"Query1",
{
"Column A",
"",
"Column B",
"_1",
"Column C",
"_2",
"Column D",
"_3",
"Column E",
"_4",
"Column F",
"_5"
},
{
"Column A",
"Column1",
"Column B",
"_1",
"Column C",
"_2",
"Column D",
"_3",
"Column E",
"_4",
"Column F",
"_5"
}
)
in
#"Expanded Query1"
Basically...
Let me know if this requires further explanation.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Hi @Applicable88,
So, for your second issue, here's a function I use regularly to combine multiple CSV files where the number of unwanted rows varies.
// fProcessFiles
let
fProcessFiles = (myFile as binary) =>
let
CSV = Csv.Document(myFile, [Encoding=1252]),
ConditionalBlankIndex = Table.AddColumn(CSV, "Custom", each if [Column2] = "" then 0 else 1),
HeaderPosition = List.PositionOf(ConditionalBlankIndex[Custom], 1),
RemoveRows = Table.Skip(CSV, HeaderPosition),
PromotedHeaders = Table.PromoteHeaders(RemoveRows, [PromoteAllScalars = true])
in
PromotedHeaders
in
fProcessFiles
Change the column reference '[Column2]' to a column that will be all blank until you hit a header.
Here's the other function I used in the main code.
// fNoHeader
let
fProcessFiles = (myFile as binary) =>
let
CSV = Csv.Document(myFile, [Encoding=1252])
in
CSV
in
fProcessFiles
Main code:
let
Source = Folder.Files("H:\My Drive\Power BI\Community Solutions\Files"),
#"Filtered Rows" = Table.SelectRows(Source, each ([Name] = "headers_filereference.csv")),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"Content"}),
#"Invoked Custom Function" = Table.AddColumn(
#"Removed Other Columns",
"Query1",
each fProcessFiles([Content])
),
#"Invoked Custom Function1" = Table.AddColumn(
#"Invoked Custom Function",
"NoHeader",
each fNoHeader([Content])
),
#"Added Custom" = Table.AddColumn(
#"Invoked Custom Function1",
"Custom",
each [NoHeader]{1}[Column3]
),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom", {"Query1", "Custom"}),
#"Expanded Query1" = Table.ExpandTableColumn(
#"Removed Other Columns1",
"Query1",
{
"Column A",
"",
"Column B",
"_1",
"Column C",
"_2",
"Column D",
"_3",
"Column E",
"_4",
"Column F",
"_5"
},
{
"Column A",
"Column1",
"Column B",
"_1",
"Column C",
"_2",
"Column D",
"_3",
"Column E",
"_4",
"Column F",
"_5"
}
)
in
#"Expanded Query1"
Basically...
Let me know if this requires further explanation.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
The simplest approach would be to filter blanks from one of the columns but this requires that column to not have blanks in any rows you want to preserve.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.