Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
pelowski
Helper III
Helper III

Combining multiple columns together using parameters derived from an aggregated table

I have a dataset that based upon user configuration and specification, what is essentially the same column can be listed in several different ways.  Here is an example...

  1. Company
  2. company
  3. company_name
  4. company_or_organization

I have developed a series of steps that gets me very close to what I want.  There are some limited assumptions being made in the code, but ultimately I get to a "NamesToCombine" list and a "Name to Use" column.  What I'm struggling to do next is given that this table might be 10 or more rows long, how can I run a Table.CombineColumns function using parameters in each row in my resulting table to combine all the columns I want to combine?  Expression.Evaluate combined with a List.Generate or something? 

	// Look at the columns, sort the list regardless of capitalization
	ColumnNames = List.Sort(Table.ColumnNames(PreviousStep), each Text.Lower(_)),

	// All of the following is an attempt to get to a consistent column name
		#"Convert to Table" = Table.FromList(ColumnNames, Splitter.SplitByNothing(), {"Column Name"}, null, ExtraValues.Error),
		#"Add Generic Name" = Table.AddColumn(#"Convert to Table", "Generic Column Name", each Text.Replace(Text.Replace(Text.Lower([Column Name]), "_", ""), " ", "")),
		#"Add Index" = Table.AddIndexColumn(#"Add Generic Name", "Index", 0, 1, Int64.Type),
		SynonymReplacements = {{"companyname", "company"}, {"companyororganization", "company"}, {"educationalinstitution", "company"}, {"jobtitle", "title"}, {"phonenumber", "phone"}, {"titleorrole", "title"}},
		#"Replacements for Existing Columns" = List.ReplaceMatchingItems(#"Add Generic Name"[Generic Column Name], SynonymReplacements),
		#"Add New Synonym Replacements Column" = Table.AddColumn(#"Add Index", "Synonym Replacements", each #"Replacements for Existing Columns"{[Index]}),
		#"Add List of Characters" = Table.AddColumn(#"Add New Synonym Replacements Column", "List of Characters", each List.Sort(List.Distinct(Text.ToList([Column Name]))), type list),
		LowerCaseCharacters = {"a".."z"},
		UpperCaseCharacters = {"A".."Z"},
		Uppercase = Table.AddColumn(#"Add List of Characters", "Uppercase", each if List.ContainsAny([List of Characters], UpperCaseCharacters) then 2 else 0, Int64.Type),
		Lowercase = Table.AddColumn(Uppercase, "Lowercase", each if List.ContainsAny([List of Characters], LowerCaseCharacters) then 1 else 0, Int64.Type),
		SpacesOrUnderscore = Table.AddColumn(Lowercase, "SpacesOrUnderscore", each if List.ContainsAny([List of Characters], {"_", " "}) then 1 else 0, Int64.Type),
		#"Add Name Score" = Table.AddColumn(SpacesOrUnderscore, "Name Score", each [Uppercase] + [Lowercase] + [SpacesOrUnderscore], Int64.Type),
		#"Names To Replace Table" = Table.SelectRows(Table.Group(#"Add Name Score", {"Synonym Replacements"}, {{"ColumnMatchCount", each Table.RowCount(_), Int64.Type}, {"Name To Use", each Table.Sort(_, {{"Name Score", Order.Descending}})[Column Name]{0}}}), each [ColumnMatchCount] > 1),
		#"Merged Queries" = Table.NestedJoin(#"Names To Replace Table", {"Synonym Replacements"}, #"Add Name Score", {"Synonym Replacements"}, "NamesToReplace", JoinKind.LeftOuter),
		#"Add NamesToCombine" = Table.AddColumn(#"Merged Queries", "NamesToCombine", each [NamesToReplace][Column Name], type list),

 

Thanks for any help you can provide.

4 REPLIES 4
lbendlin
Super User
Super User

I would brute force it. Do a simple Table.Combine that ignores the different spellings of the column, and then after the combine add a custom column that concatenates the content of all the spelling variations into a new value.

I'm sorry @lbendlin, but I guess I don't understand what you're saying.  I have one table with multiple sets of columns I want combined.  I wouldn't be adding one custom column, I'd be adding a new column for each set of combinations I want to combine and it's that loop that I don't know how to do.  Here is a screenshot of the summary results of my name-combining steps.  Then I want to merge this back into what amounts in this case to 12 different columns.  (The original table has something like 50 or 60 columns.)

After running a bunch of steps to identify the columns I'd like to combine.After running a bunch of steps to identify the columns I'd like to combine.

 

Ultimately this is a data source where the system allows people to configure their own column names for each new form and while I wish there was some sort of enforced consistency, there is not.  The good news is that when there is some sort of "Company" or "Country" field, it is 99% likely that there will only be one value.  (I'll do some checks after I can get the dynamic concatenations done to verify this is true.)

 

Once again, I'm sorry if I missed something you're trying to get at here but I don't see how Table.Combine helps.  I'm still in the situation where I need to dynamically run 12 different custom column creations and then remove the 31 columns shown here that have been combined together.

Do your incoming tables have the same structure?  Meaning are the columns in the same order even if they are spelled differently?

It is only one table with a fluctuating/ever-expanding number of columns.  Each time someone adds a new form in the GUI, new columns can be added to the output structure.  Depending on the labels of each column (by default the labels get used as the field names) new columns may get added to the output the next time I do an export.  This is why I run the steps to try to determine which columns to combine into one column output because no matter if the form creator types "company", "Company", "Company Name", "Organization", etc into the form creation, it all means the same thing and for any given row (form submission) there will only be one value total across all those columns for that row.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.