BryanCarmichael
Advocate I

Lakehouse Add or Remove columns from table

Not sure if this is the right forum or not - but here is the issue.

 

We are loading data into a Lakehouse using Gen2 dataflows (for now they just point at existing Gen1 dataflows and then do the Lakehouse insert; we will rectify this later on).

 

Over time it is typical for columns to be added, removed, and/or updated in a dataflow. With a datamart these changes are reflected automatically in the schema. With a Lakehouse, however, when I add a new column to the dataflow I can see no way to bring it into the Lakehouse table.

 

What do I need to do here? The only options I can see are:
1: Import it as a new table, but that seems very clunky, as you would need to update the queries / stored procedures on your SQL endpoint to cater for this.
2: Delete the existing table in the Lakehouse and then add a new one with the same name.

 

Am I missing something?

10 REPLIES 10
frithjof_v
Skilled Sharer

It now seems to be possible to add columns to a Lakehouse table by using a notebook.

 

I am able to use the following type of command in a Notebook:

 

%%sql
ALTER TABLE tableName
ADD COLUMN columnName dataType

 

The table then also gets updated in the SQL Analytics Endpoint and the Direct Lake Semantic Model, which was a problem before.
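If the dataflow gains columns regularly, the ALTER TABLE command above could be generated from a comparison of the current and desired columns. The following is only a minimal sketch: the helper name `build_add_columns_sql` and the table and column names are made up for illustration, and it only builds the SQL text. Running it via `spark.sql` in a Fabric notebook is an assumption shown in the comments, not something tested here.

```python
from typing import Dict, List


def build_add_columns_sql(table: str, existing: List[str], desired: Dict[str, str]) -> List[str]:
    """Return ALTER TABLE statements for columns in `desired` that `existing` lacks.

    `desired` maps column name -> Spark SQL data type (for example "STRING").
    Names are compared case-insensitively, which matches Spark's default.
    """
    present = {name.lower() for name in existing}
    missing = [(name, dtype) for name, dtype in desired.items() if name.lower() not in present]
    if not missing:
        return []  # nothing to add
    cols = ", ".join(f"`{name}` {dtype}" for name, dtype in missing)
    return [f"ALTER TABLE {table} ADD COLUMNS ({cols})"]


# Hypothetical usage in a notebook where `spark` is predefined (assumption):
#   existing = spark.table("sales").columns
#   for stmt in build_add_columns_sql("sales", existing, {"region": "STRING"}):
#       spark.sql(stmt)
```

Keeping the statement builder separate from the Spark call makes it easy to print and review the SQL before anything touches the table.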

 

Ref. this thread:

https://community.fabric.microsoft.com/t5/General-Discussion/SQL-ALTER-command/m-p/3748079#M4861

 

However, I get an error if I try to rename or remove (drop) a column.

Maybe this is a solution for renaming columns, dropping columns and changing column type in Lakehouse tables:

https://community.fabric.microsoft.com/t5/General-Discussion/Dropping-and-recreating-lakehouse-table...
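
As a rough illustration of the drop-and-recreate idea for renaming or dropping columns, the sketch below builds a CREATE TABLE ... AS SELECT statement that copies a table into a new one with the unwanted columns left out and the renamed ones aliased. This is my own sketch, not the method from the linked thread: the helper name `build_recreate_sql` and all table and column names are hypothetical, and whether the resulting statement behaves as wanted in a given Lakehouse is an assumption you would need to verify.

```python
from typing import Dict, Iterable, List, Optional


def build_recreate_sql(
    source: str,
    target: str,
    columns: List[str],
    rename: Optional[Dict[str, str]] = None,
    drop: Iterable[str] = (),
) -> str:
    """Build a CTAS statement that copies `source` into `target`,
    skipping columns in `drop` and aliasing columns listed in `rename`."""
    drop_set = {c.lower() for c in drop}
    rename_map = {old.lower(): new for old, new in (rename or {}).items()}
    select_items = []
    for col in columns:
        key = col.lower()
        if key in drop_set:
            continue  # column is being dropped
        new_name = rename_map.get(key)
        select_items.append(f"`{col}` AS `{new_name}`" if new_name else f"`{col}`")
    if not select_items:
        raise ValueError("at least one column must remain")
    return f"CREATE TABLE {target} AS SELECT {', '.join(select_items)} FROM {source}"


# Hypothetical usage in a notebook where `spark` is predefined (assumption):
#   cols = spark.table("sales").columns
#   spark.sql(build_recreate_sql("sales", "sales_v2", cols,
#                                rename={"amt": "amount"}, drop=["legacy"]))
```

Writing to a new table name first leaves the original untouched until the copy has been checked; the old table can then be removed and the new one put in its place.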

 

funtomas
Frequent Visitor

There is a 3rd option, which worked for me:

 

3. Rename the original table. For example, rename "Table" to "Table1".

 

Then go to your dataflow and set up its destination again. (Of course, choose to create a new table called "Table".)

 


 

 

BryanCarmichael
Advocate I

Where we ended up with this is moving to a warehouse for almost everything and creating our own tables in it.

This allows you to alter them and create primary keys (not enforced).
Lakehouse is good for super unstructured data but if you have structured data then a warehouse is a much better option.

asittrivedi
Regular Visitor

I have used a Python notebook to add a column to an existing table, and that works just fine. One can use a Spark dataframe or a pyspark.pandas dataframe to get the desired outcome.

Would really appreciate any additional insight or links to resources that could be provided on this subject.

Could you share how to do this in Python?

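import pyspark.pandas as ps

# Read the existing Delta table into a pandas-on-Spark dataframe.
psdf = ps.read_delta('path_to_table')
psdf.head(10)

# Add the new column, filled with empty strings.
psdf['new_col'] = ''
psdf.head(10)

# Convert to a Spark dataframe so it can be saved as a table.
sdf = psdf.to_spark()

# Overwrite the existing table; overwriteSchema lets a Delta overwrite
# accept the changed schema (it may be needed if the write is rejected).
sdf.write.mode('overwrite').option('overwriteSchema', 'true').saveAsTable('existing_table')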
 
One can use a Spark dataframe in lieu of pandas.
 
However, the easiest option now is to use a SQL notebook and add a column. Please refer
marcuspaivio
New Member

I was wondering the same. 🙂
