cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
zoezoe
New Member

data cleansing

I wanted to ask what the best way for data cleansing would be in relation to some sale figures and dates.I got an excel files that requires some formatting but I am unsure about the way to proceed to clean the data through the software itself and make it efficient for my data analysis.

2 ACCEPTED SOLUTIONS

Accepted Solutions
Ross73312 Community Champion
Community Champion

Re: data cleansing

The first thing to understand is that how data is stored and how data is displayed on screen is not the same.  Data cleansing is all about how the data is stored.  When its nicely stored its much easier to have it display the way you want.

 

The best kind of data is any data that is stored numerically (Whole Number, Decimal Number, Date, DateTime etc)

The worst kind of data is text based data, although it is a necessary evil.

 

When you bring data in from Excel, you want to cleanse and convert the data as best as possible into their proper datatypes.  Doing this will ensure you can do everything you need once you have it inside the data Model itself.

 

This will require you to try to anticipate things that could turn up within the data from users.  If your Excel file is being generated from a system, you can be reasonbly confident that everything will stay the same.  If your Excel file is being populated by hand, you can expect that data cleansing will be an on-going maintenance item that you will need to address  from time to time whenever someone makes a keying error.

 

Common cleansing items are:

Converting Text based data into numerical types - This can take 1 or more steps depending on how good the data is.

Aligning Text values to look nice - This could be removing common spelling errors using 'Replace Values' to chopping the data up to remove uncessary prefixs and suffixes.

Removing bad characters - This can be done from a Trim and/or Clean operation, or might require a Replace Values to remove.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


View solution in original post

Community Support
Community Support

Re: data cleansing

Hi @zoezoe

Thanks for Ross73312's so detailed introduction, i would add something which you could learn how to clear with Excel or Power BI.

Reference:

Data Cleaning in Excel or in Power BI

 

Best Regards

Maggie

View solution in original post

2 REPLIES 2
Ross73312 Community Champion
Community Champion

Re: data cleansing

The first thing to understand is that how data is stored and how data is displayed on screen is not the same.  Data cleansing is all about how the data is stored.  When its nicely stored its much easier to have it display the way you want.

 

The best kind of data is any data that is stored numerically (Whole Number, Decimal Number, Date, DateTime etc)

The worst kind of data is text based data, although it is a necessary evil.

 

When you bring data in from Excel, you want to cleanse and convert the data as best as possible into their proper datatypes.  Doing this will ensure you can do everything you need once you have it inside the data Model itself.

 

This will require you to try to anticipate things that could turn up within the data from users.  If your Excel file is being generated from a system, you can be reasonbly confident that everything will stay the same.  If your Excel file is being populated by hand, you can expect that data cleansing will be an on-going maintenance item that you will need to address  from time to time whenever someone makes a keying error.

 

Common cleansing items are:

Converting Text based data into numerical types - This can take 1 or more steps depending on how good the data is.

Aligning Text values to look nice - This could be removing common spelling errors using 'Replace Values' to chopping the data up to remove uncessary prefixs and suffixes.

Removing bad characters - This can be done from a Trim and/or Clean operation, or might require a Replace Values to remove.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


View solution in original post

Community Support
Community Support

Re: data cleansing

Hi @zoezoe

Thanks for Ross73312's so detailed introduction, i would add something which you could learn how to clear with Excel or Power BI.

Reference:

Data Cleaning in Excel or in Power BI

 

Best Regards

Maggie

View solution in original post

Helpful resources

Announcements
New Ranks Launched March 24th!

New Ranks Launched March 24th!

The time has come: We are finally able to share more details on the brand-new ranks coming to the Power BI Community!

‘Better Together’ Contest Finalists Announced!

‘Better Together’ Contest Finalists Announced!

Congrats to the finalists of our ‘Better Together’-themed T-shirt design contest! Click for the top entries.

Arun 'Triple A' Event Video, Q&A, and Slides

Arun 'Triple A' Event Video, Q&A, and Slides

Missed the Arun 'Triple A' event or want to revisit it? We've got you covered! Check out the video, Q&A, and slides now.

Top Solution Authors
Top Kudoed Authors