Data is the mainstay of any sort of analysis that needs performance in Excel. And when it comes to data, there are loads of challenges that one can face – be it Spelling Errors, tenacious irregular spaces, unsolicited precedes, in appropriate cases, unrequired characters & irregular cell merge, all these in an excel spreadsheet give us all a lot of stress at times. On top of this, there are more such issues which one can face while performing analysis on any data.
In this post, I will be sharing few very simple ways, which one may use to sanitize data in excel-
#1. Getting Rid of Extra Spaces
It is highly difficult for anyone to identify extra spaces in excel spreadsheet. While some of you might be able to identify extra spaces in between text or number but it is most difficult to identify extra spaces after text or number. Thankfully, there is a simple & clean way to get rid of this issue in excel. Just Apply TRIM Function on the data where you see extra spaces and it will remove all such spaces.
Function - =TRIM(Text)
#2. Converting Number stored as Text back to Number
This is another very common issue that one faces, especially when data is imported from a database or any application, where number gets converted into text format due to which user can not perform any numeric calculation or analysis. The simplest way to convert the number from text format to number format is-
a. Select the data which is to be converted to Number Format
b. Press Ctrl +1
c. A “Format Cell” Dialogue box will pop up.
d. Go to “Number” option in Number Tab
e. Press “OK”
#3. Removing unnecessary blanks
Blank Cells can really cause a lot of confusion if they are not handled at the very beginning. I am sure even you have faced problem caused by blank cells in data. Here is a smart way to identify all blank cells at one go and replace them with any character or value.
a. Select the data to be analyzed.
b. Press Ctrl + G
c. Click on Special Button at left bottom.
d. Select “Blank” option in Go To Special & Press “OK”.
This will highlight all blank cells in data. Then it is up to you to replace them with any value like “Not Applicable” or “Nil”. Then, press Ctrl + Enter.
#4. Merging Columns
Another very common task that I perform after extracting data from an external database is to merge content in two or more columns into one. Imagine a scenario where you have downloaded an employee master, wherein employee first name & the last name is coming in two different column & now you want to identify cases where two employees have the same name. For this, you will be required to merge both these columns to generate a new column containing employee first & Last Name. Following are most used ways to merge cells or columns-
a. Use CONCATENATE Function to perform this task.
Function - =CONCATENATE(First Name, “ “, Last Name)
b. Use & for merging two cells
Function - =First Name&” “&Last Name
c. Use CONCAT Function to perform this task.
#5. Splitting text into Columns
Just like merging text of two or more columns into one, at times I am also required to split text of one column into two or more columns. Imagine a situation where you have downloaded an employee master, wherein employee last & the first name is coming in one column separated by a blank or comma & you want it to be in order where employee first name is followed by his/her last name. For this, you will be required to split text of one column to two columns using text to column function & then merge the two columns in desired order by using above-stated functions as stated in point 4.
Use Text to Column Function to split text. This is one of my favorite function in excel and is very handy at times. So, without wasting time, let us understand its use-
i. Select the cell or column whose content you want to split
ii. Go to Data >> Text to Columns
iii. A Text to Columns Wizard will open. Click Delimited option & press Next
iv. Check Comma, and clear the rest of the boxes, or check Comma and Space if that is how your text is split (Woods, Alan, with a comma and space between the names). You can see a preview of your data in the Data preview window. Click Next
v. In this step, select the format for your new columns. Press Finish.
How often we come across data where text cases are improper & irregular. Using excel case function, one can quickly convert text to any of these case formats.
LOWER Function -Converts all uppercase letters in a text string to lowercase letters
Function – =LOWER(Text)
UPPER Function -Converts all lowercase letters in a text string to uppercase letters
Function – =UPPER(Text)
PROPER Function - Capitalizes the first letter in a text string and any other letters in the text that follows any character other than a letter. Converts all other letters to lowercase letters.
Function – =PROPER(Text)
There are many other techniques to clean data in Excel, Which I will be sharing in my future posts.
Keeping Watching this space for more interesting tips & techniques…
Happy Learning Folks!!!
0 comments: