Excel’s Remove Duplicates feature can fail to identify duplicate rows that appear identical. This happens because invisible control characters hide within your cell data. These non-printing characters cause Excel to see two cells as different. This article explains how these characters get into your data and provides steps to clean them out so Remove Duplicates works correctly.
Key Takeaways: Fixing Remove Duplicates by Cleaning Data
- CLEAN function: Removes the first 32 non-printing characters in the ASCII set, including carriage returns and line feeds.
- Find and Replace with CHAR codes: Targets and deletes specific invisible characters like non-breaking spaces (CHAR 160).
- TRIM and Power Query: TRIM removes extra spaces, while Power Query provides advanced cleaning and deduplication tools.
Why Invisible Characters Break the Remove Duplicates Command
Excel’s Remove Duplicates command performs an exact, character-by-character comparison. When two cells look the same to you, they may contain different underlying codes. Common culprits are non-breaking spaces, tab characters, carriage returns, and other control characters from the ASCII range 0 to 31. These often get imported from web pages, databases, or other software.
A non-breaking space, CHAR(160), is a frequent offender. It looks identical to a regular space but has a different code. Excel treats “Data”&CHAR(160) and “Data” as two unique values. Similarly, a trailing tab character or a hidden line feed within a cell will cause a mismatch. The TRIM function alone cannot fix these, as it only handles regular space characters, CHAR(32).
How to Identify Cells with Hidden Characters
You can spot some characters by editing the cell. Click into the formula bar and use the arrow keys. The cursor may jump oddly or stop where you see no character. For a more definitive check, use the CODE or UNICODE functions on individual characters within a cell. The LEN function is also useful; if LEN returns a higher count than the visible text, invisible characters are present.
Steps to Clean Data and Fix Remove Duplicates
Follow these methods to strip out invisible control characters before using the Remove Duplicates tool.
Method 1: Use the CLEAN and TRIM Functions
- Insert a new helper column
Next to your data, add a column for cleaned data. For example, if your text is in column A, use column B. - Apply the CLEAN function
In the first cell of the new column (B2), enter the formula=CLEAN(A2). This removes non-printing ASCII characters 0 through 31. - Apply the TRIM function
Nest the CLEAN function inside TRIM to also remove excess spaces. The full formula is=TRIM(CLEAN(A2)). - Copy the formula down
Drag the fill handle down to apply the formula to all rows in your dataset. - Replace original data with cleaned values
Select all the cleaned cells in the helper column. Copy them with Ctrl+C. Select the original data column, right-click, choose Paste Special, and select Values. Click OK. - Delete the helper column
Right-click the helper column header and select Delete. Now use Data > Remove Duplicates on your cleaned original data.
Method 2: Use Find and Replace for Specific Characters
This method is best for stubborn characters like non-breaking spaces that the CLEAN function misses.
- Select your data range
Highlight the cells or columns where duplicates are not being removed. - Open the Find and Replace dialog
Press Ctrl+H to open the Find and Replace window. - Enter the character code in the Find what box
For a non-breaking space, hold Alt and type 0160 on the numeric keypad. Alternatively, type the formula=CHAR(160)in a blank cell, copy the resulting invisible character, and paste it into the Find what field. - Leave the Replace with box empty
Ensure the Replace with field is completely empty to delete the character. - Replace All
Click Replace All. Excel will remove all instances of that specific invisible character from your selection.
If Remove Duplicates Still Does Not Work
Data Has Leading/Trailing Spaces or Different Cases
The TRIM function removes extra spaces but is case-insensitive. For text in different cases like “APPLE” and “apple”, Excel sees them as duplicates by default. If you need case-sensitive deduplication, you must use a formula-based approach or Power Query, as the built-in tool ignores case.
Cells Contain Formulas or External Links
If your cells contain formulas, Remove Duplicates compares the formula text, not the resulting value. Convert formulas to values first. Select the range, copy it with Ctrl+C, then use Paste Special > Values. Also, check for cells that look like text but are actually numbers stored as text, which can be cleaned with the VALUE function.
Advanced Cleaning with Power Query
For complex, recurring data cleaning, use Power Query. Select your data and go to Data > From Table/Range. In the Power Query Editor, use the Transform tab. Choose Format > Trim and Format > Clean. You can then use the Remove Duplicates button within the editor and load the cleaned table back to Excel.
Manual Cleaning vs. Function vs. Power Query: Key Differences
| Item | Find and Replace (Manual) | CLEAN/TRIM Functions | Power Query |
|---|---|---|---|
| Best for | Targeting one known character like CHAR(160) | Quick, one-time cleanup of common control chars | Repetitive cleaning of large or updated datasets |
| Automation | Manual process, not repeatable | Requires formula copy-paste steps | Fully automated refresh with one click |
| Learning curve | Low, uses familiar dialog | Low, basic Excel formulas | Moderate, new interface to learn |
| Handles case sensitivity | No | No | Yes, with additional steps |
You can now reliably use the Remove Duplicates feature by first cleaning your data of invisible characters. Start with the CLEAN and TRIM functions in a helper column for a standard fix. For persistent non-breaking spaces, use the Find and Replace method with CHAR(160). Remember to use the LEN function to verify your text length before and after cleaning. For ongoing data projects, invest time in learning Power Query to automate the entire cleaning and deduplication process.