5 effortless methods to figure out duplicates values in excel featured image

Intro by Chandeep

Today we have a guest in the house. Please welcome, Puneet Gogia from ExcelChamps and he is going to teach us, 5 methods to remove duplicates in Excel.

 

Enters Puneet –

While working with large data sets, duplicate values are always a problem

No matter what, you will always get some duplicate values in your data. But the good news is, in Excel, you can find them, highlight them, and delete them in no time. Today, I will show you 5 effortless methods to figure out duplicates value in Excel.

So let’s get started.

 

Instant Access: Download this Quick PDF guide to learn about these 5 methods.

 

Table of Contents

  1. Remove Duplicate Option
  2. Using COUNTIF Function
  3. Highlight With Conditional Formatting
  4. With a Pivot Table
  5. Using Power Query

 

Here I have a billing data with some duplicate invoices in it. You can download it from here to follow along.

invoice data to figure out duplicate values

 

1. Remove Duplicates Option

It’s a default option in excel to remove duplicate values. With this, you can remove duplicate values in a flash. Here are the steps.

  • Select the column from where you want to remove duplicate values (Here I’m selecting invoice column).
  • After that, go to Data tab -> Data tools -> Remove Duplicates.

go to data tab figure out duplicate values

  • Once you click on “Remove Duplicate”, you will get a pop-up option.
  • From this option, you can select:
    1. Expand the Selection:
    2. Continue with the current selection:
  • Select “Expand this Selection”, click “Remove Duplicates”.
  • Congratulations, now there are no duplicate values in your data.

pop up option to figure out duplicate values

 

Different Between “Expand the Selection” & “Continue with Current Selection”

  1. Expand the Selection: This will consider entire row while checking for duplicate values. If there is a duplicate value in a cell but rest of the cells have unique values then that will not consider as a duplicate entry.
  2. Continue with Current Selection: It will just remove duplicates from selected column without considering values from other columns. But in real, this is not a good idea to use this option.

 

Pros:

  1. This method is super quick. You just need a few clicks.
  2. It will automatically delete duplicate values.

 

Cons:

  1. You can’t review duplicate values from your data
  2. Data should be sort before you remove duplicates

 

2. Using COUNTIF Function

An advance method to remove duplicate values in excel. And, my favorite one. Just follow these simple steps.

enter formula drag upto last cell figure out duplicate values

  • Enter the above formula in your data table.
  • Now, drag this formula up to the last row of the data.
  • Filter the column and select values which are more than 1.

filter values more than twol figure out duplicate values

  • All the values which you have selected are in duplicate.
  • You can review them and then delete them.

 

Pros:

  1. It’s a one time set up, you don’t have to apply it again and again.
  2. It’s a dynamic method, it will get update with data.

 

Cons:

  1. Applying formulas may be confusing sometimes.
  2. You can check duplicate values from a single column at a time. You have to extend the formula to check other columns.

 

3. Highlight With Conditional Formatting

You can highlight duplicate values with a color. Yes, conditional formatting can help you to highlight duplicate values and after that, you can review them, and delete them.

Steps are below.

  • Select the entire your data from which you want to highlight duplicate values.
  • Go to Home Tab -> Styles -> Conditional Formatting -> Highlight Cells Rules -> Duplicate Values.

use conditional formatting to figure out duplicate values

  • Here, you’ll get a pop-up option to select a color to highlight values.
  • Select the color, you want to use. Click OK.

select format tol figure out duplicate values with conditional formatting

 

This will highlight all the duplicate values from the data you have selected.

figure out duplicate values with conditional formatting

Now, you can review them and delete them. You can also highlight unique values instead of duplicate values.

 

Pros:

  1. It’s quick and easy.
  2. You can review duplicate values before deleting them.

 

Cons:

  1. This is not dynamic, you have to update conditional formatting when you update your data.
  2. You have to delete duplicate values manually.

 

4. Using a Pivot Table

Maybe it looks silly to you but you can also use a pivot table to figure out duplicate values from your data. Let’s me tell you how you can do this. Here I am using the same invoice data which I have used in above methods.

Follow these steps.

  • First of all, you have to insert a pivot table.
  • And after that, using pivot table fields dialog box, add invoices in “Rows” field.
  • Again, select invoices and add it into “Values” field.
  • Now, change value setting to “count”.

change value setting in pivot table to figure out duplicate values

  • In the end, you have a pivot table like below.

invoices more than to figure out duplicate values

In this pivot, all the invoices where the count is more than 1 are in duplicate in your data.

invoices more than one to figure out duplicate values

Now, you can review duplicate values from your raw data and delete them accordingly.

Pros:

  1. One time set up.
  2. A pivot table is dynamic if you are using auto update.

 

Cons:

  1. You can’t directly delete duplicate values.
  2. You have to delete values manually.

 

5. Using Power Query

Power Query is a revolution for data cleaning. You can easily use it to remove duplicate values from your data. Make sure to apply table to your raw data. And, follow these steps to remove duplicates with power query.

  • Select any of the cells from your data.
  • Go to -> Data Tab -> Get & Transform -> From table.

click on from table to figure out duplicate values with power query

  • Once you click on the button, it will load your data into power query editor.

load data to figure out duplicate values with power query

  • Now, right-click on the column header from which you want to check duplicate values.
  • Select “Remove Duplicate” (It will instantly remove all the duplicate values from your data.)

select remove duplicate to figure out duplicate values with power query

  • And after that, click on “Close & Load” (It will paste the entire data into a new worksheet without duplicate values).

click close load to figure out duplicate values with power query

Pros:

  1. It’s fully dynamic. Whenever you update your data, you just have to refresh your query and it will automatically remove all the duplicates.
  2. It will not make any changes in your main data, in fact, you’ll get new transformed data which will be connected with your original data.

 

Cons:

  1. It’s a long process.
  2. Not possible if you don’t have power query in your excel version.

 

Conclusion

Well, as long as you work with data, you can’t ignore duplicate values. It will be there for always. But the important thing is you can highlight them, review and delete them by using a right method.

And, I hope all the methods which I have mentioned above will help you in your work.

Now, you have to help me, go to the comment section and tell me if you have any other methods to handle duplicate values in excel.


Written by Puneet Gogia
Founder at ExcelChamps. You can find him online, tweeting about excel, on a running track, or sometimes hiking up a mountain. Check out his Excel Productivity Guide which is fully loaded with some most amazing excel tips.

 

Closing : By Chandeep

Thanks Puneet for taking out time for sharing these cool tricks with us. Sidenote: I learnt how to remove duplicates using Power Query.

Do let Puneet know if you have any questions in the comments section, he’ll be around to reply to your comments. Also download his productivity guide. It is kick-ass!

 



Topics that I write about...






Download Smart Ebooks on
Excel and Power BI