SQL Server Delete duplicates record from a table | CloudFronts

SQL Server Delete duplicates record from a table

Posted On February 20, 2020 by Sandip Patel Posted in

In this blog we will learn how to delete duplicates records from a table in SQL Server.

Let’s create a sample table for the demo.

Step 1: Create temporary table called tmpTable as follows

DECLARE @tmpTable TABLE(EmpID INT IDENTITY(1, 1), EName VARCHAR(50))

Step 2: Insert some record into the tmpTable table.

INSERT INTO @tmpTable(EName) VALUES (‘Ricky’)

INSERT INTO @tmpTable(EName) VALUES (‘Ricky’)

INSERT INTO @tmpTable(EName) VALUES (‘Ricky’)

INSERT INTO @tmpTable(EName) VALUES (‘Lloyd’)

INSERT INTO @tmpTable(EName) VALUES (‘Lloyd’)

INSERT INTO @tmpTable(EName) VALUES (‘Lloyd’)

INSERT INTO @tmpTable(EName) VALUES (‘Jack’)

Step 3: write SQL query from the tmpTable table

Select * from @tmpTable

The below screenshot shows the output of the query

There are duplicate rows (1,2,3) and (4,5,6) for the Employee that have the same Employee name.

To delete the duplicate rows from the SQL Server, you follow these steps

  1. Find duplicate rows using GROUP BY clause or ROW_NUMBER() or RANK() function.
  2. Use DELETE statement to remove the duplicate rows.

Method 1:

Find the maximum employee id from the table using GROUP BY function and delete those record that are not present in main table.

DELETE

FROM @tmpTable

WHERE EmpID NOT IN

(

SELECT MAX(tmp.EmpID) as [MaxEID]

FROM @tmpTable as tmp

GROUP BY tmp.EName

)

Method 2:

Using the common table expression (CTE) to delete duplicate rows:

;WITH tblTest

AS

(

SELECT EmpID

, ROW_NUMBER() OVER (PARTITION BY EName ORDER BY EmpID) AS [RowNumber]

FROM @tmpTable

)

DELETE FROM tblTest

WHERE RowNumber > 1

In this statement

  • First, the CTE uses the ROW_NUMBER() function to find the duplicate rows specified by values in the EName columns.
  • Then, the DELETE statement deletes all the duplicate rows but keeps only one occurrence of each duplicate group.

Method 3:

We can also delete duplicate row using RANK() as shown below :

DELETE FROM @tmpTable

WHERE EmpID IN

(

SELECT a.EmpID

FROM

(

SELECT tmp.*

, RANK() OVER (PARTITION BY EName ORDER BY EmpID) as [Rank]

FROM @tmpTable as tmp

) as a

WHERE a.Rank > 1

)

I hope this will help you.

 


Share Story :

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.

Close