Sql Server delete all duplicate rows

Hello Friends

In this article, we are going to discuss, that how we can find all duplicate rows in a table and how we can delete all duplicate rows and keeping one unique row.

We will create an employee table with some data with duplicate rows as shown below.

Here is the code to create the dummy table with sample data:

IF OBJECT_ID('dbo.tblEmployee') IS NOT NULL
    DROP TABLE dbo.tblEmployee
GO
 
IF OBJECT_ID('dbo.tblEmployee') IS NOT NULL
    DROP TABLE dbo.tblEmployee
GO
 
CREATE TABLE dbo.tblEmployee
(
    EmpId INT,
    EmpName VARCHAR(200),
    Age TINYINT,
    EmpCode Int,
	DeptId int,
	DesignationId int
)
GO

INSERT INTO dbo.tblEmployee VALUES(12,'Jack',34,1222,24,4),(32,'Rashid',44,1242,24,4),(55,'Denial',24,1292,21,9)
INSERT INTO dbo.tblEmployee VALUES(9,'Andrew',34,1212,44,2),(2,'Michel',41,1202,54,1),(505,'Flintof',23,12112,15,8)
INSERT INTO dbo.tblEmployee VALUES(12,'Jack',34,1222,24,4),(32,'Rashid',44,1242,24,4),(55,'Denial',24,1292,21,9)
INSERT INTO dbo.tblEmployee VALUES(9,'Andrew',34,1212,44,2),(505,'Flintof',23,12112,15,8)
INSERT INTO dbo.tblEmployee VALUES(12,'Jack',34,1222,24,4),(55,'Denial',24,1292,21,9)
INSERT INTO dbo.tblEmployee VALUES(9,'Andrew',34,1212,44,2),(2,'Michel',41,1202,54,1),(505,'Flintof',23,12112,15,8)
INSERT INTO dbo.tblEmployee VALUES(12,'Jack',34,1222,24,4),(32,'Rashid',44,1242,24,4),(55,'Denial',24,1292,21,9)
INSERT INTO dbo.tblEmployee VALUES(9,'Andrew',34,1212,44,2),(2,'Michel',41,1202,54,1),(505,'Flintof',23,12112,15,8)
INSERT INTO dbo.tblEmployee VALUES(12,'Jack',34,1222,24,4),(32,'Rashid',44,1242,24,4),(55,'Denial',24,1292,21,9),(32,'Rashid',44,1242,24,4)
INSERT INTO dbo.tblEmployee VALUES(9,'Andrew',34,1212,44,2),(2,'Michel',41,1202,54,1),(505,'Flintof',23,12112,15,8),(2,'Michel',41,1202,54,1)
Go

Let’s have a look at the table data now.

SELECT * FROM  dbo.tblEmployee 
Go

OUTPUT: EmpId EmpName Age EmpCode DeptId DesignationId ----------- --------------- ---- ----------- ----------- ------------- 12 Jack 34 1222 24 4 32 Rashid 44 1242 24 4 55 Denial 24 1292 21 9 9 Andrew 34 1212 44 2 2 Michel 41 1202 54 1 505 Flintof 23 12112 15 8 12 Jack 34 1222 24 4 32 Rashid 44 1242 24 4 55 Denial 24 1292 21 9 9 Andrew 34 1212 44 2 505 Flintof 23 12112 15 8 12 Jack 34 1222 24 4 55 Denial 24 1292 21 9 9 Andrew 34 1212 44 2 2 Michel 41 1202 54 1 505 Flintof 23 12112 15 8 12 Jack 34 1222 24 4 32 Rashid 44 1242 24 4 55 Denial 24 1292 21 9 9 Andrew 34 1212 44 2 2 Michel 41 1202 54 1 505 Flintof 23 12112 15 8 12 Jack 34 1222 24 4 32 Rashid 44 1242 24 4 55 Denial 24 1292 21 9 32 Rashid 44 1242 24 4 9 Andrew 34 1212 44 2 2 Michel 41 1202 54 1 505 Flintof 23 12112 15 8 2 Michel 41 1202 54 1 (30 rows affected)

Lets use Distinct with select statement.

SELECT DISTINCT * FROM  dbo.tblEmployee 
Go

OUTPUT: EmpId EmpName Age EmpCode DeptId DesignationId ----------- ---------------- ---- ----------- ----------- ------------- 2 Michel 41 1202 54 1 9 Andrew 34 1212 44 2 12 Jack 34 1222 24 4 32 Rashid 44 1242 24 4 55 Denial 24 1292 21 9 505 Flintof 23 12112 15 8 (6 rows affected)

So we have only 6 unique record(s) out of 30 rows in database.

To get rid off such redundancy, we will count all duplicate rows using ROW_NUMBER() as follows.

SELECT ROW_NUMBER() OVER(PARTITION BY EmpId,EmpName,Age,EmpCode,DeptId,DesignationId ORDER BY EmpId) RNo,* 
FROM  dbo.tblEmployee 
Go

This will results as follows.

OUTPUT: RNo EmpId EmpName Age EmpCode DeptId DesignationId --------- ----------- ----------------- ---- ----------- ----------- ------------- 1 2 Michel 41 1202 54 1 2 2 Michel 41 1202 54 1 3 2 Michel 41 1202 54 1 4 2 Michel 41 1202 54 1 5 2 Michel 41 1202 54 1 1 9 Andrew 34 1212 44 2 2 9 Andrew 34 1212 44 2 3 9 Andrew 34 1212 44 2 4 9 Andrew 34 1212 44 2 5 9 Andrew 34 1212 44 2 1 12 Jack 34 1222 24 4 2 12 Jack 34 1222 24 4 3 12 Jack 34 1222 24 4 4 12 Jack 34 1222 24 4 5 12 Jack 34 1222 24 4 1 32 Rashid 44 1242 24 4 2 32 Rashid 44 1242 24 4 3 32 Rashid 44 1242 24 4 4 32 Rashid 44 1242 24 4 5 32 Rashid 44 1242 24 4 1 55 Denial 24 1292 21 9 2 55 Denial 24 1292 21 9 3 55 Denial 24 1292 21 9 4 55 Denial 24 1292 21 9 5 55 Denial 24 1292 21 9 1 505 Flintof 23 12112 15 8 2 505 Flintof 23 12112 15 8 3 505 Flintof 23 12112 15 8 4 505 Flintof 23 12112 15 8 5 505 Flintof 23 12112 15 8 (30 rows affected)

Finally we will delete all rows having Rno greater than 1 Using CTE as Follows.

;WITH CTE AS(																												  
SELECT ROW_NUMBER() OVER(PARTITION BY EmpId,EmpName,Age,EmpCode,DeptId,DesignationId ORDER BY EmpId) RNo,* 
FROM  dbo.tblEmployee)
DELETE FROM CTE WHERE CTE.RNo>1
Go

OUTPUT:
(24 rows affected)

Let select all rows.

SELECT * FROM dbo.tblEmployee ORDER BY 1
Go

OUTPUT: EmpId EmpName Age EmpCode DeptId DesignationId ----------- ----------- ---- ----------- ----------- ------------- 2 Michel 41 1202 54 1 9 Andrew 34 1212 44 2 12 Jack 34 1222 24 4 32 Rashid 44 1242 24 4 55 Denial 24 1292 21 9 505 Flintof 23 12112 15 8 (6 rows affected)

If you have any query or question or topic on which, we might have to write an article for your interest or any kind of suggestion regarding this post, Just feel free to write us, by hit add comment button below or contact via Contact Us form.

Your feedback and suggestions will be highly appreciated. Also try to leave comments from your valid verified email account, so that we can respond you quickly.

Add Comment

Sql Server delete all duplicate rows

Comment By: {{c.Author}} On: {{c.CreatedDate|date:'dd/MM/yyyy'}} / Reply

Categories

Tags

Archives