Question
This question already has answers here :
[Remove duplicate rows in MySQL](/questions/3311903/remove-duplicate-rows-in- mysql) (28 answers)
Closed 3 years ago.
How would I delete all duplicate data from a MySQL Table?
For example, with the following data:
SELECT * FROM names;
+----+--------+
| id | name |
+----+--------+
| 1 | google |
| 2 | yahoo |
| 3 | msn |
| 4 | google |
| 5 | google |
| 6 | yahoo |
+----+--------+
I would use SELECT DISTINCT name FROM names;
if it were a SELECT
query.
How would I do this with DELETE
to only remove duplicates and keep just one
record of each?
Answer
Editor warning: This solution is computationally inefficient and may bring down your connection for a large table.
NB - You need to do this first on a test copy of your table!
When I did it, I found that unless I also includedAND n1.id <> n2.id
, it
deleted every row in the table.
-
If you want to keep the row with the lowest
id
value:DELETE n1 FROM names n1, names n2 WHERE n1.id > n2.id AND n1.name = n2.name
-
If you want to keep the row with the highest
id
value:DELETE n1 FROM names n1, names n2 WHERE n1.id < n2.id AND n1.name = n2.name
I used this method in MySQL 5.1
Not sure about other versions.
Update: Since people Googling for removing duplicates end up here
Although the OP's question is about DELETE
, please be advised that using
INSERT
and DISTINCT
is much faster. For a database with 8 million rows,
the below query took 13 minutes, while using DELETE
, it took more than 2
hours and yet didn't complete.
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
SELECT DISTINCT cellId,attributeId,entityRowId,value
FROM tableName;