Feb 222014
 
Article MySQL

Sometimes we may find that there are unwanted duplicate records in one of the tables of our database. Often we notice this issue when trying to create a unique key based on one or several of the fields in the table.

This post explains a couple of simple methods that can be used to remove duplicate rows from a mysql table, keeping only one of each group of duplicates.

Method 1 – Creating a unique index that uses the IGNORE modifier

Let’s suppose there is a ‘leads’ table in our database, with the following content:

At a given point, we might decide to keep only one record per lead, because it is enough to have an email address per user. But the name “john smith” is duplicated. If we try to create a unique key on the fields (name1,name2), we get an error:

To avoid this error, we can delete the duplicated records at the time the unique index is created, using the IGNORE modifierĀ of the “ALTER TABLE” command:

In some versions of MySQL, this procedure only works for MyISAM tables. If our table uses the InnoDB storage engine, we can convert it to MyISAM, remove the duplicats and convert it back to InnoDB:

Method 2 – Create a temporary table

In the previous case, we have seen that using the IGNORE clause has the desired effect of eliminating duplicates, but we have no control on which the record to keep and which other records to delete in a set of duplicates. Using a temporary table gives us more flexibility. First, we create an empty table with the same structure as the original:

next, we create the unique index on the new table:

next, the new table is loaded choosing only one record from the set of duplicates. En the select sentence, we can choose the value we want to keep for each of the fields that are not part of the unique index. For instance, we could load the table choosing always the shorter email in each group of duplicates:

Finally, the original table is dropped and the new table is renamed:

Index of posts related to MySQL

 Posted by at 9:05 pm

 Leave a Reply

(required)

(required)