Wednesday, October 27, 2021

    Delete Duplicate Emails

    Must Read

    We started this site to inspire young minds to motivate and encourage them towards Programming Language. In this site you will get programming tutorials, tech, programming facts, programming fun and programming blogs.

    Write a SQL query to delete all duplicate email entries in a table named,Person keeping only unique emails based on its smallestId.

    | Id | Email            |
    | 1  | [email protected] |
    | 2  | [email protected]  |
    | 3  | [email protected] |
    Id is the primary key column for this table.

    For example, after running your query, the above Person table should have the following rows:

    | Id | Email            |
    | 1  | [email protected] |
    | 2  | [email protected]  |


    Approach 1:
    DELETE p1
    FROM Person p1, Person p2
    WHERE > AND p1.Email = p2.Email
    (If the above condition like :< AND p1.Email=p2.Email, will keep highest id and removes other duplicate emails)
    Approach 2:
    delete from Person
    where Id not in
      (select * from
                             (select min(Id)
                                 from Person
                                 group by Email)
      as t)

    Update: Since people Googling for removing duplicates end up here
    Although the OP’s question is about DELETE, please be advised that using INSERT and DISTINCT is much faster. For a database with 8 million rows, the below query took 13 minutes, while using DELETE, it took more than 2 hours and yet didn’t complete.

    INSERT INTO tempTableName(cellId,attributeId,entityRowId,value)
        SELECT DISTINCT cellId,attributeId,entityRowId,value
        FROM tableName;


    Latest Articles

    More Recipes Like This