Find the duplicate records with combination of two column

ghz 8months ago ⋅ 66 views

I have table (Project_Supplier_Status) in MySQL like below:

ProjectsupplierStatusTrack_Id
Pro1Sup1P1
Pro1Sup1P2
Pro1Sup1P3
Pro1Sup1P4
Pro5Sup5P5
Pro5Sup5P6
Pro5Sup6P7

Problem statement:

  1. If all duplicate Pro1 (project) have same Sup1 (supplier), then we have to keep one record of them as status P and discard the other records as status F.
  2. If all duplicate Pro5 (project) have multiple supplier (Sup5, Sup6), then we have to discard all the records as status F.

I need the Mysql query to execute above 2 operations on 100 000 records.

UPDATE

The desired output:

ProjectsupplierStatusTrack_Id
Pro1Sup1P1
Pro1Sup1F2
Pro1Sup1F3
Pro1Sup1F4
Pro5Sup5F5
Pro5Sup5F6
Pro5Sup6F7

Answers

You can achieve this using a combination of SQL queries to identify the records that need to be updated and then update them accordingly. Here's how you can do it:

  1. Update records for Pro1 projects with the same supplier:
UPDATE Project_Supplier_Status p1
JOIN (
    SELECT Project, supplier
    FROM Project_Supplier_Status
    GROUP BY Project, supplier
    HAVING COUNT(*) > 1
) p2 ON p1.Project = p2.Project AND p1.supplier = p2.supplier
SET p1.Status = CASE WHEN p1.Track_Id = (
        SELECT MIN(Track_Id)
        FROM Project_Supplier_Status
        WHERE Project = p1.Project AND supplier = p1.supplier
    ) THEN 'P' ELSE 'F' END;
  1. Update records for Pro5 projects with multiple suppliers:
UPDATE Project_Supplier_Status p1
JOIN (
    SELECT Project
    FROM Project_Supplier_Status
    GROUP BY Project
    HAVING COUNT(DISTINCT supplier) > 1
) p2 ON p1.Project = p2.Project
SET p1.Status = 'F';

These queries will update the Status column based on the conditions specified. Make sure to run these queries in the order presented above.