7

I currently have a table called People. Within this table there are thousands of rows of data which follow the below layout:

gkey |    Name |  Date       | Person_Id
1    |    Fred |  12/05/2012 | ABC123456
2    |    John |  12/05/2012 | DEF123456
3    |    Dave |  12/05/2012 | GHI123456
4    |    Fred |  12/05/2012 | JKL123456
5    |    Leno |  12/05/2012 | ABC123456

If I execute the following:

SELECT [PERSON_ID], COUNT(*) TotalCount
FROM [Database].[dbo].[People]
GROUP BY [PERSON_ID]
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

I get a return of:

Person_Id     | TotalCount
ABC123456     | 2

Now I would like to remove just one row of the duplicate values so when I execute the above query I return no results. Is this possible?

a_horse_with_no_name
  • 497,550
  • 91
  • 775
  • 843
LaLa
  • 301
  • 2
  • 17
  • I can't see duplicate value, Would you like to remove Fred or Leno (who have the same Person_ID) ? Is there a rule for taking such decision ? – Imane Fateh Jul 04 '13 at 11:12

4 Answers4

7
WITH a as
(
SELECT row_number() over (partition by [PERSON_ID] order by name) rn
FROM [Database].[dbo].[People]
)
DELETE FROM a
WHERE rn = 2
t-clausen.dk
  • 42,087
  • 11
  • 52
  • 90
0

Try this

DELETE FROM [People]
WHERE gkey IN
(
   SELECT MIN(gkey)
   FROM [People]
    GROUP BY [PERSON_ID]
   HAVING COUNT(*) > 1
)

You can use either MIN or Max

bvr
  • 4,656
  • 18
  • 23
0
DELETE FROM PEOPLE WHERE gkey=
(SELECT MAX(TABLE1.gkey) FROM (SELECT P.gkey, A.PERSON_ID,A.TotalCount FROM People P,(SELECT [PERSON_ID], COUNT(*) TotalCount
FROM [Database].[dbo].[People]
GROUP BY [PERSON_ID]
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC) A WHERE P.gkey=A.gkey) TABLE1 ) TABLE2
Mari
  • 8,739
  • 8
  • 30
  • 33
0

Use Top keyword with delete;

DELETE TOP(1) FROM [People]
WHERE Person_Id IN
(
   SELECT MIN([PERSON_ID])
   FROM [People]
   GROUP BY [PERSON_ID]
   HAVING COUNT(*) > 1

)

The query is same as posted by Vassy with just Top(1) addded...