Say I have a table that has three columns, dateReg, email, age. I want to find duplicates and display all the columns. If I do this it shows duplicate emails, just showing that column: Code: SELECT email, COUNT(email) AS NumOccurrences FROM users GROUP BY email HAVING ( COUNT(email) > 1 ) But if I say Code: Select email, dateReg, ... and the rest of the code below this line is the same, it gives me an error Msg 8120, Level 16, State 1, Line 1 Column dateReg' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. How do I display duplicates, but with all columns of the duplicate entry showing?
Duh! Code: .. group by email, dateReg having(count(email) > 1) Sorry I thought I tried this and failed. Getting tired...
I understand what the problem is. Since it can only return one row, but there are multiple occurrences, which of the rest of the row is it supposed to display since they are all different? This is ridiculous. Maybe I will just read the table into C# and do it all there. The problem is that this is the sort of thing why I was using SQL!!! LMAO
Say you are storing news in a database, and each story is indexed by date:time. How in SQL do you return all the columns (one column being the actual text) pertaining to a given day regardless of time? This has to be easy!
This template got me closer, but still no cigar Code: SELECT Delegate.DelegateID, Delegate.FirstName, Delegate.LastName FROM Delegate INNER JOIN (SELECT FirstName, LastName FROM Delegate GROUP BY FirstName, LastName HAVING Count(DelegateID) > 1 ) AS Duplicates ON Delegate.FirstName = Duplicates.FirstName AND Delegate.LastName = Duplicates.LastName
Got it. Remove the second field from the temp table Duplicates. Strange, I don't know why that works, but I don't care for now.
this is ok - basically creating intermediate view on the fly. note that your request does not make much sense from the dba standpoint... why to show all duplicate rows in the first place? why not just one record for each duplication?...maybe with a counter column added? and that's after ignoring issue of indexes and primary keys... p.s. also you should label all fields and tables with aliases in order not to mess up...
I am not sure I follow. Are you saying that it makes no sense to show duplicate rows? In the example I gave before, it makes perfect sense. Say you are storing news items in a database. Say your columns are GUID(PK), DateTime, Headline, NewsStory, Author. Say you want to return all the news stories and the author for a single day by a given author. Clearly this is a table that is a bunch of duplicate entries. Say you want to see all the rows that have multiple authors for a single day (any day),where select shows only the date part (not time) and the author.