I love the fact that the best solutions are almost always the simplest!
I spend a lot of time sanitising data. I use Excel a lot of the time to perform most of the work — but I am slowly starting to use more and more SQL in my MySQL databases to do the work.
Today after importing a few hundred records of CSV data into a MySQL table I wanted to check if I had any duplicate values in a non-indexed field. After imagining querying all kinds of self-join tables, I came across this SQL query:
SELECT fieldname, COUNT( fieldname )
GROUP BY fieldname
HAVING COUNT( fieldname ) > 1