
December 29, 2006

It Turns Out Both Bad Data and a Teaspoon of Dirt May Be Good For You

Comments


Bruce Wallace

I have worked in years past with vendors like Identity Systems that have software to fuzzy match names of people and addresses. The counter-intuitive lesson I learned was that "dirty" data should not be converted to some canonical form, but that the dirty data should be kept to support clustering (and future re-clustering).

In my recent study of philosophy, I found that this is something deep and basic, not a tweak specific to identity management. See...

http://existentialprogramming.blogspot.com/search?q=superman
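The lesson about keeping "dirty" data can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sample names, the `similarity` and `cluster` helpers, and the use of Python's standard `difflib` as a stand-in matcher. It is not how Identity Systems or any other vendor's product works. The point is only that the raw strings are never overwritten with a canonical form, so the same records can be clustered again later under a different rule.

```python
from difflib import SequenceMatcher

# Hypothetical raw records, stored exactly as observed; nothing is normalized in place.
RAW_NAMES = ["Jon Smith", "John Smith", "Jhon Smith", "Joan Smyth", "Mary Jones"]


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (a stand-in for a real fuzzy matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster(names, threshold):
    """Single-link clustering: two names share a cluster if a chain of
    pairwise similarities >= threshold connects them."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return sorted(groups.values())


# Clusters are derived views over the untouched raw data, so they can be
# recomputed at any time with a stricter or looser threshold.
strict = cluster(RAW_NAMES, 0.95)
loose = cluster(RAW_NAMES, 0.80)
```

Because `RAW_NAMES` is left as observed, moving from `strict` to `loose` (or to a better matcher later) needs no recovery of information that a one-time canonicalization would have thrown away.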
