
December 29, 2006

Listed below are links to weblogs that reference It Turns Out Both Bad Data and a Teaspoon of Dirt May Be Good For You:

Comments


Bruce Wallace

In years past I worked with vendors such as Identity Systems, whose software fuzzy-matches the names and addresses of people. The counter-intuitive lesson I learned was that "dirty" data should not be converted to a single canonical form; instead, the dirty data should be kept to support clustering (and future re-clustering).

In my recent study of philosophy, I found that this is something deep and basic, not just a specific identity-management tweak. See:

http://existentialprogramming.blogspot.com/search?q=superman
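
As a rough illustration of the point in the comment above, here is a minimal Python sketch of clustering over raw, un-normalized records. It is not the method used by Identity Systems or any other vendor: the standard library's `difflib.SequenceMatcher` stands in for a real fuzzy matcher, and the names `similarity`, `cluster`, `raw_records`, and both thresholds are hypothetical choices made only for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] between two raw strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster(records: list[str], threshold: float) -> list[list[str]]:
    """Greedy single-pass clustering over the raw records.

    A record joins the first cluster that has any member at or above the
    threshold; otherwise it starts a new cluster. The inputs are never
    rewritten into a canonical form.
    """
    clusters: list[list[str]] = []
    for rec in records:
        for group in clusters:
            if any(similarity(rec, member) >= threshold for member in group):
                group.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


# Hypothetical "dirty" records, stored exactly as they arrived.
raw_records = ["Jon Smith", "John Smith", "John Smyth", "Jane Doe"]

# Because the raw strings are kept, the same data can be re-clustered later
# under a different matching rule without losing any of the variants.
strict = cluster(raw_records, threshold=0.95)
loose = cluster(raw_records, threshold=0.8)
```

Had the three "Smith" variants been collapsed into one canonical spelling on arrival, the stricter grouping could no longer be recovered; keeping the raw strings leaves both groupings available.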
