
December 12, 2006



Patrick Herron

I enjoyed your article and found it interesting. I feel the lack of positive cases of terrorism does, um, undermine the ability to perform supervised learning.

There are certainly conceptual problems in using mining to provide a sentinel for further humint operations. Your analogy to the medical decision process is a misleading one, however, on a couple of different levels. For one, it is machines, not people, that are violating people's privacy. If a stone sees me naked I don't feel my privacy has been violated, namely because a stone has no consciousness. It isn't aware and cannot act. A machine does have the capacity to enable a person to violate my privacy in the future, and that's a problem, certainly, but it's a problem of managing data and privacy. There are two other aspects of the example of the 3 million false positives that are essentially misleading. For one, having a biopsy is far more painful than having your email parsed by a machine behind your back. We don't mine for cancer, a far greater threat to our well-being. This leads me to the second essentially misleading point: the costs and benefits are not being properly analyzed in the analogy. Not only are they bootstrapping us to a false impression of cost, they are ignoring the easy use of a cost matrix. Technically speaking, in the case of actionable intelligence you want extremely high precision rather than recall if your goal is to prevent terrorism without violating people's rights or wasting tons of humint resources.

A couple more things: I think what we're talking about here is not really just data mining but also text mining. I think, perhaps for different reasons than Marti Hearst does, that data mining and text mining are not equivalents. A sentinel system is going to be built on a combination of text mining and data mining, not data mining per se, though of course data mining and text mining are hardly disjoint. Also, supervised learning is not the full extent of mining. Unsupervised learning could be utilized to learn more about populations that foster greater rates of terrorist acts.

Mining to generate actionable intelligence is a task that faces numerous barriers. But foremost among the barriers is certainly the lack of frequency of terrorist acts. Great point. Nice work.
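
The precision-versus-recall and cost-matrix point in the comment above can be made concrete with a minimal sketch. Everything numeric here is a hypothetical assumption chosen for illustration (the population size, the number of real threats, the detection rates, and the cost weights); none of it comes from the article or from any real system.

```python
# Illustrative sketch only: all rates, counts, and costs are hypothetical.

def confusion_counts(actual_positives, actual_negatives, recall, false_positive_rate):
    """Return expected (tp, fp, fn, tn) counts for a screening rule."""
    tp = actual_positives * recall
    fn = actual_positives - tp
    fp = actual_negatives * false_positive_rate
    tn = actual_negatives - fp
    return tp, fp, fn, tn


def precision(tp, fp):
    """Fraction of flagged cases that are real positives."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def expected_cost(counts, cost_matrix):
    """Weight each cell of the confusion matrix by its assumed cost."""
    tp, fp, fn, tn = counts
    return (tp * cost_matrix["tp"] + fp * cost_matrix["fp"]
            + fn * cost_matrix["fn"] + tn * cost_matrix["tn"])


# Hypothetical population: 300 real threats among 300 million people.
POSITIVES, NEGATIVES = 300, 300_000_000

# A "broad" rule tuned for recall and a "strict" rule tuned for precision.
broad = confusion_counts(POSITIVES, NEGATIVES, recall=0.99, false_positive_rate=0.01)
strict = confusion_counts(POSITIVES, NEGATIVES, recall=0.50, false_positive_rate=0.000001)

# Hypothetical cost matrix: one unit per wasted investigation,
# 10,000 units per missed threat, nothing for correct outcomes.
COSTS = {"tp": 0.0, "fp": 1.0, "fn": 10_000.0, "tn": 0.0}

if __name__ == "__main__":
    for name, counts in (("broad", broad), ("strict", strict)):
        tp, fp, fn, tn = counts
        print(f"{name}: false positives={fp:,.0f} "
              f"precision={precision(tp, fp):.6f} "
              f"cost={expected_cost(counts, COSTS):,.0f}")
```

Under these assumed numbers the broad rule flags about three million innocent people and has a precision of roughly one in ten thousand, while the strict rule misses half the real cases but has a precision of about one in three. Which rule is "better" depends entirely on the cost matrix, which is the point of the comment: change the assumed cost of a missed threat or of a wasted investigation and the ranking can flip.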

Jaap Vink

Great article, but I do agree with Patrick that data mining, especially in intelligence and law enforcement, is more than supervised learning only. Unsupervised techniques and text mining are very much part of such an environment. Also, using a combination of unsupervised and supervised learning to emulate the thinking process of intelligence officers can help organize information in a very efficient and effective manner.
Colleen McCue recently published her book (Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis, publ. Butterworth-Heinemann), based on her practical experience in this area, which covers both the drawbacks of infrequent results and the implications of false positives. She illustrates a pragmatic approach based on several cases from the "real world".
Again: nice article, because we do need to be very careful about promising that data mining is the silver bullet for all intelligence issues. It is a very effective analytic approach that can complement other activities.

Ernie Chan

Jeff: Great article on data mining and counterterrorism. I used to be a researcher in statistical pattern recognition at IBM's Human Language Technologies Group but now work in financial prediction. I wrote an article on my blog some weeks ago espousing the view that data mining and AI are not suitable for financial markets prediction either, for very similar reasons. Best, Ernie

Stephen Taylor

First, my comments are my own and don't reflect my company's position. I was glad to see that you tackled the subject and I think your conclusions are good. Many of the post-9/11 actions have been reactions without logical basis.
But back to your article, I don't believe you sold your analysis very well because of a weak definition of predictive data mining and because you are totally excluding the possibility of a predictive component in data mining.
If the connections that you listed were less obvious to the human mind, maybe data mining could reveal them. Using that data, an experienced law enforcement person or analyst might then provide the predictive "ah ha" that leads to a terrorist plot. As computer systems become more "intelligent", I think it is reasonable to assume that computers will be able to deduce threats from patterns, at least of a general nature. For example, flying lessons mean an aircraft attack.
In conclusion, I worry about the privacy issues and about the ability of most analysts to construct useful queries that don't waste time and resources, but I think data mining is here to stay. The US needs an umbrella law that describes privacy and establishes some sort of template for information that can be collected and under what circumstances. Our current laws are directed at specific professions such as medical or at reporting regulations. There is no general privacy law, as I believe Australia has.


Hi Jeff. Thanks for your post about "Effective Counter-Terrorism and the Limited Role of Predictive Data Mining". Your article gave me some great ideas. Thanks.
