A few days ago I blogged a high-level synopsis of what went wrong (enterprise amnesia) versus what our national security community needs (enterprise intelligence). In short, a smart system, among other things, would be expected to recognize if someone with an active visa later becomes a subject of concern. In such a system, the visa would be immediately reviewed for possible revocation.
Since this was a “systemic failure,” a number of things may need to be fixed.
Focusing on the technology issues, there are two things that absolutely must happen to not only catch more idiots, but more importantly arm the nation with the tools necessary to detect and preempt very sophisticated bad guys.
I am making my specific technical recommendations as a Christmas wish list. When thinking about a Christmas list, it is important to ensure the list is well thought out … as there is only so much money, and some toys last longer than others. And, like my kids eventually figured out, if one only asks for a few things that are within the budget, this dramatically increases the odds that they will get everything they ask for (as long as I can order it on-line, aka COTS).
Jeff Jonas’ Christmas Wish List
1. An identity resolved card catalog. Analysts (and/or their systems) at organizations like NCTC should be able to take one look in one place and determine what is known about someone – much in the same way the library has one card catalog. However, unlike the library, because this card catalog is identity resolved, cards about the same person are rubber-banded together. This is worth wishing for because it means that when one searches and finds a card, one may very well find a bundle – a handful of related cards, each pointing to enterprise documents. Imagine that! Search for a name and date of birth, find a person, and find records in the enterprise related to them that had neither a name nor a date of birth (e.g., only an email address). What is the value of this type of enterprise “discoverability”? Priceless. More here: It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness.
Not to be greedy, but this index needs to be updated in real time and provide real-time responses when queried. Otherwise, the next item on the wish list will not work so well. Fortunately, real-time indexes like this are actually more efficient than batch systems. More about this here: Accumulating Context: Now or Never.
Fortunately, the policy calling for “discoverability” in the intelligence community is already in place. See: ICD501 - Intelligence Community Directive 501.
2. The data must find the data and the relevance must find the user. Every time an organization receives a new piece of data … the organization has just learned something. With each new arriving piece of data, the system must immediately ask: “How does this relate to what is already known?” This discoverability question is handled by wish list item #1. This is “data finds data”: the new data discovers whether the enterprise has other related data, and this happens as the data arrives. For example, if a new subject is added to a bad guy database, the system automatically discovers that this person has a visa and publishes this insight, e.g., by sending an alert to the organization that investigates and revokes visas. Of course, for this to work the identity resolved catalog would also need to contain active visas. Yes, I am saying the index contains pointers to both subjects of interest and visa holders, among many other things – much in the same way a library card catalog would contain books on anthropology, travel, science, and so on. This is not difficult. More here: Data Finds Data.
With just these two (2) items, at least an organization will be able to make some sense of what it knows … and in time to do something about it.
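To make the two wish list items concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a description of any real system: the `Card` and `Catalog` names, the source labels "visa", "watchlist" and "tip", and above all the matching rule, which naively bundles any two cards that share one identifying feature. Real identity resolution is far more careful than this. The point is only to show the shape of the idea: each arriving card is itself the query, it is rubber-banded to related cards on arrival, and a visa/watchlist overlap is published immediately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    source: str          # system that holds the underlying document (hypothetical labels)
    doc_id: str          # pointer to that document
    features: frozenset  # identifying (kind, value) pairs found on the record


class Catalog:
    """Toy identity resolved index: cards sharing any feature form one bundle."""

    def __init__(self):
        self._bundle_of = {}   # feature -> bundle id
        self._bundles = {}     # bundle id -> list of cards
        self._alerted = set()  # (visa doc, watchlist doc) pairs already published
        self._next_id = 0

    def add(self, card):
        """Index one arriving card and return any newly discovered alerts."""
        # The arriving data is the query: which bundles share a feature with it?
        matched = sorted({self._bundle_of[f] for f in card.features
                          if f in self._bundle_of})
        if matched:
            target = matched[0]
        else:
            target = self._next_id
            self._next_id += 1
            self._bundles[target] = []
        # A new card may connect bundles that were previously separate.
        for other in matched[1:]:
            self._bundles[target].extend(self._bundles.pop(other))
        self._bundles[target].append(card)
        for c in self._bundles[target]:
            for f in c.features:
                self._bundle_of[f] = target
        # Relevance finds the user: publish each visa/watchlist pair once.
        bundle = self._bundles[target]
        visas = [c.doc_id for c in bundle if c.source == "visa"]
        watch = [c.doc_id for c in bundle if c.source == "watchlist"]
        new = [(v, w) for v in visas for w in watch
               if (v, w) not in self._alerted]
        self._alerted.update(new)
        return new

    def search(self, feature):
        """Return the whole bundle of cards for one identifying feature."""
        bundle_id = self._bundle_of.get(feature)
        return [] if bundle_id is None else list(self._bundles[bundle_id])


catalog = Catalog()
person = ("name_dob", "JOHN DOE|1980-01-01")  # made-up example values
email = ("email", "jd@example.org")

alerts_visa = catalog.add(Card("visa", "V-1", frozenset({person, email})))
alerts_tip = catalog.add(Card("tip", "T-7", frozenset({email})))
alerts_watch = catalog.add(Card("watchlist", "W-9", frozenset({person})))

print(alerts_watch)
print(sorted(c.doc_id for c in catalog.search(person)))
```

In this sketch the tip record carries only an email address, yet a search on name and date of birth returns it, because the visa card links the two. The alert fires at the moment the watchlist card arrives; nobody had to think of asking the question.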
BONUS SECTION for prudent Christmas shoppers. Here are some toys not worth placing on your wish list as these will likely break, get discarded and/or may contain unsafe levels of lead:
1. Profiling the behavior of air travelers. Modifying existing systems to do a better job profiling based on one-way tickets, no checked luggage, and paying by cash is a waste of time. For one thing, it is too widely talked about in the press. And to avoid detection a terrorist will no doubt be willing to pay an extra $1,200 for a round trip ticket, will waste a $40 suitcase, and will find a way to use a credit card (like get a debit card or steal your identity). Worse, so many folks purchase one-way tickets (like me) and check no luggage (like me) the number of false positives will be impressive.
2. Investment in middleware that automates federated search. The federated search approach is somewhat akin to roaming the library halls looking for a book, instead of using the index. Today, analysts often face this dilemma – manually selecting which system to search, one hallway at a time. Note: There is no army of any size big enough to solve this problem. Unfortunately, this leads some folks to wish they had a smart middleware layer that would take a user’s query and perform the federated search automatically, bringing the results of all these disparate systems together into a single consolidated user response. Be advised, no amount of investment can fix federated search. While there are a number of show stoppers that make this a bad idea, one of them is this: for data to find data, every piece of arriving data is the query. Imagine hammering countless enterprise systems with that kind of volume while these systems are busy doing their day job. More specifics about why this is nothing to wish for are explained here: Federated Discovery vs. Persistent Context – Enterprise Intelligence Requires the Later.
3. A point-specific solution to fix this recent lapse. If the current system is “upgraded” to specifically address this latest scenario, it should catch this one type of low hanging fruit. Bad news. We need a system that detects sophisticated plots, not just idiots. What we really need is a more general solution designed to commingle data from all-source intelligence collections, weaving this data together into context, enabling enterprise discovery and insight. Such a system not only finds the obvious (idiots and similar circumstances); more importantly, it is the only way to find very weak signals and the non-obvious signatures of highly skilled bad actors.
On a related note: Have you heard of the Darwin Awards? Terrorists dumb enough to use a match to light a fuse in public will also likely run out of gas on the way to the operation. Point being, the system must be able to detect and preempt very smart bad guys with ever changing tradecraft and attack vectors.
4. Standardization of existing systems and data. This will take forever and cost too much in part because there are too many systems to reengineer, and if that is not hard enough, how are we going to get our foreign partners to adopt our standard? Be assured that standardizing everything first is not required. More about this thinking here: Scalability and Sustainability in Large Information Sharing Systems.
5. Just more analysts. If analysts feel overwhelmed by data today, just wait until next year and see how much harder it is. We must change the paradigm. With more advanced analytics making sense of the data, analysts will be substantially more efficient, maybe 10x, maybe 100x – a true force multiplier. Only then, after the analysts are truly enabled, will an organization be able to properly determine the number of analysts needed.
These five (5) wish list items, and probably a slew of others, are not worth much attention as the cost/benefit will not impress anyone (except the bad guys).
RELATED PAPERS:
Markle Foundation – Nation At Risk: Policy Makers Need Better Information to Protect the Country
CATO Institute – Effective Counterterrorism and the Limited Role of Predictive Data Mining
IEEE – Threat and Fraud Intelligence: Las Vegas Style
OTHER RELATED POSTS:
The Christmas Day Intelligence Failure – Part I: Enterprise Amnesia vs. Enterprise Intelligence
Nation At Risk: Policy Makers Need Better Information to Protect the Country
Intelligent Organizations – Assembling Context and The Proof is in the Chimp!
Enterprise Intelligence – My Presentation at the third Annual Web 2.0 Summit
Enterprise Intelligence: Conference Proceedings from TTI/Vanguard (December 2006)
What Do You Know? Introducing Perpetual Analytics
You Won’t Have to Ask -- Data Will Find Data and Relevance Will Find the User
Sensing Importance: Now or Never
More Data is Better, Proceed With Caution
Scalability and Sustainability in Large Information Sharing Systems
Big Breakthrough in Performance: Tuning Tips for Incremental Learning Systems
Nice work Jeff -- you may be interested in my recent op-ed discussing the commonality in a series of events:
Systemic failures, by design
http://kyield.wordpress.com/
Posted by: Kyield | January 22, 2010 at 08:02 AM
I agree with Data finds Data, but I would like to know about "Data builds rules to find data". This would help not only identify consumers of data but also evolve the rules that identify data. For example, some intelligence gathering would result in data finding out that it needs to be shared with USCIS. But what if the rule of sharing does not evolve? The rule engine should be self-feeding and ever-evolving. I would expect my rule engine to feed on all events and pass them to my CEP, while at the same time my CEP evolves itself based on the data set. New rules should evolve instead of being interpreted by humans.
Let me give an example. Mr. Abdulmutallab was identified as a potential threat. So some agency needs to create the data for him. Let's say the CIA gathered info and entered it into their system. Once data is created, then "Data finds Data". But someone needs to create the data. What if the travel pattern of Mr. Mutallab were an indication? The system should feed on all possible events known for Mr. Mutallab and create an algorithm. It should try to see if there is any commonality with other events. For example, once Mr. Mutallab was flagged, everyone who was not a citizen and in the same country as Mr. Mutallab would need to be flagged for pattern matching. What is the rule identified here? Foreigners or non-residents. All these data points should feed into the rule engine to be extracted for creating rules for identifying consumers of data. With so many complex events happening, it is almost impossible for a human to interpret events and correlate them. Moreover, non-events would be impossible to identify. What if someone went under the radar for years and then emerges in some crime? Such patterns need to be identified by the system. Events need to be identified. They may be in order, out of order, or randomly in and out of radar. Flagging Mr. Mutallab meant waiting for his dad to help the agency. Aren't the erratic stays of Mr. Mutallab good enough to be flagged? In today's purposed life, any erratic behaviour needs to be flagged. For example, a policeman not only uses radar to find people who overspeed but also watches for people changing lanes erratically. We need to build systems which can identify such individuals. Since this would be trying to identify an individual way before any credible intelligence, near real time is as good as real time. Potential individuals and their patterns would continue to evolve. I guess my thoughts aren't structured, but yeah, there is a random pattern here. :)
Posted by: Tarun Srivastava | February 16, 2010 at 11:31 AM