
May 24, 2006

Responsible Innovation: Staying Engaged With the Privacy Community

While attending Esther Dyson’s PC Forum earlier this year I had a brief conversation with a consultant who advises senior government officials. During this conversation I mentioned that I am spending 30-40% of my time working in the area of privacy and civil liberties. This comment got a nod. And then I proudly highlighted that I spend a lot of time speaking with key folks at many of the leading privacy advocacy groups. I got the most unexpected response – to paraphrase her, "Why? The privacy community only kills projects, they are crazy and useless. I would not go near them!"

In rebuttal I added that these are, for the most part, very rational people, people who really care about balancing security and civil liberties, people with very insightful perspectives. And, in many cases, this improved insight is something that can be translated into more responsible innovations.

She looked at me like I was from Mars. Desperate to figure out the deep disconnect, I asked her when she had last had a conversation with the folks in the privacy community. She replied, "10 or 15 years ago!" What’s wrong with this picture?!

So now when asked why I recommend maintaining a conversation with the privacy community, I find myself trying to "sell" folks on the importance of this concept by throwing in this possible motivation, "A more privacy-responsible system is a more sustainable system," or said another way, "Should your program cause enough consumer surprise, your program will be shut down with prejudice."

And while there are no perfect programs, there are certainly better programs. And listening to the privacy community leads to better programs.

For starters, here are some organizations worth listening to and engaging in dialogue with:

  • American Civil Liberties Union (ACLU)
  • Cato Institute (CATO)
  • Center for Democracy and Technology (CDT)
  • Electronic Frontier Foundation (EFF)
  • Electronic Privacy Information Center (EPIC)
  • Markle Foundation, Task Force on National Security in the Information Age

May 18, 2006

    Hunting Bad Guys, Phone Records and a Few Good Dead Men

    Make no mistake – applying link analysis against communications traffic is a highly effective way to find a few bad guys.  One of the best examples in open source happens to be how the Cali cocaine cartel used phone records in their counterintelligence operations to hunt for moles.

    As I mentioned in a recent post [1], the world’s insatiable hunger for more data and better tools escalates on the grounds of competition.  Imagine the high stakes game of “cops and robbers” that is going on between large criminal narcotics organizations and government counter-narcotics groups that target them … each side trying to maximize advantage (human capital, information, tools) to attack, defend and dominate.  At stake: billions of dollars and human life.

    A great way to dominate in such a competition is to have unanticipated insider knowledge of the adversary’s strategy and operations. One obvious way to get such information is the use of moles (e.g., paid informants).  Drug cartels, fully aware of this exceptional risk, employ sophisticated practices to ferret out and kill such informants (try not to think about chainsaws right now).

    Here is one very telling, and chilling, glimpse into this world.  In 1996 the DEA discovered that the Cali drug cartel had a “mainframe” computer with a database containing the phone records of all Cali residents.  Using link analysis to cross-reference phone calls that occurred between the cartel’s own people and American and Colombian narcotics officials (including US diplomatic, military and DEA personnel), the cartel was able to detect, capture and kill at least 12 informants. [2]

    Given a starting point, link analysis can be very effective. [3,4]  I have seen this hold true for various low-signature threats, including both asymmetric and insider threats.

    Never underestimate the ambition of the adversary to covertly acquire and develop similar (and wherever possible better) technology, information, methods and sources.  To this point, and speaking as an inventor, over the years I have actually had a number of inventive ideas that I have chosen to never reveal to anyone.  Why?  Because I have come to believe that one way to evaluate responsible innovation is by using this simple test: “Would you be willing to have your adversary use this invention against you?”

    [1] Ubiquitous Sensors?  You Have Seen Nothing Yet
    [2] Columbia Cartels Hum With High Tech
    [3] The Six Degrees of Kevin Arbitrary
    [4] Sometimes a Big Picture is Worth a 1,000 False Positives

    May 16, 2006

    The Six Degrees of Kevin Arbitrary

    Everybody is related to everybody at six degrees ... so they say.  So starting with Osama bin Laden, at some shallow depth of link analysis, you are connected.  This notion raises the question: When starting with a bad guy, what kinds of links between entities (people and organizations) are worthy of attention?

    Got links?

    Not every link is as useful as the next.  For example, when analyzing newspaper stories it is not useful to simply consider close proximity names a link.  Al Qaeda and Condoleezza Rice appear to be connected over 2,000,000 times according to Google.  Heck … Google also says I’m connected to Mohamed Atta over 100 times.  And since you are reading my blog, my apologies, now you too are connected to Mohamed Atta at just two degrees.

    Other more solid connections can be useful in some settings, not others.  If you have ever seen a compiled public records report, the type purchased by private investigators, you would quickly note that at least half the named people on your report are people you do not know.  All of the neighbors on your street.  The person that owned your car before you.  The people who lived in your house or apartment before you (college dorm rooms are a classic example).  You likely know few, if any, of these people.  And while very useful from an investigatory perspective (e.g., a parental abduction or murder investigation), such connections are less meaningful in the context of predicting the next terrorist.

    Noting these observations in the early 1990s, I chose to focus on a narrower connection … a connection I would refer to as “relationships” – specifically, the likelihood that two people know each other in a close, personal sense.  For example, when inventing software to protect casinos from “tightly held conspiracies to do evil” (see inverse post here), it became self-evident that not only must one start with a bad guy but also pursue relationships in a very narrow manner.  Employees who handle cash and who are roommates with gaming felons present some risk.  Employees would be expected to disclose such relationships.  Is this a telltale sign of criminal intent or a crime?  Not in the least!  Is this something worth a little more attention than my mom?  Well, when it comes to casinos, and their expected levels of due diligence, the answer is yes.

    What kind of data proves useful in expressing a close personal relationship?  Well this generally involves either shared resources (homes, cars, phones) or personal communications (e.g., calls, emails, care packages, money wires).  There are a few others, but I will have to let your mind wander as I would hate to tip off any evil doers.

    Even when starting with a bad guy, and following only close, personal relationships, the usefulness of the trail still degrades very quickly.  That is unless the trail leads to another previously known bad guy … then of course, those in between are certainly a bit more interesting.  Link analysis brings with it many interesting national security policy questions.  To name a couple:  How deep should a government agency be permitted to see? And if there is another evil doer at the other end, does that provide probable cause to see the entities in-between?  And it better be much less than six degrees, huh?  Otherwise, all paths lead to you, me and Mr. Arbitrary.
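The idea of starting from a known bad guy, following only close relationships, and capping the depth can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relationship data, the entity names, the function name `paths_to_known` and the depth cap of three hops are all hypothetical, and the sketch says nothing about what depth policy should actually permit.

```python
from collections import deque

def paths_to_known(graph, seed, known, max_depth):
    """Breadth-first search from `seed`, at most `max_depth` hops out.

    Returns {other_known_entity: shortest path from seed}, so that the
    entities in between can be given a closer look.
    """
    parents = {seed: None}
    depth = {seed: 0}
    queue = deque([seed])
    found = {}
    while queue:
        node = queue.popleft()
        if node != seed and node in known:
            # Rebuild the trail back to the starting point.
            path = []
            cur = node
            while cur is not None:
                path.append(cur)
                cur = parents[cur]
            found[node] = path[::-1]
        if depth[node] == max_depth:
            continue  # the trail has degraded; do not look any deeper
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parents:
                parents[neighbor] = node
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return found

# Hypothetical data: every edge is a close, personal relationship
# (shared home, shared phone, direct communication).
relationships = {
    "seed_suspect": {"roommate"},
    "roommate": {"seed_suspect", "cousin"},
    "cousin": {"roommate", "second_suspect"},
    "second_suspect": {"cousin", "neighbor"},
    "neighbor": {"second_suspect"},
}
known = {"seed_suspect", "second_suspect"}

result = paths_to_known(relationships, "seed_suspect", known, max_depth=3)
shallow = paths_to_known(relationships, "seed_suspect", known, max_depth=2)
```

With a cap of three hops the search reaches the second known entity and returns the trail through the roommate and the cousin; with a cap of two it returns nothing, and the neighbor beyond the second suspect is never part of a returned trail.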

    May 11, 2006

    Super Consultants

    Once upon a time I ran a company called Systems Research & Development (SRD) that specialized in building custom software.  We had an unusually successful track record for such a business.  Failure was not an option and in our minds every customer had to become a raving fan.  Question: What kind of a consulting team does it take to successfully design and build ground-up, one-of-a-kind software systems, year after year?  Answer: Super consultants.

    Introducing “The Super Consultant”

    Super Consultants are professionals who direct, guide, drag, push, or pull a project through to the end.  They take personal responsibility for the entire project, including external influences, despite any “obvious” proof that certain events were beyond their control.

    Everyone has heard the phrase “the customer is always right.”  But the Super Consultant lives to a higher standard.  Every glitch in the project that threatens success is the responsibility of the Super Consultant.  There are no excuses.

    When any consulting team sells a project they are saying, “We are experts in this area.  We have deep experience in project management and understand the complex dynamics that can affect such engagements.  We understand how to deliver systems on time and on budget despite obstacles.  We know how to work with organizations like yours towards achieving success.  Trust us.”

    Then when the first “big surprise” turns up, these same consultants more often than not find themselves running around explaining why they are victims of the unexpected, positioning for more time and money.  I have 100 examples, but here are two: the customer’s project manager quits mid-project, or some executive is actively sabotaging the project from within.  The Super Consultant on the other hand stands up and says “yes, we have seen this pattern before … no worries.”  Turnover?  Infighting?  Corporate restructuring mid-project?  Mergers and acquisitions?  Death?  Seen it, been there, done that.  The Super Consultant makes no excuses, and when they are caught by surprise, they say “my fault for not being prepared for this.”  They re-group and apply Herculean effort to get back on track, and then they build that new pattern into future planning so it never bites them again.  This is the life of a Super Consultant.

    I believe that the social capital (as measured by the customer’s confidence) generated by Super Consultants is much more valuable than cash.  And to no surprise, there happens to be no better way to generate more business and more cash. So, while there is no absolute in the consulting business, there is a scale – ranging from bad, to run-of-the-mill, to great, and then above that, the higher standard is … the “Super Consultant!”

    May 08, 2006

    Ubiquitous Sensors? You Have Seen Nothing Yet

    Organizations, and that includes governments, are essentially in a competition.  And in a competition one seeks to dominate.  Domination involves maximizing one’s available resources along three areas of competitive advantage:

    • A human capital advantage
    • An information advantage
    • A tools advantage

    Human capital advantage includes sharp leaders establishing superior strategy, enacting smarter policy, implementing sustainable processes, and focusing resources on relevant information and implementation of appropriate tools (e.g., technology).  Information advantage involves having the right data at the right place at the right time.  And, a tools advantage includes physical resources (e.g., wood, hammers and nails) and machines (e.g., a communication system, computers, etc.).

    While human capital is the prime mover within these three areas of competitive advantage, each of them depends on the others.  A lack of advantage in any category diminishes the advantage of the others (e.g., less information availability limits the advantage of human capital and tools).  So in practice, if one wants to empower existing human capital assets, one option is to improve the information advantage, another option is to improve the tools advantage, or better, both.

    So why are information collection and technological innovation (tools) unstoppable?  Because competing entities are keenly focused on staying ahead of their adversaries.  Therefore, without a doubt, our future holds more sensors, more information flows and more computer-guided observations.  This is inevitable.  This is the truth about the future.

    As we witness our society racing ahead with surveillance-enabling sensors and platforms (tools) in the spirit of competitive advantage, this unstoppable momentum can easily drive one into an apathetic perspective when thinking about privacy and civil liberties, freedom of motion and anonymity.  But I prefer to spend my energies thinking about what kinds of privacy-enhancements can be innovated into these next generation technologies.

    A few months back I authored a post entitled, “Responsible Innovation: Designing for Human Rights,” which introduces some thinking along this line.  I am also hopeful that technologies such as Analytics in the Anonymized Data Space and Immutable Audit Logs will further contribute in this area.

    The more innovators, designers and engineers engage the privacy community and spend time thinking about socially responsible technology, the better off this planet will be.  And while this effort will be far from perfect, I still believe it is better than doing nothing.

    May 07, 2006

    Sometimes a Big Picture is Worth a 1,000 False Positives

    I hear from time to time the idea that one giant picture containing all known links between people and/or events could provide the analyst the visual stimulation needed to discover the next big clue.  I cannot visualize this … or I should say visualize this as being useful.

    Almost immediately following September 11th the FBI began carefully disseminating a terrorist watch list across corporate America because they needed immediate and wide-scale assistance with their investigation.  This made some news as it can be hard to keep the list current and out of the wrong hands (e.g., the sensitive list ended up on a web site in South America).

    During this time, my little company (SRD) offered to help by donating our NORA (Non-Obvious Relationship Awareness) software and our time to help a few companies accurately match the FBI watch list they received against their internal databases.  Without our help, they had little chance of producing any accurate results whether by human searches or automated algorithms.  For example, how would they account for the 100+ spelling variations of Mohammed?
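To make the spelling-variation problem concrete, here is a deliberately crude Python sketch. It is not how NORA works; it simply reduces a single given name to a consonant skeleton so that common transliterations collapse to the same key. The function names `name_key` and `match_watch_list` and the sample names are hypothetical, and a key this loose would produce many false matches on real data, where far more than a name is needed to decide two records describe the same person.

```python
import re

def name_key(name):
    """Crude matching key: keep the first letter, drop later vowels,
    then collapse runs of repeated letters."""
    letters = re.sub(r"[^a-z]", "", name.lower())
    if not letters:
        return ""
    skeleton = letters[0] + re.sub(r"[aeiouy]", "", letters[1:])
    return re.sub(r"(.)\1+", r"\1", skeleton)

def match_watch_list(watch_list, customers):
    """Map each customer name to the watch list entries sharing its key."""
    by_key = {}
    for entry in watch_list:
        by_key.setdefault(name_key(entry), []).append(entry)
    return {
        customer: by_key[name_key(customer)]
        for customer in customers
        if name_key(customer) in by_key
    }

# Hypothetical single-name example
hits = match_watch_list(["Mohammed"], ["Muhammad", "Mohamed", "Michael"])
```

Here `Muhammad` and `Mohamed` both land on the `Mohammed` entry, while `Michael` does not. An exact string comparison would have missed both variants.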

    Simultaneously, the investigative journalists began publishing link charts of how the terrorists were connected to Mohammed Atta and how Atta was connected to Osama bin Laden.  Then some folks started suggesting the shapes of these networks held clues, telltale signatures now detectable by observing the unique shape of a network cluster.

    My body of experience would suggest otherwise.  As one amasses larger and larger sets of data, the shape of the network becomes less and less relevant (at least when hunting for bad guys).  Think of traveling sports teams or family reunions.  Might these networks look like Atta’s network?  In large populations of data, I believe the false alarm rate renders this “pattern-based” network analysis virtually useless.

    What matters is the entrance point into the network: for example, starting with a known bad guy or a communication from an Al Qaeda safe house.  Observing the network from such a vantage point is useful.

    So I crafted a picture (drawing only from press clippings and other public sources) about how the network actually looked when starting from such an entrance point – in this case, Nawaf al Hazmi and Khalid al Mihdhar, known terrorists believed to be in the United States at that time. [This is well documented on pages 271 and 272 of the 9/11 Commission Report.]

    I created this specific picture to demonstrate that one did not need vast oceans of medical, financial and communications data to disrupt the 9/11 attacks.  Rather, what was needed was concentrated scrutiny on a small network, a network isolated as interesting by starting with a few known bad guys.

    Shortly thereafter, my work depicting the September 11th terrorist network as seen from this vantage point appeared in various policy papers (e.g., page 28 of the Markle Foundation: Protecting America’s Freedom in the Information Age) and media accounts (e.g., Newsweek: Geek War on Terror).

    This is the back story behind the 9/11 link chart I created.  And I share this perspective every time I hear that people are spending time and money on technology to present gigantic graphs to users with the notion that somehow they will be able to navigate the chart and discover the next big clue.

    On a more subtle technical point: Even when observing a network from a specific vantage point, what data one uses to construct the network becomes critical.  As it turns out, a lot of data in this world is not helpful for this mission.  I intend to post more on this subject including some basic rules about what data is and is not useful in link analysis.

    May 05, 2006

    Thinking about Jet Lag at 32,000 feet

    When I first started traveling from Las Vegas to the East Coast several years ago, I remember the jet lag as being quite noticeable.  Eventually after bouncing back and forth across the country the jet lag became unnoticeable.  I later realized my internal clock had adjusted to somewhere in the middle of the country.  Well, my travels have become somewhat more extreme.  Take this last week for example:

    • Sunday, April 23rd: Las Vegas, Toronto
    • Monday, April 24th: Toronto, Ottawa
    • Tuesday, April 25th: Ottawa, Washington DC
    • Wednesday, April 26th: Washington DC
    • Thursday, April 27th: Washington DC, Las Vegas
    • Friday, April 28th: Las Vegas, Los Angeles, Las Vegas
    • Saturday, April 29th: Las Vegas, Los Angeles, Singapore
    • Sunday, April 30th: (lost in space due to flight to Singapore)
    • Monday, May 1st: Singapore

    And now at this exact moment I find myself on a plane en route from Singapore to London (Singapore Airlines has wireless!) contemplating which time zone would be ideal for my internal clock to be synced up to for this kind of week.  So I am thinking there is but one hope ... the time zone at Earth’s core would be perfect.  Home of the “magma” as Austin Powers would say.

    What Came First, the Query or the Data?

    For some strange reason we have all come to believe that there is data and then there are queries.  Over the last few years, I have come to conclude that this is not only odd, but also a mindset that is preventing information systems from being substantially more useful.

    When did we start thinking that queries are not data?  When a user conducts a search there is this underlying assumption that the data being looked for is both a) known and b) posted.  That is a pretty significant assumption, one that in some settings may hold with odds of no better than 50/50.  Could that mean half the analytic answers generated by systems are incorrect?

    One of the more significant blog postings I have made (at least in my mind) is about the significance of Sequence Neutrality in Information Systems.  In the context of search this means that while we traditionally expect queries to find data, in sequence neutral systems the data must have an equal ability to find the earlier query.  And the best way to deliver this at scale is to treat queries as the data itself.  And when I say “treat” I mean manipulate, process and store queries in the same way as data.

    Then … when queries are treated like data, one discovers that queries also find queries.  And this is cool because this allows a system to recognize that two users have asked the same or related questions, despite the fact there was not any underlying “data.”
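A toy Python sketch may help show what treating queries as data could look like. Everything here is an assumption made for illustration: the class name `SequenceNeutralStore`, the owner names, and the naive rule that two entries are related when they share any lowercase word. The point is only that queries and records go through the same `add` path, so a later record finds the earlier queries, and queries find other queries.

```python
from collections import defaultdict

class SequenceNeutralStore:
    """Stores queries and records the same way and matches each new
    entry against everything that arrived before it."""

    def __init__(self):
        self._index = defaultdict(list)  # word -> ids of entries using it
        self._entries = []               # (kind, owner, words)

    def add(self, kind, owner, text):
        """Persist an entry and return the (kind, owner) of every earlier
        entry that shares at least one word with it."""
        words = set(text.lower().split())
        entry_id = len(self._entries)
        related = set()
        for word in words:
            related.update(self._index[word])
        for word in words:
            self._index[word].append(entry_id)
        self._entries.append((kind, owner, words))
        return [self._entries[i][:2] for i in sorted(related)]

store = SequenceNeutralStore()
first = store.add("query", "analyst_a", "john doe")    # nothing known yet
second = store.add("query", "analyst_b", "Doe John")   # query finds query
third = store.add("record", "hotel_feed", "John Doe checked in")  # data finds queries
```

The first query returns nothing because nothing has been posted. The second query discovers that another analyst asked a related question, even though there is still no underlying record. When the record finally arrives, it finds both earlier queries, which is the moment both analysts would want to be told.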

    There are already systems where data and queries are working together to give users significantly better intelligence.  Two examples that come to mind are Google and Amazon.  Google notices what people have searched for (and selected) in the past to better order their search results for you.  And Amazon makes tailored suggestions for you using the search (and purchase) interests of others.

    And while Google and Amazon lack the property of Sequence Neutrality … no need to worry, as there are probably not too many users of these services where lives or millions of dollars are at stake.  However, in mission critical systems where analytics make or break the enterprise, one would want to know if yesterday’s answer is now believed to be entirely wrong.  And you would want to know right now!

    I think that the next generation of business intelligence systems (e.g., Perpetual Analytics) is going to build on this notion that the queries are the data.  Thus the question “what came first, the data or the query” will be moot.