January 31, 2006

Yesterday’s Technology Review Story: Blinding Big Brother, Sort of

http://www.technologyreview.com/InfoTech-Software/wtr_16209,300,p1.html?PM=GO

While this Q&A piece is mostly focused on the work I have done in the area of advanced analytics inside the anonymized data space, it makes brief mention of Immutable Audit Logs (IALs).  While audit logs are nothing new, “immutability” is a more recent concept: the idea that an audit log could be tamper-resistant (i.e., not easy for anyone to alter or to erase history).  And while I believe that IALs will become best practice, I started thinking about the possible negative secondary consequences of such robust logs.  Imagine a log that contains all user activity and all data flows, with this superset of information retained for a potentially very long time.  What would prevent audit logs from themselves becoming a source of misuse, unintended disclosures or unanticipated repurposing?  This is an area I care a lot about, so stay tuned.
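
To make "tamper resistant" a little more concrete, here is a minimal sketch of one common way to get there: chaining each log entry to a hash of the entry before it. This is an illustration only, not a description of any particular IAL product; the function names and the entry layout are assumptions made up for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the very first entry


def _digest(prev_hash, event):
    # Hash the previous entry's hash together with this event's content.
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log):
    """Return True only if every entry still chains to the one before it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting an entry in the middle breaks the chain and verify() reports it. Chopping entries off the end is not caught by this sketch; that would need the latest hash to be recorded somewhere outside the log.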

January 28, 2006

What sharks? Reflections on the 2005 Western Australia Ironman

While my job is my #1 hobby, I also find it relaxing to suffer through endurance events. Take the 2005 Western Australia Ironman, for example. If you have not heard of such events, Ironman races are triathlons – long ones – consisting of a 2.4 mile swim followed by a 112 mile bike ride and wrapped up with a 26.2 mile marathon run. The pros do it in just over 8 hours and if you take longer than 16 hours they pluck you off the race course.

Generally I worry about just finishing these events, the risk being I never have the time to train like the rest of the real athletes. But with this event came an additional point of tension. Sharks.

Sharks didn’t even really enter my mind until I chatted with a real Ironman athlete who was sitting next to me on my flight to Australia. He said they use sonic buoys in the water on the swim course, as they are proven to drive the sharks away; he added that the Western Australia Ironman is the only triathlon in the world that uses such devices. This came about after a shark attack up the coast a few weeks before the previous year’s race. Comforting.

On Saturday, November 26th, the day before the race, I was required to attend a mandatory race briefing. The swim director basically announced the following: "About the 'S' word: we are aware of the recent report of the attack up the coast. To address this concern we are going to have a plane circling to look for the 'S' word. If the safety team in kayaks on the race course begin to blow their whistles continuously, this means the race has been canceled for some reason. In which case, you are to swim to the wooden Busselton jetty and fasten yourselves onto the wooden cross beams. Do not, I repeat, do not attempt to swim back to the shore. You must wait for a motorized vehicle to pluck you from the jetty." The briefing continued: "The good news is there are more deaths from lightning than the 'S' word. The bad news is that a thunder and lightning storm is in the weather forecast for tonight and tomorrow morning." Comforting.

It was hard to sleep the night before the race and the thunder and lightning just added to the suspense.

The next morning the race was postponed by 30 minutes due to lingering lightning in the area. There were no sonic buoys that I know of and due to the weather conditions it appeared there were no "S" word spotting planes. My last coherent thought before swimming straight out to sea for over a mile then back was that I might be thrashing in the water like an injured seal more so than the rest of the athletes because the last time I had been swimming was on July 17th when I competed in the 2005 Zurich Ironman. Comforting.

Needless to say I lived. I also achieved my most fundamental goal … not to be last and to beat at least one chick. My time was 12 hours and 55 minutes and as usual there was nothing comfortable about it. I think I will do three Ironman races this year.

January 26, 2006

Sequence Neutrality in Information Systems

When I ask investigators or analysts what technology improvements they would most appreciate, invariably one of their top requests is “to get answers to their questions faster.”  This has always struck me as funny.  What if the question being asked today is not a smart question until next Thursday?  How can we expect analysts to ask every smart question every day?  In short, this is kind of like climbing a tree to get to the moon.  You can always inch further up, but how is that really going to get you where you need to go?

Systems that produce different answers based on the order of events lack a property I refer to as “Sequence Neutrality”.  Sequence neutrality means regardless of the order in which data or queries occur, the end-state, once all data points are known, is the same.  Sequence neutrality prevents systems from having to ask every smart question every day. 

Here’s an example.  Today when a bank searches for “Billy the Kid” the answer will depend on whether such a record existed first.  However, with sequence neutrality the moment “Billy the Kid” opens a bank account, regardless of when that occurs, the user making the original query can be notified.  Furthermore, months later if “Billy the Kid” is added to the OFAC list (people and organizations that financial institutions are banned from doing business with), the bank is instantly alerted.
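
A rough sketch of the idea, under the assumption that queries are simply stored alongside the data so that whichever arrives second triggers the match. The class, the event tuples and the "teller" and "accounts" names are all hypothetical and only there to illustrate the property; real matching would be far fuzzier than a normalized name comparison.

```python
from itertools import permutations


class SequenceNeutralStore:
    """Keeps both records and queries; whichever arrives second finds the first."""

    def __init__(self):
        self.records = {}    # normalized name -> set of sources holding a record
        self.queries = {}    # normalized name -> set of users who asked
        self.alerts = set()  # (user, name, source) notifications raised so far

    @staticmethod
    def _key(name):
        # Crude normalization: lowercase and collapse whitespace.
        return " ".join(name.lower().split())

    def add_record(self, source, name):
        key = self._key(name)
        self.records.setdefault(key, set()).add(source)
        # New data is checked against every question already asked.
        for user in self.queries.get(key, ()):
            self.alerts.add((user, key, source))

    def add_query(self, user, name):
        key = self._key(name)
        self.queries.setdefault(key, set()).add(user)
        # A new question is checked against every record already seen.
        for source in self.records.get(key, ()):
            self.alerts.add((user, key, source))


def replay(events):
    """Feed events in the given order and return the resulting set of alerts."""
    store = SequenceNeutralStore()
    for kind, who, name in events:
        if kind == "record":
            store.add_record(who, name)
        else:
            store.add_query(who, name)
    return store.alerts


events = [
    ("query", "teller", "Billy the Kid"),
    ("record", "accounts", "Billy the Kid"),
    ("record", "OFAC", "billy the kid"),
]
# Replay the same three events in all six possible orders.
end_states = {frozenset(replay(order)) for order in permutations(events)}
```

If the store is sequence neutral, end_states holds exactly one distinct result: the teller ends up knowing about both the account and the OFAC entry no matter which of the three events came first.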

As another example, government entities perform background checks on individuals seeking “top secret” clearances.  What happens if one of the systems used to favorably qualify a person thereafter receives a record suggesting the applicant should receive additional scrutiny?  Say a record shows up in a registered (and public) sex offender database shortly after the person is granted a clearance.  How will they learn of this new data point?  One option would be for the government to ask every question every day, which obviously is impractical.  So to address this scenario, the US Government performs background checks every five years.  But that means that a glaring problem in the data may not be discovered until the question is asked again, potentially years later.  In a system designed for sequence neutrality, the moment a relevant record comes into existence, it is published (pushed) to the relevant system or user.

When sequence neutrality is applied to information systems a very interesting effect is created: the “data finds the data.” What this means is that as each new piece of data is observed by the system, how this data relates to all previously observed data points is considered – without waiting for a user to ask a question.  And while this can benefit a single system it is even more powerful when applied across heterogeneous systems.  Suddenly, very interesting insight is possible.

How does a company recognize that its accounts payable manager shares the same phone number as its largest vendor (a relationship that can violate company policy if undisclosed)? 
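
Here is a minimal sketch of how the accounts payable question above could answer itself as data arrives, assuming an index keyed on a normalized phone number. The class name, the source labels and the sample records are invented for illustration, and the phone numbers are fictional.

```python
from collections import defaultdict


def normalize_phone(raw):
    """Keep digits only so '(702) 555-0100' and '702.555.0100' compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


class PhoneIndex:
    def __init__(self):
        self._by_phone = defaultdict(list)  # phone -> [(source, name), ...]

    def observe(self, source, name, phone):
        """Store the record; return earlier records from other sources sharing its phone."""
        key = normalize_phone(phone)
        matches = [(s, n) for s, n in self._by_phone[key] if s != source]
        self._by_phone[key].append((source, name))
        return matches


index = PhoneIndex()
# No one has asked a question; each new record is simply compared to what is already known.
first = index.observe("employees", "A. Payable Manager", "(702) 555-0100")
second = index.observe("vendors", "Largest Vendor Inc.", "702.555.0100")
```

The first call finds nothing. The second returns the employee record, because the vendor's number matches it once both are reduced to digits. Loading the vendor file before the employee file would surface the same pair, just on the other call.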

When the “data finds the data”, such insight and awareness is not only possible, it is fundamental and essential to creating market-differentiating services. Whether an organization is focused on managing customer relationships, credentialing parties, evaluating credit risk or handling investigations, with sequence neutrality built in, uniquely powerful possibilities emerge.

January 23, 2006

Today’s FCW story about my anonymization work

Federal Computer Weekly ran a story today entitled "Flipside: A few minutes with Jeff Jonas"

http://www.fcw.com/article92036-01-23-06-Print

This is a fairly accurate piece, although it is not true that "a lot of government customers … use it with their existing systems."  They should, though: it is my feeling that if governments (or any organization for that matter) are going to exchange sensitive information about people, they ought to share the information in an anonymized form whenever possible.

January 21, 2006

Introducing the concept of network-centric warfare and “post before processing”

The Department of Defense (DOD) has been reengineering the way it integrates its war fighters, military platforms and command and control systems.  One of the big elements of this effort is known as “network-centric warfare”.  Network-centric warfare basically means that the nodes of the network can find each other and exchange information in a manner that does not require a brittle (non-adaptive), high-latency (slow), hierarchical communication channel.

Why?

During the first Gulf War one element of our military would sense the launch of a SCUD missile and based on its trajectory compute its geo-location.  Using hierarchical command and control systems and the ATO (Air Tasking Order) processes in place, 24-48 hours later the shooter elements of our military would be tasked to respond. Unfortunately, this did not matter because the Iraqi Army understood this process all too well, so they moved their launch vehicles in less than two hours after each launch.

Network-centric warfare changed the game.  During the second Gulf War sensor-to-shooter communications were non-hierarchical.  These two elements could effectively exchange information in a more peer-to-peer fashion.  The time lag between sensing and responding was now so diminished that after any SCUD launch our military was able to retaliate against the launch vehicle in minutes, long before the vehicle was even able to retract its stabilization legs.  Net effect: no SCUD launch vehicle ever launched twice.

Another important transformation made by the DOD Command, Control, Communications and Intelligence (C3I) division was the notion of “posting before processing”.  Prior to this radical change in thinking, information was collected and routed to analysts for analysis and annotation before being posted and made available to the war fighter.  Relying on “processing before posting” denied war fighters timely content.  Imagine a Delta Force request for overhead imagery of the valley they will be encountering over the next ridge line.  In the old model, where processing occurred before posting, the imagery would often arrive too late (i.e., the target had moved on).  In the new model, where posting occurs first, the Delta Force team can see the images as soon as they become available, draw their own conclusions about them and rely on analyst annotation when necessary.  In the network-centric environment, enterprise sensor content is made available to the edges (e.g., the war fighter on the front lines), and immediately where possible.

Why this post first?  I plan on blogging in a “post before processing” model.  This means that I am going to post before perfecting my post. As such, I reserve the right to tinker with previous posts when seeking to improve the quality of my workmanship.

January 20, 2006

Entering the weblog onramp

Today I find myself ready to begin blogging.  How this affects my future will soon be clearer.