

July 30, 2007

The World is Not a More Dangerous Place

Back in the days when I had my company, Systems Research & Development (SRD), I prevented anyone from pitching my software using "the world is a more dangerous place" as the setup pitch.

Two reasons: (A) I think it is safer to be alive now than ever before and (B) I hate the idea of using the "fear card" to sell.

Before you call me crazy, consider the following: In the 1300’s the Black Death killed an estimated 75 million people – including a third to two thirds of Europe’s population. The 1918 Spanish Flu killed 50 – 100 million in just 18 months, making it by far the most destructive pandemic on record.

The average life span at the end of the nineteenth century in Western Europe was thirty-seven. Today the average lifespan in the world is sixty-seven! [Ref: Life Expectancy]

In short, you are more likely to grow older today than any time in the history of man.

Here is another point of reference: Even if America sank into the ocean, the 300 million deaths would be ~4.5% of the world’s current population (~6.7B). The 75 million lives lost to the Black Death amounted to ~17.4% of the world’s population at that time (~432MM). Thus, if you were standing in America and discovered it was going to suddenly fall off into the ocean in the next few minutes, although this makes for a very bad day for you personally, overall the world would still be a less dangerous place as compared to the mid-1300’s.
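For anyone who wants to check the arithmetic, here is a minimal sketch. The four inputs are simply the estimates quoted above; the variable names are mine and purely illustrative.

```python
# Estimates quoted in the post (not independently sourced here).
us_population = 300e6            # hypothetical loss: everyone in America
world_population_now = 6.7e9     # world population at the time of writing
black_death_deaths = 75e6        # estimated Black Death toll
world_population_1300s = 432e6   # estimated world population in the mid-1300's

# Share of the world's population lost in each scenario.
us_share = us_population / world_population_now
black_death_share = black_death_deaths / world_population_1300s

print(f"America sinking today: {us_share:.1%} of the world")
print(f"Black Death:           {black_death_share:.1%} of the world")
```

Run as written, this reproduces the ~4.5% and ~17.4% figures, so the Black Death took a bit under four times the share of humanity that the hypothetical loss of America would.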

Nukes complicate this equation. The two primary nuke scenarios are: a) onesie-twosie nuclear detonations carried out by stateless criminals; and b) a full-scale global nuclear war causing the annihilation of mankind.

While periodic unscheduled 10-kiloton nuclear detonations would be very, very bad, until such events exceed a few a year (or they go thermonuclear), in the grand scheme of things we Earthlings are still safer than in the 1300’s. (True, if all of these events happen in a single geography, then while the world at large would still not be a more dangerous place, that specific geography certainly would be!)

The scenario involving a full-scale nuclear exchange of large numbers of thermonuclear weapons deserves special attention. True, the risk of global nuclear annihilation was absolute zero before the 1900’s and today this risk is no longer zero. But, this risk ebbs and flows. One way to consider how this risk changes over time is the Doomsday Clock. Remember that? The idea being, the closer this clock is to midnight, the greater the risk of global annihilation. Its keepers calculated the time period 1953-60 as the closest the world has yet come to a doomsday event (2 minutes till midnight). [Note: the Doomsday clock was not adjusted in 1962 during the Cuban Missile Crisis as this incident came and went faster than the group reconvened and reset the clock.] Then from 1991 to 1995 the Doomsday Clock was rolled back to 17 minutes until midnight suggesting times were the safest since the inception of this clock in 1947. Notably, the clock shows that since 1995 the safety of the world has been declining. Nonetheless, at this point in time even when considering nuclear Armageddon, the world is less dangerous today than 1953-1960.

So in all fairness, when considering whether the world is a more dangerous place one would also have to ask "as compared to when?" and "as compared to where?" For example, if you called Chernobyl home on April 27th, 1986, you were definitely in a more dangerous place.

And one more thing … when the world seems like an incredibly dangerous place ... you can probably thank some of the media for that. The media’s ability to take every bad thing that happens on the planet and package it up for maximum sensation plays a huge role in spreading fear. It’s not their fault, of course; it is yours (and mine), as sensational news is what draws us into the media. And as our attention gives them higher ratings, they justifiably work even harder at finding, packaging and delivering up even more of this bad news for us. [So, I propose we fix this by only directing our attention to "good news" stories from now on, ok? Wait, is that a smoke plume on CNN … get out of my way … I gotta see this!]

Honestly, if you could pick another time to live, would you really trade living in this age for an earlier century? I wouldn’t. Oh, and I wouldn’t want to trade it for 100 years in the future either – I think the future has a chance of being really messy.

These could be the golden years!

PS: Before you get too excited one way or the other about this post, take this into account: This is my Yin post. Stay tuned for my forthcoming Yang post which will be entitled something like "More Death in Future Cheaper."

RELATED POSTS:

The Only Way to Actually Win the (Long) War on Terror

Web 2.0 – Al Qaeda’s Most Effective Force Multiplier

July 26, 2007

On Public Speaking

I do a fair bit of public speaking. In 2006 I spoke to approximately 7,000 people over the course of the year and this year I am on track for something like 15,000.

And while I get pretty good feedback, make no mistake about it … I hate public speaking.

This generally comes as a surprise to those who have seen me make a presentation.

I have come a long way. Believe it or not, I used to be unable to speak to more than three people at a time. Back in my SRD days, when my staff grew to three employees, I stopped having staff meetings. Then one day in the early 90’s I attempted to present a time and attendance system to the CIO of the MGM Grand Hotel and Casino and his staff (6-8 people in total) … I was dysfunctional to say the least. I stared at a grease board, my back to the room, shaking, sweating and mumbling senselessly.

Be afraid ... be very afraid!

After this horrifying incident, I realized that an inability to effectively communicate my ideas was going to severely blunt my lifetime potential. So, I asked my friend Doug Pool how he became such an accomplished presenter. His answer: Toastmasters. This amazing organization took me step-by-step from no ability to more ability … one super-scary step at a time. And guess what? It worked. Mind you, public speaking is still nerve-wracking … the difference now being … I know how to do it.

And now that I am doing a half-decent job at this whole speaking thing, here are a few tips (should anyone care):

PowerPoint is a Sedative

The more your presentation deck resembles everyone else’s deck, the more you lose. Most likely your audience has already been punished with grueling PowerPoint charts. You know what I am talking about … those information-overloaded charts with a mish-mash of tiny fonts, over-animated, busy architecture and plumbing diagrams and loads of words which often state the obvious. The only thing worse is when the presenter then reads the words off the chart. No one is interested in this. So … if you are going to use PowerPoint, I recommend you spend some time developing a style and deck that is all you. By way of example, when I spoke at O’Reilly’s Third Annual Web 2.0 Summit, I buzzed through 41 charts in less than 10 minutes. It was almost all pictures (hand drawn by me in the PowerPoint scribble mode). I dreamt up this style about a year ago and think it is akin to a slow-motion movie synched up with a speed reader! [The story line here.] [The actual PowerPoint deck here.]

Crank Up the Signal!

When presenting … you had better say something every few minutes that strikes most of the audience as either "huh?" or "wow!" Otherwise, your audience may only be hearing "blah blah blah." In my attempt to do this, I might say something like, "The faster you collect data, the dumber you are likely to be." Without constant and meaningful signals, they will wish they were somewhere else, and then in self-defense direct all their attention to their BlackBerries. Creating signal can also involve doing something that will make it hard for them to ever forget you – for example, smash your guitar. Oh wait, that is a heavy metal band tip.

Don’t Punish Them On Your Watch

Never take advantage of the fact your audience is captive. Don’t waste their time by telling them anything obvious or widely known. And, if you discover you are boring them – the remedy is to jump to material that has a better chance of resonating with them. When in smaller settings, I preempt any fear by starting some presentations by saying, "If I start talking about something that you already know – stop me immediately" and "if this material is not interesting to you in the first five minutes, I’ll leave and give you some time back on your calendar." You would not believe the relief this creates. Furthermore, never ever overspeak your time. It is not fair to your audience or, for that matter, the next speaker. One caveat: throwing the ball to the next speaker 20 minutes ahead of schedule (catching them off-guard) is not nice either … I did this once and felt real bad.

Make it Easily Digestible

If you find you are frequently losing people when you present, spend more time making your material more consumable. For starters, don’t use any words or acronyms your audience may not know. Don’t use any words or acronyms that may mean very different things to different people (e.g., data mining is such an overloaded term I often avoid using it). When I break this rule, e.g., when I use the word Context, I make a huge effort to explain what I mean. Another approach is to create your own terms and then explain them well (e.g., my use of terms like perpetual analytics, sequence neutrality, etc.). As a general principle, the deeper the think, the simpler and crisper the concepts must be when presented. Don’t be afraid of bloating your presentation with pictures: pictures trump text 1000:1. Duh. (Word of caution: not all graphs qualify as helpful pictures!)

Miscellaneous Tips:

1. The bigger the venue, the more important it is to rehearse both lighting and sound. Have them demonstrate show time lighting because: (a) it is nice to know beforehand if you are going to be blind up there and (b) it is wise to know how clearly your materials will project (if you have any). Do a full sound check during rehearsal to see if you are going to be in an echo chamber (something I discovered by accident twice this year, in both cases at show time, with great horror – the echoes were so distracting I could hardly think).

2. When you hear little voices in your head like "Run" or "Am I stuck in a thoughtless loop yet? How about now? Now?" … don’t debate these evil demons. Just move on.

3. Never call out (by name or otherwise) a competitive product or company. Never stoop that low.

4. The number one way to calibrate how effectively you present: it is inversely proportional to the number of people with glazed eyes, nodding off, and/or escaping.

I have a long way to go. For example, I still <quasi-expletive> at delivering a succinct and meaningful closing, I talk too fast, and I often wander off on a lot of unnecessary tangents. Gotta have goals! On this front I have this friend named Dick Hardt of Sxip Identity. He presents hundreds of charts in 10 minutes. His style is so unique and world-class that his video has been downloaded hundreds of thousands of times.

Check out this inspiring video: Identity 2.0 Keynote by Dick Hardt

Anyway, while I may never come to actually enjoy public speaking, without a doubt Toastmasters has made an enormous difference in my ability to express myself.

July 10, 2007

How to Use a "Glue Gun" to Catch a Liar

"People lie. How are you going to account for that?"

This question used to make me crazy. I always wanted to blurt out, "And the sun is going to consume the earth someday – deal with it!"

I never said this, of course.

Anyway, I have a more thoughtful response these days.

Try this on for size. Yep. People are going to falsify information. In fact, you may have experienced this in your life. Let’s say you had a friend – or so you thought. Over time you discovered that this person was in fact dishonest. How did you discover this? The answer is simple: you collected more observations over time.

Observations add up.

I have seen this play out in real data. For example, there was this very big database (billions of table rows describing hundreds of millions of unique people). In this particular database there was this one fellow who was repeatedly lying about his identity. He did a good job; in fact, such a good job that, despite Semantic Reconciliation processing, he appeared to be six different people.

The guy was a liar and no one knew ... that is until future observations (created by his own actions) flushed him out.

[Skip this next paragraph, if you are speed reading or want to stay out of the weeds.]

Here is how this happened. Imagine six apparently discrete identities. Some name similarity, but that never matters at this scale. Then one day this fellow decides to use one of these identities (using previously reported features e.g., same name, phone, SSN, date of birth, etc.), except this time he introduces a new address, one that had never been previously associated with this identity. So this new record is identity resolved to the existing identity (the identity he wanted to present). This caused context accumulation – in this case the new address enhanced what was known about the person he was being today. Sequence Neutrality processing then fires up to make sure earlier identity resolution events are still valid. During this process another identity was located that shared the new address (the one just learned) and other matching features (e.g., similar names and more). The identity he was trying to be had now become conjoined to one of his other identities – one he was trying to distance himself from. [Technical note: I am specifically using the term conjoined as opposed to merged. Think of conjoined like being rubber-banded together versus merged where two records become one. This is essential for many reasons e.g., retaining the ability to change one’s mind later. More about this in a future post.]

When two identities collapse into one identity – this new conjoined identity now has more context. As something new had just been learned, sequence neutral processing immediately determines if there are any further assertions of the past to fix (e.g., more identities that can be conjoined, or in some cases, disjoined).

Long and short, his six discrete identities collapsed into one … thanks to the arrival of two new records.
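For the technically inclined, the "glue" effect can be sketched in a few lines. To be clear, this is not the actual engine: the class name, the feature names and the matching rule (two or more identical feature values) are all assumptions made up for illustration. The sketch uses a union-find structure so that records stay separate but rubber-banded together (conjoined), rather than being merged into one.

```python
from collections import defaultdict


class ConjoinedIdentities:
    """Hypothetical sketch: records stay intact but are rubber-banded together."""

    def __init__(self):
        self.parent = {}   # union-find parent pointer per record id
        self.records = {}  # record id -> feature dict (never overwritten)

    def _find(self, rid):
        # Walk to the group's representative, shortening the path as we go.
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]
            rid = self.parent[rid]
        return rid

    @staticmethod
    def _matches(a, b):
        # Assumed rule: at least two identical feature values.
        shared = sum(1 for key in a if key in b and a[key] == b[key])
        return shared >= 2

    def add(self, rid, features):
        """Store a new record, then conjoin it with every record it matches."""
        self.parent[rid] = rid
        self.records[rid] = features
        for other, other_features in self.records.items():
            if other != rid and self._matches(features, other_features):
                self.parent[self._find(rid)] = self._find(other)

    def identities(self):
        """Return the current groups of conjoined record ids."""
        groups = defaultdict(set)
        for rid in self.records:
            groups[self._find(rid)].add(rid)
        return sorted(sorted(group) for group in groups.values())


engine = ConjoinedIdentities()
engine.add("r1", {"name": "J. Smith", "ssn": "111", "phone": "555-0100",
                  "dob": "1970-01-01"})
engine.add("r2", {"name": "John Smyth", "ssn": "222", "address": "9 Elm St",
                  "dob": "1970-01-01"})
before = engine.identities()  # r1 and r2 share only one feature: two identities

# The "glue" record: r1's known features plus an address seen only on r2.
engine.add("r3", {"name": "J. Smith", "ssn": "111", "phone": "555-0100",
                  "dob": "1970-01-01", "address": "9 Elm St"})
after = engine.identities()   # r3 matches both, so all three are conjoined

print(before)
print(after)
```

Because the original records are never overwritten, a later observation could in principle justify pulling them apart again, which is the point of conjoining versus merging. This toy version does not implement that disjoin step.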

Knowing this, one thinks about what data sources are better than others. Some data sources are so good … they work like "glue guns."

From a national security and privacy point of view, it is the above behavior that makes it so important to debate what perceptions (observations) are fair game for context construction, and when.

RELATED POSTS:

More Data is Better, Proceed with Caution

Ubiquitous Sensors? You Have Seen Nothing yet

Accumulate Context: Now or Never

To Know Semantic Reconciliation is to Love Semantic Reconciliation

July 01, 2007

Context: A Must-Have and Thoughts on Getting Some …

I spent more than ten hours on this post; more than any other single post. And unfortunately, despite this effort, I feel this post deserves substantially more work.

Operating on a datum without first placing it into context is a risky proposition. Whether interested in mitigating risk or maximizing opportunity, no surprise, Context is King. And thus, from my point of view, determining context is the most significant technical hurdle necessary to deliver the next generation of business intelligence.

So, if you must have context the next question is: "How do you get some?"

The construction of context primarily depends upon: A) the features available in an observation, B) the ability to extract the essential features from the observation, and C) the ability to use the extracted features to determine how the new observation relates to one’s historical observations.

Features, features, features. Without ample features, establishing context is hopeless. Take for example these two observations:

Observation #1: There were fewer fish.            

Observation #2: March ‘07 was warmer than usual.

BTW, there would be even less context if the second observation had been recorded in untranslatable Mayan symbols! But, had observation #1 stated "There were fewer fish in March, 2007," we would have some temporal proximity, and, if both observations also included the phrase "in the San Francisco Bay," we would also have geospatial proximity. As more features overlap across more observations, more context emerges.
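A minimal sketch of this idea, assuming each observation has already been reduced to a small dictionary of features (the keys "event", "when" and "where" are my own illustrative choices, not a real schema):

```python
def shared_features(obs_a, obs_b):
    """Return the feature/value pairs that two observations have in common."""
    return {key: value for key, value in obs_a.items() if obs_b.get(key) == value}


# The two observations as originally stated: nothing overlaps.
sparse_1 = {"event": "fewer fish"}
sparse_2 = {"event": "warmer than usual", "when": "2007-03"}

# The same observations enriched with a date and a place.
rich_1 = {"event": "fewer fish", "when": "2007-03", "where": "San Francisco Bay"}
rich_2 = {"event": "warmer than usual", "when": "2007-03", "where": "San Francisco Bay"}

print(shared_features(sparse_1, sparse_2))  # no overlap, no context
print(shared_features(rich_1, rich_2))      # temporal and geospatial proximity
```

The sparse pair shares nothing, while the enriched pair shares both "when" and "where". That overlap is the raw material for context. Real feature matching would of course need fuzzier comparisons than exact equality.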

Want context? Step 1: Get features.

Features matter. But not all features matter. For example, in observation #1 above 6/7ths of all vowels are the letter "e." The essential features needed to construct context are generally: A) those features that will enable Semantic Reconciliation (i.e., recognizing like objects e.g., same document, same person, same thing, etc.), and B) those features that enable an understanding of relationships between objects (e.g., like documents, former roommates, occurring in the same place and at the same time, etc.).

Some observations include features that make semantic reconciliation a breeze (e.g., an RFID in a passport) but more often than not there is ambiguity. The same goes with recognizing relationships between observations – some observations present an explicit relationship (e.g., traveling partners) but more often than not relationships must be inferred (e.g., two people always entering the same building together).

Because context construction is dependent on features, "key feature extraction" is where "the rubber meets the road."

Big breakthroughs in context accumulating systems are going to first require big breakthroughs in feature extraction.

DEEPER TECHNICAL THINK:

1. Temporal and geospatial (when and where) are possibly the two most helpful features needed to establish context. And while useful for establishing historical context, temporal and geospatial features provide essential context when determining what, if any, action is warranted now. [See: Responsible Innovation: Designing for Human Rights and Source Attribution, Don’t Leave Home Without It].

2. Context engines, at least in the Perpetual Analytics class I have been pounding my head on, cannot scale if every observation is simply treated with probability. I have concluded that high-speed, real-time contextualization (at least on today’s technology) requires that when assimilating an observation – some assertions must be made. In short, if confidence is very high … assert it as true! Unfortunately, future observations may invalidate earlier assertions. Thus, context engines must constantly be on the lookout for new observations that change earlier assertions – and if a new observation provides such evidence – the invalidated assertions from the past must be remedied. This is Sequence Neutrality, and it is absolutely critical to context engines. Notably, this is very hard to do on real-time data feeds at scale.

3. When I have been referring to Persistent Context in my blog, I mean the physical information space (database) where all historical observations are assembled in context. And one cannot bulk load such a database with a "rack and stack" mentality and expect to get persistent context. To get persistent context, one must learn the past. That means taking the historical observations and streaming them into the engine. The engine then assembles how each observation relates to the others. Pop quiz: Do you think … the order one loads historical data matters? Answer: If it did matter, you are hosed. This is exactly the reason one must have Sequence Neutrality in such systems. Therefore, the reason these systems have to be so screaming fast is not to keep up with the present … rather, to learn the past. Hence, my excitement about our recent performance breakthroughs.

4. Other extractable features, like Source Attribution, while often not essential to constructing momentary context (rendering a decision now) are nonetheless absolutely required to achieve perpetual context (e.g., think about an ability to correct or forget a misreported fact). [Related posts: Data Tethering]

5. I have come to the conclusion that the process of extracting features from observations is greatly improved when past experience (historical learnings, persistent context, or whatever you want to call it) is taken into account. In fact, it is my speculation that feature extractors will request substantially more bytes from the persistent context data store than the number of bytes the feature extractor will report down to the context engine. I further speculate that when contextualizing such highly refined feature sets, context accuracy and throughput will both improve. Leveraging accumulated context during feature extraction will significantly improve such things as entity extraction from unstructured documents and object recognition from videos.

7. Self-learning systems will promote new features of interest to feature extractors (akin to keeping an eye out for something particular). The inverse is true as well. Self-learning systems will demote (eliminate) interest in specific features. I think of this as intentional sensory deprivation. We all do this too – for example, right at this moment you (the reader) are blocking out that "background hum" … just stop for a second and listen. Right?

8. Now call me crazy. Have you ever had someone speaking to you – while you are sitting there thinking – "I know they are speaking English" – but you just could not decode it? Then, like a miracle, you replay it – the entire statement, word for word in your head. And presto, it is all clear now. (Would someone please admit this happens to you, too?) Using this caching mechanism for a replay – we throw some additional attention (more CPU) at this observation, error correction improves, and we get a useful decoding of key features. Cool! Notably, context engines will benefit from this too.

9. Expect convergence in the likely places. Unstructured with structured. Biographic and demographic with biometric. Audio and video with text. Efforts to make greater sense of available observations will entail cross-sensor fusion. Just the way we work.
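To make the pop quiz in item 3 concrete, here is a toy sketch contrasting a one-pass loader with one that revisits earlier assertions whenever something new is learned (item 2). Everything here is an assumption for illustration: the feature strings, the "two shared features" threshold and the function names are mine, and the sketch only handles conjoining, not disjoining.

```python
from itertools import permutations

MIN_SHARED = 2  # hypothetical threshold for asserting "same entity"


def overlaps(a, b):
    """True when two feature sets share enough features to assert a match."""
    return len(a & b) >= MIN_SHARED


def freeze(entities):
    """Order-independent, hashable view of a list of feature sets."""
    return tuple(sorted(tuple(sorted(entity)) for entity in entities))


def greedy_load(records):
    """One pass; earlier assertions are never revisited (order-dependent)."""
    entities = []
    for features in records:
        for entity in entities:
            if overlaps(entity, features):
                entity |= features  # accumulate context on the first match
                break
        else:
            entities.append(set(features))
    return freeze(entities)


def sequence_neutral_load(records):
    """After each observation, re-check all assertions until nothing changes."""
    entities = []
    for features in records:
        entities.append(set(features))
        changed = True
        while changed:
            changed = False
            for i in range(len(entities)):
                for j in range(i + 1, len(entities)):
                    if overlaps(entities[i], entities[j]):
                        absorbed = entities.pop(j)
                        entities[i] |= absorbed
                        changed = True
                        break
                if changed:
                    break
    return freeze(entities)


records = [
    frozenset({"name:bob", "phone:555"}),
    frozenset({"name:bob", "phone:555", "addr:main"}),
    frozenset({"name:bob", "addr:main"}),
]

# Load the same three observations in every possible order.
greedy_results = {greedy_load(order) for order in permutations(records)}
neutral_results = {sequence_neutral_load(order) for order in permutations(records)}

print(len(greedy_results), "distinct outcomes without re-checking")
print(len(neutral_results), "distinct outcome(s) with re-checking")
```

The one-pass loader ends up with different answers depending on load order, while the re-checking loader always lands on the same single entity. The price is the extra re-evaluation work, which is exactly why speed matters so much when learning the past.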

PREDICTION:

When next generation feature extraction engines and next generation context accumulating engines converge, these systems are going to be the underpinnings of very, very smart systems. Add real-time and relevance detection … and you have more than situational awareness … you begin to approach the cognitive domain.

PRIVACY RAMIFICATIONS:

All this adds up to a double-edged sword from a privacy perspective.

The good news is: More context means fewer false positives and fewer false negatives – this is especially good news as this relates to government watch lists. [More about this here.]

The bad news is: If more data makes for more context, and everyone wants more context to ensure they are making the best possible decisions … everyone is going to want more data! [More about this here.]

RELATED POSTS:

Enterprise Intelligence – My Presentation at the third Annual Web 2.0 Summit

Enterprise Intelligence: Conference Proceedings from TTI/Vanguard (December 2006)

Intelligent Organizations – Assembling Context and The Proof is in the Chimp!

Sensing Importance: Now or Never

Accumulating Context: Now or Never

Federated Discovery vs. Persistent Context – Enterprise Intelligence Requires the Later

More Data is Better, Proceed With Caution

It Turns Out Both Bad Data and a Teaspoon of Dirt May Be Good For You

Streaming Analytics vs. Perpetual Analytics (Advantages of Windowless Thinking)

It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness

Scalability and Sustainability in Large Information Sharing Systems