General purpose sensemaking systems will colocate diverse data in the same data space. Such an approach enables massively scalable, real-time, novel discovery over an ever-changing observational space – without re-engineering.
This of course should be no surprise. General purpose computing has been possible ever since Von Neumann asserted that computer memory should hold both operating instructions and data, rather than using two different memories for the two purposes.
“Von Neumann machines differ in that they have a memory in which they store their operating instructions and data. Such computers are more versatile in that they do not need to have their hardware reconfigured for each new program, but can simply be reprogrammed with new in-memory instructions; they also tend to be simpler to design, in that a relatively simple processor may keep state between successive computations to build up complex procedural results. Most modern computers are von Neumann machines.”
~ Wikipedia: Computer data storage
Data structure governs function. For example, a room full of DVDs behaves one way and a SQL database behaves another. The same holds true with enterprise operational systems: the human resources system uses one data model and the hotel reservation system another – each underlying data structure designed for its specific mission – and notably, of little use to anything else.
With information trapped in the tailored database schemas of systems of record, operational data stores, data warehouses and data marts, it is no wonder organizations continue to struggle to make sense of it all – despite decades of effort and innovation.
Performing some kind of federated search over all these disparate data sets has never delivered. In fact, federated search bites when it comes to sensemaking because the diverse data structures are incapable of supporting a sensemaking function.
If you want to be smart, you will want to jam the available, diverse, observational space into the same data structure and in as close to the same physical space as possible.
Data is data.
When reference data, transactional data, and even user queries are colocated in the same data structures and in the same indexes as the extracted features from text, video, biometrics, and so on … something very exciting happens: data naturally finds data and context can accumulate.
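To make "data finds data" concrete, here is a minimal sketch of one index shared by reference data, transactions, and stored queries. The `SharedIndex` class, the `(kind, value)` feature tuples, and the sample values are all hypothetical illustrations, not the design of any actual system:

```python
from collections import defaultdict


class SharedIndex:
    """One index for everything: reference data, transactions, and queries."""

    def __init__(self):
        # feature (kind, value) -> ids of every item that carries that feature
        self._by_feature = defaultdict(set)
        # item id -> (kind of item, its features)
        self._items = {}

    def add(self, item_id, kind, features):
        """Store an item; return ids of existing items sharing any feature."""
        related = set()
        for feature in features:
            related |= self._by_feature[feature]
            self._by_feature[feature].add(item_id)
        self._items[item_id] = (kind, frozenset(features))
        related.discard(item_id)
        return sorted(related)

    def kind_of(self, item_id):
        return self._items[item_id][0]


index = SharedIndex()
# A user's question is persisted as data, just like any other observation.
index.add("q1", "query", [("phone", "555-0100")])
index.add("r1", "reference", [("name", "pat doe"), ("phone", "555-0199")])
# The new transaction "finds" both the earlier record and the waiting query.
hits = index.add("t1", "transaction", [("name", "pat doe"), ("phone", "555-0100")])
print(hits)  # ['q1', 'r1']
```

Because the query lives in the same index as everything else, the arriving transaction surfaces it without anyone having to ask the question again.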
By way of background: I first stumbled into the importance of data colocation back in 1993 when designing a surveillance system for the casinos in Vegas – a system that would help them keep the bad guys out. After claiming I could build such a system in 90 days for $25k, I was forced to take some shortcuts. Honestly, had the casino given me more time and been willing to pay more money, I would have created a much more elaborate system containing a number of tailored database schemas (e.g., different structures for customers, employees, bad guys, vendors, stored user queries, etc.). Given the time and money constraints, I had to make some compromises. I decided to design one schema to support everything. Each record in the system would then be designated with a role, e.g., “Customer” or “Bad Guy.” Long story short, when this general sensemaking system came online it started finding marketing hosts comping their roommates, along with lots of other unanticipated novel discoveries. So much novel discovery, it earned the name Non-Obvious Relationship Awareness or NORA, we got two rounds of funding, IBM bought my company to get its hands on the technology, and the rest is history.
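The one-schema-with-a-role idea can be sketched in a few lines. This is not the actual NORA schema; the `Entity` fields, the sample records, and the choice to match on a lowercased address are assumptions made purely for illustration:

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Entity:
    """One record shape for everyone; the role is just another attribute."""
    entity_id: str
    role: str      # e.g. "Customer", "Employee", "Bad Guy"
    name: str
    address: str


def shared_address_pairs(entities):
    """Return id pairs of records with different roles that share an address."""
    by_address = {}
    for e in entities:
        # Crude normalization (assumption): trim and lowercase the address.
        by_address.setdefault(e.address.strip().lower(), []).append(e)
    pairs = []
    for group in by_address.values():
        for a, b in combinations(group, 2):
            if a.role != b.role:
                pairs.append((a.entity_id, b.entity_id))
    return pairs


records = [
    Entity("e1", "Employee", "Host A", "12 Elm St"),
    Entity("c1", "Customer", "Guest B", "12 elm st"),
    Entity("c2", "Customer", "Guest C", "99 Oak Ave"),
]
pairs = shared_address_pairs(records)
print(pairs)  # [('e1', 'c1')]
```

With separate employee and customer schemas, this comparison would need a purpose-built cross-system join. With one structure, the employee and the customer at the same address simply land next to each other.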
Simply said, you have to have a brain (multi-purpose, general structure) to think (sense make). Then with a brain, the smartest you are going to be is a function of what observations you have properly contextualized into that meat space between your ears.
OTHER RELATED POSTS:
Smart Sensemaking Systems, First and Foremost, Must be Expert Counting Systems
Asserting Context: A Prerequisite for Smart, Sensemaking Systems
Puzzling: How Observations Are Accumulated Into Context
Federated Discovery vs. Persistent Context – Enterprise Intelligence Requires the Later
Accumulating Context: Now or Never
What Came First, the Query or the Data?
Enterprise Intelligence – My Presentation at the Third Annual Web 2.0 Summit
Jeff,
Spot on!
As we discussed, the need for business intelligence in corporate America alone is a huge opportunity, and when you transcend that need into Government and other applications, it is not just game changing, but it can change lives.
As an example, we're working on a project to help locate families of children whose parents have defaulted on them. The gathering of data from multiple sources that's placed into our database and sorted by our algorithms will help child support advocates find the best match for that child within seconds, and that will immediately change the child's life for the better.
Thanks again for your time and help, and I hope you have a happy Pavo!
John Lewis
Posted by: John Lewis | November 22, 2010 at 06:29 AM
you used comic sans! oh no!
Posted by: Jav | February 07, 2011 at 06:27 PM
Thanks for the great article, Jeff. FYI I referenced you in my take on how the data deluge might affect the emerging personal experimenting and self-tracking trend: The Big Bucket Personal Informatics Data Model - http://quantifiedself.com/2011/02/the-big-bucket-personal-informatics-data-model/
Posted by: Matthew Cornell | February 18, 2011 at 09:43 AM