O’Reilly Media has just released an excellent book called Beautiful Data: The Stories Behind Elegant Data Solutions. Lisa Sokol and I wrote Chapter 7, entitled “Data Finds Data.”
The chapter in PDF is here: Data Finds Data (provided here under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License).
Here is the introduction:
Next-generation “Smart” information management systems will not rely on users dreaming up smart questions to ask computers; rather, they will automatically determine if new observations reveal something of sufficient interest to warrant some reaction, e.g., sending an automatic notification to a user or a system about an opportunity or risk.
An organization can only be as smart as the sum of its perceptions. These perceptions come in the form of observations—observations collected across the various enterprise systems, such as customer enrollment systems, financial accounting systems, and payroll systems. With each new transaction an organization learns something. It is at the moment something is learned that there exists an opportunity, in fact an obligation, to make some sense of what this new piece of data means and respond appropriately. For example, does the address change on the customer record now reveal that this customer is connected to one of your top 50 customers? If an organization cannot evaluate how new data points relate to its historical data holding in real time, the organization will miss opportunities for action.
When the “data can find the data,” there exists an opportunity for the insight to find the user.
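The address-change example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's method: the `Customer` record, the `ObservationStore` class, the `normalize` helper, and the sample IDs and addresses are all hypothetical, and real systems would need far more careful entity matching than a lowercase string comparison.

```python
from collections import defaultdict
from dataclasses import dataclass


def normalize(address):
    """Crude address key (assumption): lowercase and collapse whitespace."""
    return " ".join(address.lower().split())


@dataclass(frozen=True)
class Customer:
    customer_id: str
    address: str
    is_top_customer: bool = False


class ObservationStore:
    """Keeps historical records indexed by address so new data can find old data."""

    def __init__(self):
        self._by_id = {}
        self._by_address = defaultdict(set)

    def observe(self, record):
        """Store a record and return any alerts raised by its arrival."""
        previous = self._by_id.get(record.customer_id)
        if previous is not None:
            # An address change: remove the stale index entry first.
            self._by_address[normalize(previous.address)].discard(record.customer_id)
        key = normalize(record.address)
        # The question is asked at the moment the data arrives, not in a later batch.
        neighbors = [self._by_id[cid] for cid in self._by_address[key]]
        self._by_id[record.customer_id] = record
        self._by_address[key].add(record.customer_id)
        return [
            f"{record.customer_id} now shares an address with top customer {n.customer_id}"
            for n in sorted(neighbors, key=lambda c: c.customer_id)
            if n.is_top_customer
        ]


store = ObservationStore()
store.observe(Customer("C-001", "12 Palm Way", is_top_customer=True))
store.observe(Customer("C-417", "9 Bayou Rd"))
# The address change is the new observation; nobody had to pose a query.
alerts = store.observe(Customer("C-417", "12  palm way"))
print(alerts)
```

The design point is that `observe` both stores the record and evaluates it against history in the same step, so the notification is a side effect of the transaction rather than the result of a user's query.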
How data finds data is a statement about discoverability, the degree to which previous information can be located and correlated with the new data. Discoverability requires the ability to recall related historical data so that an arriving piece of data can find its place, similar to the way each jigsaw puzzle piece is assessed relative to a work-in-progress puzzle. Each new puzzle piece incrementally builds upon what is knowable, at each given point in time relative to the evolving puzzle picture. Often new pieces, although important to building out the bigger picture, do not themselves bring new critical information. (On the other hand, some pieces may change the shape of the puzzle in a way that warrants ringing the bell: finding that one piece that connects the palm tree scene to the alligator scene.) It is at this moment in time, when the new puzzle piece presents the opportunity to reshape the picture, that discoveries are made. Real-time discovery replaces the need for users to think up and pose the right question at just the right time.
Organizations that are unable to switch to the “data finds data” paradigm will be less competitive and less effective.
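One way to picture the jigsaw analogy in code is a union-find structure in which each connected group of pieces is a "scene," and the bell rings only when an arriving piece joins two scenes that were separate until then. The `Puzzle` class and the piece names below are hypothetical, and this is a minimal sketch of the idea rather than anything prescribed by the chapter. It assumes each piece arrives once and lists the already-placed pieces it fits with.

```python
class Puzzle:
    """Tracks which pieces belong to the same connected scene (union-find)."""

    def __init__(self):
        self._parent = {}

    def _find(self, piece):
        """Return the representative piece of the scene containing `piece`."""
        while self._parent[piece] != piece:
            # Path halving keeps later lookups short.
            self._parent[piece] = self._parent[self._parent[piece]]
            piece = self._parent[piece]
        return piece

    def add_piece(self, piece, fits_with=()):
        """Place a piece; return True if it joined previously separate scenes."""
        if piece in self._parent:
            raise ValueError(f"piece {piece!r} was already placed")
        # Distinct scenes the new piece touches; unknown neighbors are ignored.
        roots = {self._find(p) for p in fits_with if p in self._parent}
        self._parent[piece] = piece
        for root in roots:
            self._parent[root] = piece  # the new piece becomes the merged scene's root
        return len(roots) >= 2

    def same_scene(self, a, b):
        return self._find(a) == self._find(b)


puzzle = Puzzle()
puzzle.add_piece("palm-1")
puzzle.add_piece("palm-2", fits_with=["palm-1"])      # grows a scene, no bell
puzzle.add_piece("gator-1")
puzzle.add_piece("gator-2", fits_with=["gator-1"])    # grows a scene, no bell
rang_bell = puzzle.add_piece("shoreline", fits_with=["palm-2", "gator-1"])
print(rang_bell, puzzle.same_scene("palm-1", "gator-2"))
```

Most calls to `add_piece` return `False`: the piece matters to the picture but carries no new critical information. Only the bridging piece returns `True`, which is the moment a notification would be worth sending.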
RELATED POSTS:
Federated Discovery vs. Persistent Context – Enterprise Intelligence Requires the Later
To Know Semantic Reconciliation is to Love Semantic Reconciliation
Accumulating Context: Now or Never
It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness
Sequence Neutrality in Information Systems
Jeff, this is a must read piece for any product manager and architect in the scientific, technical and medical (STM) online information industry. Thanks for sharing the full-text.
Posted by: Rafael Sidi | July 27, 2009 at 07:15 PM
Jeff,
Really nice post!
I explicitly refer to "Data" as units of Observation in one of my demystifying Linked Data presentations [1]. I also refer to "discoverability" and its importance, esp. as the Web becomes a Linked Data mesh, in my post about SDQ (Serendipitous Discovery Quotient) [2].
Links:
1. http://tr.im/uARm - Linked Data Presentation section about Data, Information, and Knowledge
2. http://tr.im/iv9e - About SDQ
Posted by: Kingsley Idehen | July 29, 2009 at 07:30 AM
Jeff, this is an excellent article, giving deep insight into why organizations need to move to the "data finds data" paradigm. Thanks for sharing this.
Posted by: Mukesh Mohania | July 30, 2009 at 09:15 PM
The privacy implications and mining opportunities here are staggering. What if data find data that should not have been accessible to the "seeking data"? Who's accountable or liable if the correlations the seeking data derive from found data breach a privacy reg?
Intriguing stuff, thanks for the post
Posted by: Dave Piscitello | August 27, 2009 at 07:35 PM
Hello Jeff,
One step in this direction might be the use of a Subject State rules engine. The engine listens to all published events and reacts only to those events that are relevant to the subject’s current state. When the event and the subject rules are true then take the appropriate action. As appropriate, alter the subject’s profile data and possibly change the subject’s state, and the cycle begins anew.
I do share Dave Piscitello’s view on the privacy implications of this technology. One can see a future where customer profile databases are traded in back alleys at midnight.
Fascinating article! Thank you for the fine post.
Posted by: Jacques Spilka | December 30, 2010 at 04:37 AM
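The Subject State rules engine suggested in the comment above might look roughly like the following Python sketch. It is one reading of that description, not a known implementation: `Subject`, `Rule`, `SubjectStateEngine`, and the "prospect"/"customer" states and "purchase" event are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Subject:
    subject_id: str
    state: str
    profile: dict = field(default_factory=dict)


@dataclass
class Rule:
    state: str        # the rule is relevant only while the subject is in this state
    event_type: str   # and only for this kind of event
    condition: Callable[[Subject, dict], bool]
    action: Callable[[Subject, dict], None]
    next_state: Optional[str] = None


class SubjectStateEngine:
    def __init__(self, rules):
        # Index rules by (state, event type) so irrelevant events are cheap to skip.
        self._rules = {}
        for rule in rules:
            self._rules.setdefault((rule.state, rule.event_type), []).append(rule)

    def publish(self, subject, event):
        """Offer an event to a subject; return True if a rule fired."""
        for rule in self._rules.get((subject.state, event["type"]), []):
            if rule.condition(subject, event):
                rule.action(subject, event)            # may alter the profile
                if rule.next_state is not None:
                    subject.state = rule.next_state    # and the cycle begins anew
                return True
        return False


def record_spend(subject, event):
    total = subject.profile.get("lifetime_spend", 0) + event["amount"]
    subject.profile["lifetime_spend"] = total


engine = SubjectStateEngine([
    Rule("prospect", "purchase", lambda s, e: e["amount"] > 0,
         record_spend, next_state="customer"),
])
alice = Subject("alice", state="prospect")
ignored = engine.publish(alice, {"type": "page_view"})
fired = engine.publish(alice, {"type": "purchase", "amount": 40})
print(ignored, fired, alice.state, alice.profile)
```

Because rules are keyed on the subject's current state, the same "purchase" event published again would find no matching rule once the subject has moved to the "customer" state, unless a rule for that state is added.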
I would agree with Dave, great post.
Posted by: software consultant | January 22, 2012 at 09:03 PM