“Information sharing” is a hot topic these days. What drives this interest is the desire to improve decision making by ensuring that users are aware of enterprise content that has otherwise been trapped in isolated information silos. The objective is to construct robust Context for enterprise optimization. Having the right information at the right time in the right place matters a lot – whether the mission is to enhance customer service, detect identity theft or fraud, improve health care or secure our nation.
Picture in your mind ten different operational systems, each with its own mission-specific database (i.e., “isolated information silos”). What would information sharing really look like in this enterprise? Does information sharing mean every system must transfer all of its data to every other system? Or does it mean that every system must constantly query all other systems in an effort to locate new context? And if Sequence Neutrality matters, which I think it does, how could either of these sharing models deliver accurate, real-time situational awareness (Perpetual Analytics)? They cannot.
Over Sharing. In this model, where every system broadcasts all of its data to all other systems, the show stoppers include enormous network bandwidth requirements, difficulty in maintaining information currency, inconsistent data protection schemes, inconsistent audit and access control mechanisms and, when dealing with sensitive identity data, legitimate privacy concerns. Not to worry: data owners hate this model too, for entirely different reasons.
Go Fish. In this model, where every system asks every other system every question every day, the show stoppers are again substantial: the inability of source systems to efficiently process unfamiliar queries, wrong answers caused by offline systems, the recursive processing required every time a query discovers something new, high latency, unacceptable network traffic and inconsistent audit and access control mechanisms. This model is similarly untenable because nearly every operational system on the planet would first need to be re-engineered to enable a functional Go Fish model.
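To put a rough number on the ten-system picture: both Over Sharing and Go Fish imply a channel between every ordered pair of systems, one for each sender and receiver, or each asker and answerer. The short Python sketch below simply counts those pairs. The function name and the `system_N` labels are made up for illustration, and the count ignores data volume and query frequency entirely.

```python
from itertools import permutations


def directed_links(systems):
    """Return every ordered (source, target) pair of distinct systems."""
    return list(permutations(systems, 2))


# Ten hypothetical silos, as in the thought experiment above.
systems = [f"system_{i}" for i in range(1, 11)]
links = directed_links(systems)

print(len(links))                            # 90 point-to-point channels
print(len(directed_links(list(range(20)))))  # 380 once the enterprise doubles
```

The count grows as n × (n − 1), so doubling the number of systems roughly quadruples the channels that must be secured, audited and kept current.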
Catalogs. What is better, in many cases, is the catalog model. Think of this like the card catalog at the library. In this analogy every aisle of the library is the equivalent of an isolated information silo. It would be unimaginable to roam the aisles expecting to efficiently find a relevant document (book). Rather, the card catalog provides a user with pointers to documents … i.e., directions on where to go (who to ask). So instead of Over Sharing or Go Fish, with the Catalog model one can efficiently discover what needs to be shared. Data transfer is minimized in this model and scalability is more certain – just look to Google for that case study. And because data owners fully control their own content, they can determine when to release what data to whom, and under what authority.
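A minimal Python sketch of the catalog idea follows, assuming the simplest possible design: each source system publishes only lookup keys and pointers to a central index, and the record itself never leaves its owner unless the owner's own policy allows it. Every name here (`Catalog`, `SourceSystem`, `publish`, `discover`, `release`, the sample systems and records) is hypothetical and chosen only for illustration; a real catalog would also need matching, security and audit features that this sketch leaves out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    system: str     # which silo holds the record
    record_id: str  # the owner's local identifier for it


class Catalog:
    """Card catalog: maps lookup keys to pointers, never to the data itself."""

    def __init__(self):
        self._index = defaultdict(set)

    def publish(self, system, record_id, keys):
        # A source system contributes keys only; the content stays home.
        for key in keys:
            self._index[key.strip().lower()].add(Pointer(system, record_id))

    def discover(self, key):
        # Tells the caller who to ask, not what the answer is.
        found = self._index.get(key.strip().lower(), set())
        return sorted(found, key=lambda p: (p.system, p.record_id))


class SourceSystem:
    """A silo whose owner decides who may see its records."""

    def __init__(self, name, records, authorized):
        self.name = name
        self._records = records
        self._authorized = set(authorized)

    def release(self, requester, record_id):
        # The data owner, not the catalog, enforces the release policy.
        if requester not in self._authorized:
            return None
        return self._records.get(record_id)


crm = SourceSystem("crm", {"c-17": {"name": "Pat Doe"}}, authorized={"fraud_desk"})
billing = SourceSystem("billing", {"b-03": {"balance": 42}}, authorized={"fraud_desk", "support"})

catalog = Catalog()
catalog.publish("crm", "c-17", ["555-0100"])
catalog.publish("billing", "b-03", ["555-0100"])

hits = catalog.discover("555-0100")     # pointers into both silos
denied = crm.release("support", "c-17")  # None: the CRM owner has not authorized "support"
```

Discovery and release are deliberately separate steps: the catalog answers "who might know?", and each owner still answers "may you see it?".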
Information discovery is a critical precursor to information sharing, and Catalogs are one proven pattern for enterprise discovery. Once information is discovered, information sharing becomes more particular, because you know who to ask for what. Then, to ensure policy is being followed, one might implement various controls, including Immutable Audit Logs.
What’s next? Anonymized and semantically reconciled catalogs.