I wrote an article with the
above title. This article has since been
published in the proceedings of the International Risk Assessment
and Horizon Scanning Symposium 2010 (IRAHSS) in
continues to chase the notion that systems should be capable of digesting
daunting volumes of data and making sufficient sense of this data such that novel,
specific, and accurate insight can be derived without direct human
involvement. While there have been many major
breakthroughs in computation and storage, sensemaking systems have
not enjoyed the same significant gains.
This article suggests that the single most fundamental capability required to make a sensemaking system is the system’s ability to recognize when multiple references to the same entity (often from different source systems) are in fact the same entity. For example, it is essential to understand the difference between three transactions carried out by three people versus one person who carried out all three transactions. Without the ability to determine when entities are the same, it quickly becomes clear that sensemaking is all but impossible.
Full article here.
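To make the three-transactions example concrete, here is a minimal Python sketch. It is not taken from the article: the records, the field names, and the idea of treating a normalized phone number as the sole identifying feature are all assumptions chosen for illustration. A real counting engine would weigh many features, not just one.

```python
import re

# Hypothetical records: three transactions whose names are spelled differently.
transactions = [
    {"txn": "T1", "name": "Robert Smith", "phone": "(555) 010-2000"},
    {"txn": "T2", "name": "Bob Smith", "phone": "555-010-2000"},
    {"txn": "T3", "name": "R. Smith", "phone": "555.010.2000"},
]


def normalize_phone(raw):
    """Strip everything except digits so formatting differences disappear."""
    return re.sub(r"\D", "", raw)


def naive_count(records):
    """Count people by exact name string, with no entity resolution."""
    return len({r["name"] for r in records})


def resolved_count(records):
    """Count people by normalized phone (an assumed identifying feature)."""
    return len({normalize_phone(r["phone"]) for r in records})


print(naive_count(transactions))     # three "people" by name alone
print(resolved_count(transactions))  # one person once references are resolved
```

The naive count sees three people carrying out one transaction each; the resolved count sees one person carrying out three. Any prediction built on top of the first number starts from the wrong picture.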
I find most organizations have underestimated this principle: If a system cannot count, it cannot predict. While I covered this point in some detail in a previous post, this new article is more complete and has a section entitled Expert Counting Systems: Essential Ingredients For Sensemaking which covers such issues as:
- Expert counting engines should not rely on training data.
- Counted entities should accumulate features.
- Entities believed to be the same should be asserted as same.
- Expert counting benefits from favoring false negatives.
- New observations should be able to reverse earlier assertions.
- Full attribution/pedigree of each observation should be maintained.
- The engine should be fast enough to digest the historical data.
- The engine should work in real time so that counting assertions can be made as the transaction is happening, in time to do something about it.
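Several of the ingredients above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the design described in the article: the class names (`Observation`, `Entity`, `CountingEngine`), the two-shared-features rule, and the sample records are all hypothetical. The rule is deliberately conservative, so it favors false negatives, and it needs no training data. Only one direction of reversal is shown (two entities earlier held apart are later asserted as the same); splitting a wrongly merged entity is not covered.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    source: str      # pedigree: which system the record came from
    record_id: str   # pedigree: the record's identifier in that system
    features: dict   # e.g. {"name": ..., "phone": ...}


@dataclass
class Entity:
    observations: list = field(default_factory=list)

    def features(self):
        """Accumulated features: every value seen for each attribute."""
        merged = {}
        for obs in self.observations:
            for key, value in obs.features.items():
                merged.setdefault(key, set()).add(value)
        return merged


class CountingEngine:
    MIN_SHARED = 2  # assumed threshold: two matching features before asserting "same"

    def __init__(self):
        self.entities = []

    def _matches(self, entity, obs):
        known = entity.features()
        shared = sum(1 for k, v in obs.features.items() if v in known.get(k, ()))
        return shared >= self.MIN_SHARED

    def observe(self, obs):
        """Process one observation as it arrives and assert sameness immediately."""
        matched = [e for e in self.entities if self._matches(e, obs)]
        matched_ids = {id(e) for e in matched}
        merged = Entity(observations=[obs])
        for e in matched:
            # Keep every original observation so attribution is never lost.
            merged.observations.extend(e.observations)
        self.entities = [e for e in self.entities if id(e) not in matched_ids]
        self.entities.append(merged)
        return merged

    def count(self):
        return len(self.entities)


engine = CountingEngine()
engine.observe(Observation("crm", "c-1",
                           {"name": "Robert Smith", "address": "1 Main St"}))
engine.observe(Observation("billing", "b-7",
                           {"name": "Bob Smith", "phone": "5550102000",
                            "email": "bsmith@example.com"}))
count_before = engine.count()  # the two records share nothing, so they stay apart

# A later record shares two features with each earlier entity and bridges them.
person = engine.observe(Observation("web", "w-3",
                                    {"name": "Robert Smith", "address": "1 Main St",
                                     "phone": "5550102000",
                                     "email": "bsmith@example.com"}))
count_after = engine.count()
sources = {obs.source for obs in person.observations}
```

Here the first two records are initially counted as two people. The third record overturns that assertion, and the merged entity carries both name variants plus the source of every observation. The linear scan over all entities is obviously not what "fast" means at scale; it only keeps the sketch short.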
Anyway, long story short, expert counting is non-trivial, especially at scale, and lots more must be done in this area.
Miscellaneous Note: Over the years I’ve sometimes used the term Semantic Reconciliation (recognizing two things are the same despite having been described differently) to describe counting. And, many have heard me or others using the term Entity Resolution or Identity Resolution. Yes, more words that relate to counting … especially with respect to people or organizations: is this about one person or two? Unfortunately, trying to explain these terms to non-technical people has been a bit of work, so now in an attempt to make the concept more consumable … maybe the term “Expert Counting” is an improvement.