I don’t really like using overloaded words like this – words that mean many things to many people. Nonetheless, the subject of “ontology” is worthy of brief mention.
First, one definition from Dictionary.com:
Ontology -noun- (computer science) A rigorous and exhaustive organization of some knowledge domain that is usually hierarchical and contains all the relevant entities and their relations.
The main problems I have with spending any energy on pre-constructing well-thought-out ontologies are twofold: 1) change happens; and 2) there is no such thing as a single version of truth.
Some things are well suited to ontologic declaration: planets, species, currencies, dates and times (a small universe of easy stuff). Many things are not well suited to such pre-defined structural classification (a larger universe of hard stuff), and forcing it on them leads to non-scalable, non-sustainable, brittle information management systems. No one makes a better case for this than Clay Shirky in his article “Ontology is Overrated: Categories, Links, and Tags.”
My opinionated opinion is therefore this: the more difficult the ontologic classification effort, the more likely it is the wrong approach. Conversely, the easier the ontologic classification effort, the more sense an ontology makes to me. Hence, I am leaving ontologies to others, as my focus is elsewhere.
What is the preferred model?
Automated. Untrained. Unguided. Self-organizing.
I think in many cases ontologies are best not pre-defined; ideally, the structures and hierarchies should emerge from actual use and context. They are not static – they evolve and accumulate over time. They must also support disagreement … which will be sorted out later based on the lens (preferences, biases, etc.) of the viewing entity at a time of need.
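To make "emerge from actual use" a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a method this post prescribes: the `taggings` data, the `emergent_parents` function, and the `threshold` parameter are all hypothetical. It uses a simple co-occurrence heuristic, treating one tag as "broader" than another when most uses of the narrower tag also carry the broader one. The threshold stands in for the viewer's lens, so two viewers can read different structures out of the same usage data.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical usage log: each entry is the set of tags applied to one item.
taggings = [
    {"python", "programming"},
    {"python", "programming", "web"},
    {"programming", "web"},
    {"python", "snake"},
    {"programming"},
]

def emergent_parents(taggings, threshold=0.8):
    """Infer 'broader tag' links purely from how tags are used together.

    A tag `parent` is treated as broader than `child` when `parent` is used
    more often overall and at least `threshold` of the items tagged `child`
    are also tagged `parent`. Nothing is declared up front.
    """
    tag_counts = Counter(tag for tags in taggings for tag in tags)

    # Count how often each unordered pair of tags appears on the same item.
    pair_counts = Counter()
    for tags in taggings:
        for a, b in combinations(sorted(tags), 2):
            pair_counts[(a, b)] += 1

    parents = defaultdict(set)
    for (a, b), both in pair_counts.items():
        # Test the relationship in both directions.
        for child, parent in ((a, b), (b, a)):
            more_general = tag_counts[parent] > tag_counts[child]
            mostly_together = both / tag_counts[child] >= threshold
            if more_general and mostly_together:
                parents[child].add(parent)
    return dict(parents)

strict_view = emergent_parents(taggings)                 # a strict lens
loose_view = emergent_parents(taggings, threshold=0.6)   # a more permissive lens
```

With this sample data the strict lens links "web" to "programming" and "snake" to "python", while the looser lens additionally places "python" under "programming". Re-running the function as new taggings accumulate lets the structure evolve without anyone editing a hierarchy by hand.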
I am hopeful this is the first and last time I have to use the “Ontology” word.
What about the Semantic Web? DON’T EVEN GET ME STARTED!
Thanks for educating me on what this word actually means!
I couldn't agree more with your assessment. We like to categorize knowledge domains, but often it's inappropriate. So much information is inter-related and benefits from the wisdom of other areas of knowledge. Categorization is useful as a structure for understanding ideas, but if it creates separate theories for similar situations, it's detrimental instead of helpful.
Posted by: amy | March 04, 2009 at 10:36 AM
Jeff, I share your frustration with folks who over-complicate the tagging process. Automatic and untrained are good things. But I think it's also good to leverage whatever human-supplied information you can get for free.
You might want to check out work that my colleagues and I at Endeca did with the Association for Computing Machinery (ACM) to tag articles in their digital library. We distilled a vocabulary from the dirty, sparse set of author-supplied tags and then used it as a basis for automatically tagging the collection as a whole. The results were very nice, and we presented them at HCIR '08. We also applied a similar technique to a leading sports programming network.
http://research.microsoft.com/en-us/um/people/ryenw/hcir2008/
Posted by: Daniel Tunkelang | March 04, 2009 at 02:15 PM
It's precisely because 1) change happens and 2) there is no single version of truth that I find ontologies useful. The alternative is often to build your domain model into code, which is much harder to inspect and change than an ontology. By pulling the model out into an ontology and then building code that introspects that model, change becomes easier to cope with.
Posted by: David Allsopp | April 16, 2009 at 01:01 AM