Three fellow Foo-Campers and I held an open meeting session at this year’s Foo Camp entitled “How to Beat Google!”
Here are the highlights of what I wanted to say about beating Google at search. Note: I actually didn’t get the chance to say these things at the session because it spun out in a different direction. But, had I had my way, I would have made these observations.
1. It’s not about search. I am convinced the future is going to be less about search and more about location-aware, context-sensitive relevance detection … published to the user in real time. This will happen via some form of Perpetual Analytics whereby “the data will find the data and the relevance will find the user.” In the future, users will benefit more regularly from pushed intelligence than from pulled intelligence (search).
2. ‘Pixel’ sorting has natural limits. Google page ranking is like a high-speed document sorter – each document being somewhat like a pixel. Herein lies the problem: there are limits to how smart algorithms can be when treating each document discretely, because there is a limit to the knowledge you can squeeze out of a pixel. Material improvement to search is going to require extracting document content and placing this content into context first: pixels to pictures. For example, search for a common city name and you’ll get, say, 100k document entries containing that city name. In five years, the same search might produce 200k document entries even though there are exactly the same number of cities with that name in the world. With information in context, the search engine would know there are only three such cities and have the documents organized based on this grouping. The point being, as the number of indexed documents grows, search engines which simply rank documents are at risk of producing less relevant, or diluted, results over time. There is a stark difference between an ordered list of puzzle pieces and a work-in-progress puzzle with a high degree of pre-assembly and organization.
Is it just me or are search results getting a little less perfect as time marches forward?
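To make the “pixels to pictures” point in item 2 a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the city name “Springfield”, the example URLs, and the `region` field, which stands in for whatever context an extraction step might recover from each document. It is not how Google or any real engine works; it only shows that the number of groups stays fixed while the number of documents grows.

```python
from collections import defaultdict

# Hypothetical index entries. The "region" field stands in for context
# that some extraction step (not shown here) pulled out of each document.
DOCS = [
    {"url": "http://example.com/a", "city": "Springfield", "region": "Illinois"},
    {"url": "http://example.com/b", "city": "Springfield", "region": "Missouri"},
    {"url": "http://example.com/c", "city": "Springfield", "region": "Illinois"},
    {"url": "http://example.com/d", "city": "Springfield", "region": "Massachusetts"},
    {"url": "http://example.com/e", "city": "Springfield", "region": "Missouri"},
]


def group_by_city(docs, name):
    """Group documents mentioning `name` by the distinct city they refer to."""
    groups = defaultdict(list)
    for doc in docs:
        if doc["city"] == name:
            # The (name, region) pair acts as the identity of one real city.
            groups[(doc["city"], doc["region"])].append(doc["url"])
    return dict(groups)


groups = group_by_city(DOCS, "Springfield")
for entity, urls in groups.items():
    print(entity, len(urls), "documents")
```

Doubling the document list doubles the entries inside each group but still leaves three groups, which is the difference between a longer ranked list and an organized picture.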
3. Custom crafted lenses. One way of producing better search results than Google would involve using personal/local context to enhance ranking. I think of the Google search as a lens through which we look for ordered Internet content. Today, for the most part, this lens is a general-purpose lens … meaning my results and your results are similar if not the same. One way to materially advance search is going to involve custom crafted lenses, meaning your results are tailored to your life. For example, if searching for souk (market area) in Dubai, while at the same time your calendar has entries about an upcoming trip to Dubai
4. The query is the data. Google keeps the data about documents in its indexes and the queries elsewhere. While this is how most systems operate, I say the better model is to store queries (some if not all) in the same place as the data – as if they are data – because they are data. Some data points to documents (URLs) and other data (queries) points to the users or sessions in which the question was asked. What is better about this is (a) the ability to instantaneously discover new content related to yesterday’s questions; (b) the ability to instantaneously detect when the same questions are being asked (because the queries find the queries); and (c) when data and queries are commingled in the same data space, the result scales very well.
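A rough Python sketch of item 4 follows. It is a toy under stated assumptions, not a description of any real system: the `SharedSpace` class, its `ingest` method, the whitespace tokenizer, and the matching rules (a query matches a document containing all of its terms; two queries match when their term sets are equal) are all invented here for illustration. It demonstrates points (a) and (b) only; it says nothing about the scaling claim in (c).

```python
class SharedSpace:
    """One store holding both documents and queries as plain records."""

    def __init__(self):
        self.records = []   # record id -> (kind, ref, terms)
        self.by_term = {}   # term -> set of record ids containing it

    def ingest(self, kind, ref, text):
        """Store a record; return earlier records it is relevant to.

        kind is "doc" or "query"; ref is a URL for documents or a
        user/session label for queries (both hypothetical here).
        """
        terms = frozenset(text.lower().split())

        # Candidates: every earlier record sharing at least one term.
        candidates = set()
        for term in terms:
            candidates |= self.by_term.get(term, set())

        hits = []
        for rid in sorted(candidates):
            other_kind, other_ref, other_terms = self.records[rid]
            if kind == "query" and other_kind == "query":
                matched = terms == other_terms   # same question asked again
            elif kind == "query":
                matched = terms <= other_terms   # an existing doc answers it
            elif other_kind == "query":
                matched = other_terms <= terms   # new doc answers an old query
            else:
                matched = False                  # doc-to-doc is ignored here
            if matched:
                hits.append((other_kind, other_ref))

        # Queries are stored exactly like documents.
        rid = len(self.records)
        self.records.append((kind, ref, terms))
        for term in terms:
            self.by_term.setdefault(term, set()).add(rid)
        return hits


space = SharedSpace()
space.ingest("query", "user-1", "dubai souk hours")
repeat = space.ingest("query", "user-2", "souk dubai hours")
arrivals = space.ingest("doc", "http://example.com/souk",
                        "Opening hours for the gold souk in Dubai")
print(repeat)
print(arrivals)
```

Because yesterday’s queries sit in the same index as the documents, the arrival of a new document is enough to surface the users who asked about it, with no one re-running a search.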
5. Bad news: Time is of the essence. Google clearly wants to cement users to the brand, making it just too costly for consumers to switch search engines. As such, it will be easier to compete with Google today than tomorrow. So hurry.
6. Good news: There is little chance Google can deliver “data finds data and relevance finds the user” at ingestion speeds (item 1) and information in context (item 2) any time soon. Two reasons: (a) they would have to figure out how to do this, and do it at scale, which is not trivial; and (b) it would, I suspect, require substantial re-engineering.
7. Names matter. Google’s name has such a nice ring to it, so much so that I think that if someone develops a moderately better mousetrap for search they will still lose unless their brand name flows off the tongue just as smoothly.
Nothing against Google, of course. I use it every day. It’s a great product. Nonetheless, I think competition is healthy and I would not mind seeing other content providers enjoying some of that “search” market share.
On a side note: I wonder if Google has a formal historian. Google is an amazing success story and I hope future generations will be able to read and enjoy the back-story.