Here are the highlights of what I wanted to say about beating Google at
search. Note: I actually didn’t get the
chance to say these things at the session because it spun out in a different
direction. But, had I had my way, I
would have made these observations.
1. It’s not about search. I am
convinced the future is going to be less about search and more about
location-aware, context-sensitive, relevance detection … published to the user
in real-time. This will happen via some form of Perpetual
Analytics whereby, “The data will find the data and the relevance will find
the user.” In the future users will
benefit more regularly from pushed intelligence than they will benefit from
pulled intelligence (search).
2. ‘Pixel’ sorting has natural limits.
Google page ranking is like a high-speed document sorter – each document
being somewhat like a pixel. Herein lies
the problem: there are limits to how smart algorithms can be when treating
each document discretely, because there is a limit to the knowledge you can
squeeze out of a pixel. Material improvement to search
is going to require extracting document content and placing this content into
context first: pixels to pictures. For
example, search for a common city name and you’ll get, say, 100k document
entries containing that city name.
In five years, the same search might produce 200k document entries even
though there are exactly the same number of cities with that name in the
world. With information in context, the
search engine would know there are only three such cities and have the
documents organized based on this grouping.
The point being, as the number of indexed documents grows, search engines that
simply rank documents are at risk of producing less relevant, or diluted, results over
time. There is a stark difference
between an ordered list of puzzle pieces and a work-in-progress puzzle with a
high degree of pre-assembly and organization.
Is it just me or are search results getting a little less perfect as time
marches forward?
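To make the pixels-to-pictures idea in item 2 concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: “Exampleville” is a made-up city name, the regions are placeholders, and the `resolved_city` field stands in for whatever context step would work out which city a document is actually about. It is not how any real search engine works; it only shows that a flat list grows with the number of documents while the grouping stays at three cities.

```python
from collections import defaultdict

# Hypothetical corpus: "Exampleville" stands in for a common city name, and
# "resolved_city" is assumed to come from an upstream context step (not shown).
DOCUMENTS = [
    {"url": "doc-1", "text": "Exampleville weather", "resolved_city": "Exampleville (region A)"},
    {"url": "doc-2", "text": "Hotels in Exampleville", "resolved_city": "Exampleville (region B)"},
    {"url": "doc-3", "text": "Exampleville city council", "resolved_city": "Exampleville (region A)"},
    {"url": "doc-4", "text": "History of Exampleville", "resolved_city": "Exampleville (region C)"},
    {"url": "doc-5", "text": "Exampleville bus routes", "resolved_city": "Exampleville (region B)"},
]


def flat_results(docs, term):
    """Ranked-list style: every matching document is its own entry."""
    return [d["url"] for d in docs if term.lower() in d["text"].lower()]


def grouped_results(docs, term):
    """Context style: matching documents are filed under the city they refer to."""
    groups = defaultdict(list)
    for d in docs:
        if term.lower() in d["text"].lower():
            groups[d["resolved_city"]].append(d["url"])
    return dict(groups)


flat = flat_results(DOCUMENTS, "Exampleville")
grouped = grouped_results(DOCUMENTS, "Exampleville")
print(len(flat), "flat entries;", len(grouped), "city groups")
```

Add more documents about the same three places and the flat list keeps getting longer, while the grouped view still shows three cities, each with a longer list underneath.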
3. Custom crafted lenses. One way of
producing better search results than Google would involve using personal/local
context to enhance ranking. I think of
the Google search as a lens through which we look for ordered Internet
content. Today, for the most part, this
lens is a general-purpose lens … meaning my results and your results are
similar if not the same. One way to
materially advance search is going to involve custom crafted lenses, meaning
your results are tailored to your life.
For example, if searching for souk
(market area) in Dubai, while
at the same time your calendar has entries about an upcoming trip to Dubai
4. The query is the data. Google
keeps the data about documents in its indexes and the queries elsewhere. While
this is how most systems operate, I say the better model is to store queries
(some if not all) in the same place as the data, as if they are data, because they
are data. Some data points to documents
(URLs) and other data (queries) points to the users or sessions in which the
question was asked. What is better about
this is (a) the ability to instantaneously discover new content related to
yesterday’s questions; (b) the ability to instantaneously detect when the same
questions are being asked (because the queries find the queries); and (c) when
data and queries are commingled in the same data space, the result scales very
well.
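As a rough illustration of item 4, the sketch below keeps documents and queries in one toy index. The class name `SharedIndex`, its methods, and the match-on-any-shared-word rule are all hypothetical simplifications chosen for brevity, not a description of any real system. The point is only that when queries live in the same data space as documents, each new arrival can immediately find earlier queries as well as earlier documents.

```python
from collections import defaultdict


class SharedIndex:
    """Toy index that stores documents and queries in the same data space."""

    def __init__(self):
        # term -> set of (kind, reference) records, where kind is "doc" or "query"
        self.by_term = defaultdict(set)

    def _add(self, kind, ref, text):
        terms = set(text.lower().split())
        # First collect everything already stored that shares a term with the new item.
        related = set()
        for term in terms:
            related |= self.by_term[term]
        # Then store the new item itself, whether it is a document or a query.
        for term in terms:
            self.by_term[term].add((kind, ref))
        related.discard((kind, ref))
        return related

    def add_document(self, url, text):
        """Index a document; returns earlier documents and queries it relates to."""
        return self._add("doc", url, text)

    def add_query(self, user, text):
        """Index a query as data; returns earlier documents and queries it relates to."""
        return self._add("query", user, text)


index = SharedIndex()
first = index.add_query("alice", "dubai souk")            # nothing stored yet
second = index.add_query("bob", "souk opening hours")     # finds alice's query
arrival = index.add_document("doc-1", "Guide to the gold souk")
print(sorted(arrival))  # yesterday's questions that this new document relates to
```

Here the new document surfaces both earlier questions the moment it is ingested, which is the behavior described in (a), and the second query finding the first corresponds to (b). Claim (c) about scale is not something a toy like this can demonstrate.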
5. Bad news: Time is of the essence.
Google clearly wants to cement users to the brand, making it just too
costly for consumers to switch search engines.
As such, it will be easier to compete with Google today than
tomorrow. So hurry.
6. Good news: There is little chance Google can deliver “data finds data and
relevance finds the user” at ingestion speeds (item 1) and information in
context (item 2) any time soon. Two
reasons: (a) they would have to figure out how to do this, and do it at scale, which
is not trivial; and (b) it would require substantial re-engineering, I suspect.
7. Names matter. Google’s name has
such a nice ring to it, so much so that I think that if someone develops a moderately
better mousetrap for search they will still lose unless their brand name flows
off the tongue just as smoothly.
Nothing against Google, of course. I
use it every day. It’s a great
product. Nonetheless, I think
competition is healthy and I would not mind seeing other content providers
enjoying some of that “search” market share.
On a side note: I wonder if Google has a formal historian. Google is an amazing success story and I hope
future generations will be able to read and enjoy the back-story.
RELATED POSTS:
Why Faster Systems Can Make Organizations Dumber Faster
It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness
You Won’t Have to Ask -- Data Will Find Data and Relevance Will Find the User
What Came First, the Query or the Data?
What Do You Know? Introducing Perpetual Analytics