Here are the highlights of what I wanted to say about beating Google at
search. Note: I actually didn’t get the
chance to say these things at the session because it spun out in a different
direction. But, had I had my way, I
would have made these observations.
1. It’s not about search. I am
convinced the future is going to be less about search and more about
location-aware, context-sensitive, relevance detection … published to the user
in real-time. This will happen via some form of Perpetual
Analytics whereby, “The data will find the data and the relevance will find
the user.” In the future users will
benefit more regularly from pushed intelligence than they will benefit from
pulled intelligence (search).
2. ‘Pixel’ sorting has natural limits.
Google page ranking is like a high-speed document sorter – each document
being somewhat like a pixel. Herein lies
the problem: there are limits to how smart algorithms can be when treating
each document discretely, because there is a limit
to the knowledge you can squeeze out of
a pixel. Material improvement to search
is going to require extracting document content and placing this content into
context first: pixels to pictures. For
example, search for a common city name and you’ll get, say, 100k entries
for documents containing that city name.
In five years, the same search might produce 200k document entries even
though there are exactly the same number of cities with that name in the
world. With information in context, the
search engine would know there are only three such cities and have the
documents organized based on this grouping.
The point being, as indexed documents grow, search engines which simply
rank documents are at risk of producing less relevant, or diluted, results over
time. There is a stark difference
between an ordered list of puzzle pieces and a work-in-progress puzzle with a
high degree of pre-assembly and organization.
Is it just me or are search results getting a little less perfect as time
marches forward?
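To make the pixels-to-pictures point in item 2 concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the city name, the regions, the URLs, and the function names are hypothetical, and the hard part (working out which real-world city a document refers to) is assumed to have already happened. It only contrasts a flat list of matching documents with the same documents organized by the distinct city they are about.

```python
from collections import defaultdict

# Hypothetical records. Each document mentions the same city name, and an
# earlier step (not shown here) has resolved which distinct city it means.
DOCUMENTS = [
    {"url": "http://example.com/1", "city": "Springfield", "region": "Region A"},
    {"url": "http://example.com/2", "city": "Springfield", "region": "Region B"},
    {"url": "http://example.com/3", "city": "Springfield", "region": "Region A"},
    {"url": "http://example.com/4", "city": "Springfield", "region": "Region C"},
    {"url": "http://example.com/5", "city": "Springfield", "region": "Region B"},
    {"url": "http://example.com/6", "city": "Springfield", "region": "Region A"},
]


def rank_only(documents, city):
    """Flat result: one entry per matching document."""
    return [d["url"] for d in documents if d["city"] == city]


def group_in_context(documents, city):
    """Contextual result: matching documents grouped by distinct city."""
    groups = defaultdict(list)
    for d in documents:
        if d["city"] == city:
            groups[(d["city"], d["region"])].append(d["url"])
    return dict(groups)


flat = rank_only(DOCUMENTS, "Springfield")
grouped = group_in_context(DOCUMENTS, "Springfield")
print(len(flat), "documents,", len(grouped), "cities")
```

Adding more documents makes the flat list longer, while the number of groups only changes when a genuinely new city shows up.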
3. Custom crafted lenses. One way of
producing better search results than Google would involve using personal/local
context to enhance ranking. I think of
the Google search as a lens through which we look for ordered Internet
content. Today, for the most part, this
lens is a general-purpose lens … meaning my results and your results are
similar if not the same. One way to
materially advance search is going to involve custom crafted lenses, meaning
your results are tailored to your life.
For example, if searching for souk
(market area) in Dubai, while
at the same time your calendar has entries about an upcoming trip to Dubai
4. The query is the data. Google
keeps the data about documents in its indexes and the queries elsewhere. While
this is how most systems operate, I say the better model is to store queries
(some if not all) in the same place as the data – as if it is data – because it
is data. Some data points to documents
(URLs) and other data (queries) points to users or sessions in which the
question was asked. What is better about
this is (a) the ability to instantaneously discover new content related to
yesterday’s questions; (b) the ability to instantaneously detect when the same
questions are being asked (because the queries find the queries); and (c) when
data and queries are commingled in the same data space, the result scales very
well.
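Here is one way the idea in item 4 could be sketched in Python. This is not how Google or any real engine works; the class name SharedStore, its add method, and the sample URLs and user names are all hypothetical. Queries and documents go into the same term index, and every new arrival, whichever kind it is, reports what it found already sitting there. Matching is deliberately crude (any shared lowercase word counts).

```python
from collections import defaultdict


class SharedStore:
    """One index holding both documents and queries (illustrative only)."""

    def __init__(self):
        # term -> set of (kind, reference) pairs, where kind is
        # "document" (reference is a URL) or "query" (reference is a user).
        self.index = defaultdict(set)

    def add(self, kind, ref, text):
        """Store an item and return the earlier items sharing any term."""
        terms = set(text.lower().split())
        related = set()
        for term in terms:
            related |= self.index[term]
        # Index the new item only after the lookup, so it never finds itself.
        for term in terms:
            self.index[term].add((kind, ref))
        return related


store = SharedStore()

# Yesterday's question is stored as data.
first = store.add("query", "user-1", "souk dubai")

# New content arrives and immediately finds the earlier question.
doc_hits = store.add("document", "http://example.com/souks", "Guide to the Dubai souk")

# A second user asks the same thing: the query finds the query and the document.
query_hits = store.add("query", "user-2", "dubai souk")
```

In this sketch doc_hits contains the earlier query from user-1, which is the hook for pushing the new document to that user, and query_hits contains both the earlier query and the document.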
5. Bad news: Time is of the essence.
Google clearly wants to cement users to the brand, making it just too
costly for consumers to switch search engines.
As such, it will be easier to compete with Google today than
tomorrow. So hurry.
6. Good news: There is little chance Google can deliver “data finds data and
relevance finds the user” at ingestion speeds (item 1) and information in
context (item 2) any time soon. Two
reasons: (a) they would have to figure out how to do this, and do it at scale, which
is not trivial; and (b) it would require substantial re-engineering, I suspect.
7. Names matter. Google’s name has
such a nice ring to it, so much so that I think that if someone develops a moderately
better mousetrap for search they will still lose unless their brand name flows
off the tongue just as smoothly.
Nothing against Google, of course. I
use it every day. It’s a great
product. Nonetheless, I think
competition is healthy and I would not mind seeing other content providers
enjoying some of that “search” market share.
On a side note: I wonder if Google has a formal historian. Google is an amazing success story and I hope
future generations will be able to read and enjoy the back-story.
RELATED POSTS:
Why Faster Systems Can Make Organizations Dumber Faster
It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness
You Won’t Have to Ask -- Data Will Find Data and Relevance Will Find the User
What Came First, the Query or the Data?
What Do You Know? Introducing Perpetual Analytics
Jeff,
Take a look at our website. We have a semantic search technology that creates digital "Fingerprints" of search results. Based on our technology, we are able to let users in the life sciences determine who the experts are in a given disease category, do a high definition search (semantic view), and generate hypotheses about where scientific research is going.
Would love to give you an executive briefing to show you how the technology works.
Best regards,
Darrell W. Gunter
EVP/Chief Marketing Officer
Collexis Holdings, Inc.
1201 Main St. Suite 980
PO Box 11951
Columbia, SC 29211
Main: +1.803.727.1113
Direct +1.973.762.9715
Cell +1.973.454.3475
Fax +1.803.727.1118
Posted by: Darrell Gunter | August 15, 2008 at 03:25 PM
Jeff,
What I would really like to see is more active user participation in search engine technology and implementation. As search engines become more decision oriented, as the results they produce begin to take on more character as intellectual extensions of the analysts and programmers who created them, they also become more alienating to the user. It is precisely this form of alienation that has become an increasing liability to Microsoft's programs and operating systems.
As a search consumer, what I would really appreciate is a search engine that creates profiles, not on some distant submerged server, but on my -own- machine. I'm delighted to have a profile created based on my web activity at Myspace, or Facebook, or Google, but I want that profile, or profiles to be on MY computer, and to have the opportunity to edit them, change them, or decide whether they are going to be associated with my identity at any given time, for any search, or at any web location.
All one needs to do is tally the number of 99-year-old participants on a site like Myspace, a barrier to defend against advertising profiling and autonomous search techniques, to see the indicators of the conflict that is coming in this arena.
Tim
Posted by: Tim R | September 19, 2008 at 08:27 AM
By the way, this is fairly consistent with your GIO video. More than simply being impractical, centralized control also breeds contempt. Microsoft may not have figured this out in time.
Posted by: Tim R. | September 20, 2008 at 08:59 AM
As for point (3), Google does already offer personalized search. See, for example, the article "Is Google’s Personalized Search Going to Change SEO Efforts?" on SearchEngineAcademySC.
Posted by: Gregory Grefenstette | February 16, 2009 at 01:46 AM