
August 15, 2008

FOO Camp 2008 – How to Beat Google! (At Search)

Three fellow Foo-Campers and I held an open meeting session at this year’s Foo Camp entitled “How to Beat Google!”

 

Here are the highlights of what I wanted to say about beating Google at search.  Note: I actually didn’t get the chance to say these things at the session because it spun out in a different direction.  But, had I had my way, I would have made these observations.

 

1. It’s not about search.  I am convinced the future is going to be less about search and more about location-aware, context-sensitive, relevance detection … published to the user in real-time. This will happen via some form of Perpetual Analytics whereby, “The data will find the data and the relevance will find the user.”  In the future users will benefit more regularly from pushed intelligence than they will benefit from pulled intelligence (search).

 

2. ‘Pixel’ sorting has natural limits.  Google page ranking is like a high-speed document sorter – each document being somewhat like a pixel.  Herein lies the problem: there are limits to how smart algorithms can be when treating each document discretely, because there is a limit to the knowledge you can squeeze out of a pixel.  Material improvement to search is going to require extracting document content and placing this content into context first: pixels to pictures.  For example, search for a common city name and you’ll get, say, 100k entries containing documents with such a city name.  In five years, the same search might produce 200k document entries even though there are exactly the same number of cities with that name in the world.  With information in context, the search engine would know there are only three such cities and have the documents organized based on this grouping.  The point being, as indexed documents grow, search engines which simply rank documents are at risk of producing less relevant, or diluted, results over time.  There is a stark difference between an ordered list of puzzle pieces and a work-in-progress puzzle with a high degree of pre-assembly and organization.
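To make the "pixels to pictures" idea concrete, here is a minimal sketch in Python.  It assumes the hard part (working out which city a document is actually about) has already been done and shows up as a hypothetical `region` field; the documents, URLs and field names are invented for illustration and are not how any real search engine stores its index.

```python
from collections import defaultdict

# Hypothetical documents. Each mentions the same city name, and we assume
# some earlier step has already resolved which real-world city it refers to.
docs = [
    {"url": "a.example/1", "city": "Springfield", "region": "Illinois"},
    {"url": "b.example/2", "city": "Springfield", "region": "Missouri"},
    {"url": "c.example/3", "city": "Springfield", "region": "Illinois"},
    {"url": "d.example/4", "city": "Springfield", "region": "Massachusetts"},
    {"url": "e.example/5", "city": "Springfield", "region": "Missouri"},
]


def group_by_entity(documents):
    """Group document URLs under the real-world city they refer to."""
    groups = defaultdict(list)
    for doc in documents:
        groups[(doc["city"], doc["region"])].append(doc["url"])
    return dict(groups)


groups = group_by_entity(docs)

# The number of groups tracks the number of cities, not the number of documents.
for entity, urls in groups.items():
    print(entity, urls)
```

However many more documents are added, the number of groups only grows when a genuinely new city appears, which is the property a flat ranked list lacks.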

 

Is it just me or are search results getting a little less perfect as time marches forward?

 

3. Custom crafted lenses.  One way of producing better search results than Google would involve using personal/local context to enhance ranking.  I think of the Google search as a lens through which we look for ordered Internet content.  Today, for the most part, this lens is a general-purpose lens … meaning my results and your results are similar if not the same.  One way to materially advance search is going to involve custom crafted lenses, meaning your results are tailored to your life.  For example, if you search for a souk (market area) in Dubai while your calendar has entries about an upcoming trip to Dubai, involving a stay at the super luxurious Burj Al Arab hotel, then the first result might be Souk Madinat Jumeirah, the closest souk to the hotel.
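A custom lens can be sketched as a re-ranking step layered on top of a general-purpose ranking.  The Python below is only an illustration under stated assumptions: the `base_score` values, the `near` field, the boost size and the calendar text are all made up, and matching a place name against calendar text is a deliberately crude stand-in for real context detection.

```python
def rerank(results, calendar_entries, boost=0.5):
    """Re-order results, boosting those near a place named in the calendar."""
    calendar_text = " ".join(calendar_entries).lower()

    def score(result):
        # Assumed rule: add a fixed bonus when the result's landmark
        # appears anywhere in the user's calendar text.
        bonus = boost if result["near"].lower() in calendar_text else 0.0
        return result["base_score"] + bonus

    return sorted(results, key=score, reverse=True)


# Hypothetical general-purpose results for the query "souk dubai".
results = [
    {"title": "Another souk (hypothetical)", "base_score": 0.9, "near": "city center"},
    {"title": "Souk Madinat Jumeirah", "base_score": 0.7, "near": "Burj Al Arab"},
]

calendar = ["Trip to Dubai", "Stay at Burj Al Arab hotel"]

generic = rerank(results, [])            # no personal context
personal = rerank(results, calendar)     # same query, personal lens

print(generic[0]["title"])
print(personal[0]["title"])
```

The same query yields a different first result depending on whose calendar is consulted, which is the whole point of the lens.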

 

4. The query is the data.   Google keeps the data about documents in its indexes and the queries elsewhere. While this is how most systems operate, I say the better model is to store queries (some if not all) in the same place as the data – as if it is data – because it is data.  Some data points to documents (URLs) and other data (queries) point to users or sessions in which the question was asked.  What is better about this is (a) the ability to instantaneously discover new content related to yesterday’s questions; (b) the ability to instantaneously detect when the same questions are being asked (because the queries find the queries); and (c) when data and queries are commingled in the same data space, the result scales very well.
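Here is a rough sketch of what "the query is the data" could look like, written in Python.  The `SharedIndex` class, its term-overlap matching and the session names are my own illustrative assumptions, not a description of any existing system; the point is only that documents and queries live in one index, so each new arrival immediately finds whatever is already there.

```python
from collections import defaultdict


class SharedIndex:
    """One index holding both documents and queries (illustrative only)."""

    def __init__(self):
        self.by_term = defaultdict(set)  # term -> {(kind, item_id), ...}

    def add(self, kind, item_id, text):
        """Store an item and return the existing items that share a term with it."""
        terms = set(text.lower().split())
        related = set()
        for term in terms:
            related |= self.by_term[term]
        for term in terms:
            self.by_term[term].add((kind, item_id))
        related.discard((kind, item_id))
        return related


index = SharedIndex()

# Yesterday's question is stored just like data.
first = index.add("query", "alice-session-1", "souk dubai")

# (a) New content arrives and instantly finds yesterday's question.
on_new_doc = index.add("doc", "http://example.com/souk", "new souk opens in dubai")

# (b) The same question asked again finds the earlier query (and the document).
on_repeat = index.add("query", "bob-session-7", "dubai souk")

print(first)
print(on_new_doc)
print(on_repeat)
```

Matching on any shared term is far too loose for real use, but it shows the symmetry: the code path for ingesting a document and for asking a question is the same.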

 

5. Bad news: Time is of the essence.  Google clearly wants to cement users to the brand, making it just too costly for consumers to switch search engines.  As such, it will be easier to compete with Google today than tomorrow.  So hurry.

 

6. Good news: There is little chance Google can deliver “data finds data and relevance finds the user” at ingestion speeds (item 1) and information in context (item 2) any time soon.  Two reasons: (a) they would have to figure out how to do this, and do it at scale, which is not trivial; and (b) it would require substantial re-engineering, I suspect.

 

7. Names matter.  Google’s name has such a nice ring to it, so much so that I think that if someone develops a moderately better mousetrap for search they will still lose unless their brand name flows off the tongue just as smoothly.

 

Nothing against Google, of course.  I use it every day.  It’s a great product.  Nonetheless, I think competition is healthy and I would not mind seeing other content providers enjoying some of that “search” market share.

 

On a side note: I wonder if Google has a formal historian.  Google is an amazing success story and I hope future generations will be able to read and enjoy the back-story.

 

RELATED POSTS:

Why Faster Systems Can Make Organizations Dumber Faster

It’s All About the Librarian! New Paradigms in Enterprise Discovery and Awareness
You Won’t Have to Ask -- Data Will Find Data and Relevance Will Find the User
What Came First, the Query or the Data?
What Do You Know? Introducing Perpetual Analytics

 


Comments

Jeff,
Take a look at our website, we have a semantic search technology that creates digital "Fingerprints" of search results. Based on our technology we are able to allow users in the life sciences to determine who are the experts in a given disease category, allow folks to do a high definition search (semantic view) and allow them to do hypothesis generation in regards to where scientific research is going.

Would love to give you an executive briefing to show you how the technology works.

Best regards,


Darrell W. Gunter
EVP/Chief Marketing Officer
gunter@collexis.com



Collexis Holdings, Inc.
1201 Main St. Suite 980
PO Box 11951
Columbia, SC 29211

Main: +1.803.727.1113
Direct +1.973.762.9715
Cell +1.973.454.3475
Fax +1.803.727.1118


Jeff,

What I would really like to see is more active user participation in search engine technology and implementation. As search engines become more decision oriented, as the results they produce begin to take on more character as intellectual extensions of the analysts and programmers who created them, they also become more alienating to the user. It is precisely this form of alienation that has become an increasing liability to Microsoft's programs and operating systems.

As a search consumer, what I would really appreciate is a search engine that creates profiles, not on some distant submerged server, but on my -own- machine. I'm delighted to have a profile created based on my web activity at Myspace, or Facebook, or Google, but I want that profile, or profiles, to be on MY computer, and to have the opportunity to edit them, change them, or decide whether they are going to be associated with my identity at any given time, for any search, or at any web location.

All one needs to do is tally the number of 99-year-old participants on a site like Myspace, a barrier to defend against advertising profiling and autonomous search techniques, to see the indicators of the conflict that is coming in this arena.

Tim

By the way, this is fairly consistent with your GIO video. More than simply being impractical, centralized control also breeds contempt. Microsoft may not have figured this out in time.

As for point (3), Google does already offer personalized search. See, for example, the article "Is Google’s Personalized Search Going to Change SEO Efforts?" on SearchEngineAcademySC.
