Wednesday, October 10, 2012

Too Many Google Results From a Single Site

I don't know about you, but there's a feature that has ruined many of my Google search results pages. It's not that Google can't return relevant results; the problem is that Google tries to be clever and detects keywords that are associated with a site. When it finds one, Google returns a lot of results from that site. In fact, sometimes you'll have a hard time finding results from other domains.

Ever since its launch, Google promoted diversity and used host crowding to show "up to two results from each hostname/subdomain of a domain name". Sometimes Google displayed a link that restricted the results to that domain or subdomain, but users had to click it. Matt Cutts wrote in 2007 that "we did hear complaints that for some types of searches (e.g. esoteric or long-tail searches), Google could return a search page with lots of results all from one domain. In the last few weeks we changed our algorithms to make that less likely to happen".
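To make the idea concrete, here's a minimal sketch of what a host crowding filter could look like. This is only my illustration of the "up to two results from each hostname" rule quoted above, not Google's actual code; the function name `host_crowd` and the `max_per_host` parameter are made up for the example.

```python
from urllib.parse import urlsplit


def host_crowd(urls, max_per_host=2):
    """Keep at most max_per_host results per hostname, preserving rank order.

    Hypothetical illustration of host crowding, not Google's implementation.
    """
    seen = {}   # hostname -> number of results already kept
    kept = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        count = seen.get(host, 0)
        if count < max_per_host:
            kept.append(url)
            seen[host] = count + 1
    return kept


ranked = [
    "https://www.apple.com/itunes/",
    "https://www.apple.com/itunes/download/",
    "https://www.apple.com/itunes/charts/",
    "https://en.wikipedia.org/wiki/ITunes",
]
filtered = host_crowd(ranked)
```

With the sample list above, the third apple.com page is dropped and the Wikipedia result moves up, which is the kind of diversity the old rule was meant to guarantee.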

Then Google introduced sitelinks and started to show more pages from a domain. Two years ago, a Google blog post announced that "for queries that indicate a strong user interest in a particular domain, like [exhibitions at amnh], we'll now show more results from the relevant site". Since that announcement, host crowding has been a thing of the past and Google has started to include more and more results from a single domain.

I complained about this back in 2010 and mentioned that this feature could become annoying, but now it's much worse. Sometimes you can find queries that return results mostly from a single domain. For example, when you search for [apple itunes], Google assumes that you want results from apple.com and starts to return a lot of irrelevant pages. Sure, you can still find results from other domains, but 31 of the top 50 results are from apple.com.
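If you want to check a results page yourself, counting is easy once you have the list of result URLs. Here's a rough sketch, assuming you've already copied the URLs into a list. The helper names are made up, and the "last two labels" trick for getting the domain is a simplification that breaks for suffixes like .co.uk.

```python
from collections import Counter
from urllib.parse import urlsplit


def naive_domain(url):
    """Return the last two labels of the hostname (simplified, see note above)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def dominant_domain(urls):
    """Return (domain, count) for the most frequent domain, or None if empty."""
    counts = Counter(naive_domain(u) for u in urls)
    return counts.most_common(1)[0] if counts else None


sample = [
    "https://www.apple.com/itunes/",
    "https://support.apple.com/itunes",
    "https://en.wikipedia.org/wiki/ITunes",
]
top = dominant_domain(sample)
```

Here `www.apple.com` and `support.apple.com` are counted together as apple.com, which matches how I counted the results in this post.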



If you search for [yahoo mail], Google returns a lot of uninteresting results from Yahoo's international sites, instead of including news articles, blog posts, reviews, tutorials.



Search for [berkeley college] and 26 of the top 50 results are from berkeley.edu. That's just too much. Having to constantly add to the query "-site:dominantresult.com" is annoying, not to mention that most Google users don't even know about search operators and shouldn't have to use them.
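For those who do use search operators, the workaround can at least be automated. Here's a small sketch that appends a "-site:" exclusion for each domain you want to drop; the function name `exclude_sites` is made up for this example.

```python
def exclude_sites(query, domains):
    """Append a -site: exclusion to the query for every domain in the list."""
    exclusions = " ".join(f"-site:{d}" for d in domains)
    return f"{query} {exclusions}".strip()


q = exclude_sites("berkeley college", ["berkeley.edu"])
```

It works, but it only proves the point: you shouldn't need a helper function to get diverse results.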

And this annoyance is not limited to navigational queries. What if you're not in India, search for [sony led] and Google's top 7 results are from Sony India? That's what happened when I disabled Google Instant and set Google to show 50 results per page.


Showing too many results from a domain is a bad idea because a search engine should offer information from multiple sources, while results should be relevant and comprehensive. Google's mission is to "organize the world's information and make it universally accessible and useful," but Google forgot to make the information accessible.

Query                      Domain             Number of results in top 50
imdb ratings               imdb.com           49
imdb ratings are broken    imdb.com           46
google play                google.com         31
playstation                playstation.com    28
wordpress                  wordpress.org      28
samsung led display        samsung.com        27
