
Google Search: Difference between revisions

From Wikipedia, the free encyclopedia
MamboJambo (talk | contribs)
Merged Personalization articles
MamboJambo (talk | contribs)
Line 91: Line 91:
[[Image:googlesuggest.gif|right|125px|Google Suggest Logo]]
'''Google Suggest'''[http://google.com/webhp?complete=1&hl=en] uses auto-complete while typing to give popular searches.


===Zeitgeist===
'''Google Zeitgeist'''[http://google.com/intl/en/press/zeitgeist.html] is a service that compiles a list of rising [[trends]] and [[patterns]] by tracking the most frequent search [[queries]]. [[Google]] then summarizes the [[statistics]] in each country [[weekly]], [[monthly]] and [[yearly]]. Google claims the data used to compile the reports stays completely [[anonymous]].


==Google jargon==

Revision as of 09:55, 27 August 2006

This article is about the search engine. For the corporation, see Google; for the underlying technology, see Google platform; for other uses see Google (disambiguation).
File:Google screenshot.png
The unusually spartan design, uncluttered appearance and quick loading time of Google's main page have contributed greatly to the site's mass appeal.

Google is a search engine owned by Google, Inc. whose mission statement is to "organize the world's information and make it universally accessible and useful". The largest search engine on the web, Google receives over 200 million queries each day through its various services.

In addition to its tool for searching web pages, Google provides services for searching images, Usenet newsgroups, news websites, videos, maps, and items for sale online, as well as for searching by locality. As of 2006, Google has indexed over 25 billion web pages, 1.3 billion images, and over one billion Usenet messages. It also caches much of the content that it indexes. Google operates other tools and services including Google News, Google Suggest, Froogle, and Google Desktop Search.

The search engine

Index size

At its start in 1998, Google claimed to index 25,000,000 web pages.[1] By June 2005, this number had grown to 8,058,044,651 web pages, as well as 17,100,000 images, 1 billion Usenet messages, 6,600 print catalogs, and 4,500 news sources. As of 2006, Google has over 25,000,000,000 web pages and 1,300,000,000 images indexed.

Physical structure

Google employs data centers full of low-cost commodity computers running a customized version of Red Hat Linux in several locations around the world to respond to search requests and to index the web. The server farms in the data centers are built using a shared-nothing architecture. The indexing is performed by a program named Googlebot, which periodically requests new copies of web pages it already knows about. The more often a page updates, the more often Googlebot will visit. The links in these pages are examined to discover new pages to be added to its internal database of the web. Together, this index database and the web page cache are several terabytes in size. Google has developed its own file system, called Google File System, for storing all this data.
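The link-discovery step described above can be illustrated with a minimal sketch. It is not Googlebot's actual implementation, which is not public. The names `LinkExtractor`, `discover`, and `FAKE_WEB` are hypothetical, and an in-memory dictionary stands in for real HTTP fetching so that the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def discover(seed, fetch):
    """Breadth-first discovery of pages reachable from `seed`.

    `fetch` is any callable that returns a page's HTML, or None when
    the page cannot be retrieved (an assumption of this sketch).
    """
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Hypothetical three-page site used in place of the real web.
FAKE_WEB = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://example.com/a.html": '<a href="/">home</a> <a href="http://example.org/">ext</a>',
    "http://example.com/b.html": "no links here",
}

found = discover("http://example.com/", FAKE_WEB.get)
```

A real crawler would also have to schedule revisits, honor robots.txt, and limit its request rate. None of that is modeled here.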

PageRank

Google uses an algorithm called PageRank to rank web pages that match a given search string. The PageRank algorithm computes a recursive figure of merit for web pages, based on the weighted sum of the PageRanks of the pages linking to them. The PageRank thus derives from human-generated links, and correlates well with human concepts of importance. Previous keyword-based methods of ranking search results, used by many search engines that were once more popular than Google, would rank pages by how often the search terms occurred in the page, or how strongly associated the search terms were within each resulting page. In addition to PageRank, Google also uses other secret criteria for determining the ranking of pages on result lists.
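The recursive definition above can be approximated by repeated iteration. The following is a simplified sketch rather than Google's production algorithm, which combines PageRank with undisclosed criteria. The damping factor of 0.85, the function name `pagerank`, and the toy link graph are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its damped rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by three other pages.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy graph, page "c" ends up with the highest score because three pages link to it, while "d", which nothing links to, keeps only the baseline share.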

Google indexes and caches not only HTML files but also 13 other file types, including PDF, Word documents, Excel spreadsheets, Flash SWF, and plain text files. Except in the case of text and SWF files, the cached version is a conversion to HTML, allowing those without the corresponding viewer application to read the file.

Users can customize the search engine somewhat. They can set a default language, use "SafeSearch" filtering technology (which is set to 'moderate' by default), and set the number of results shown on each page. Google has been criticized for placing long-term cookies on users' machines to store these preferences, a tactic which also enables the company to track a user's search terms over time. For any query (of which only the first 32 keywords are taken into account), up to the first 1000 results can be shown, with a maximum of 100 displayed per page.
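The limits quoted above (32 keywords, 1000 results, 100 results per page) can be expressed as a small sketch. The function names and the default of 10 results per page are assumptions for illustration. This is not Google's code.

```python
MAX_KEYWORDS = 32     # only the first 32 keywords of a query are considered
MAX_RESULTS = 1000    # no more than the first 1000 results can be shown
MAX_PER_PAGE = 100    # at most 100 results are displayed per page


def effective_keywords(query):
    """Return the keywords that would be taken into account."""
    return query.split()[:MAX_KEYWORDS]


def page_window(page, per_page=10):
    """Return (start, end) result offsets for a 1-based page number.

    Returns None when the page lies outside the first 1000 results.
    The default of 10 results per page is an assumption of this sketch.
    """
    per_page = max(1, min(per_page, MAX_PER_PAGE))
    start = (page - 1) * per_page
    if page < 1 or start >= MAX_RESULTS:
        return None
    end = min(start + per_page, MAX_RESULTS)
    return start, end
```

With 100 results per page, page 10 covers offsets 900 to 1000, and page 11 is out of range.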

Despite Google's immense index, a considerable amount of data resides in databases that are accessible from websites by means of queries, but not by links. This so-called deep web is minimally covered by Google and contains, for example, catalogs of libraries, official legislative documents of governments, phone books, and more.

Google optimization

Since Google is the most popular search engine, many webmasters have become eager to influence their website's Google rankings. An industry of consultants has arisen to help websites raise their rankings on Google and on other search engines. This field, called search engine optimization, attempts to discern patterns in search engine listings, and then develop a methodology for improving rankings.

One of Google's chief challenges is that as its algorithms and results have gained the trust of web users, the profit to be gained by a commercial website in subverting those results has increased dramatically. Some search engine optimization firms have attempted to inflate specific Google rankings by various artifices, and thereby draw more searchers to their clients' sites. Google has managed to weaken some of these attempts by reducing the ranking of sites known to use them.

Search engine optimization encompasses both "on page" factors (like body copy, title tags, H1 heading tags and image alt attributes) and "off page" factors (like anchor text and PageRank). The general idea is to affect Google's relevance algorithm by incorporating the targeted keywords in various places "on page", in particular the title tag and the body copy. The higher up in the page a keyword appears, the better its prominence and thus the ranking. Too many occurrences of the keyword, however, cause the page to look suspect to Google's spam-checking algorithms.
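Keyword frequency and prominence, as described above, can be measured with a rough sketch such as the following. The functions `keyword_density` and `first_position` are hypothetical helpers. The thresholds that Google's spam checks actually apply are not public, so none are assumed here.

```python
import re


def _words(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_density(text, keyword):
    """Fraction of the words in `text` that equal `keyword`."""
    words = _words(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def first_position(text, keyword):
    """Relative position (0.0 = very top) of the keyword's first occurrence.

    Returns None when the keyword does not occur at all.
    """
    words = _words(text)
    try:
        return words.index(keyword.lower()) / len(words)
    except ValueError:
        return None
```

A lower `first_position` corresponds to higher keyword prominence, and a very high `keyword_density` is the kind of signal that might make a page look suspect.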

One "off page" technique that works particularly well is Google bombing, in which websites link to another site using a particular phrase in the anchor text, in order to give the site a high ranking when that phrase is searched for.

Google publishes a set of guidelines for website owners who would like to raise their rankings by using legitimate optimization consultants.[2]

Uses of Google

A corollary use of Google, and of other Internet search engines, is that it can help translators to determine the most common way of expressing ideas in English (and in other languages). This is generally done by comparing the result counts for different variants of an expression, thereby establishing which variant is more common. While this approach requires careful judgement, it does improve the ability of non-native translators to use more idiomatically correct English expressions.

Google dance

Previously, Google would update its index on a monthly basis, leading to suddenly fluctuating and unexpected results as its various servers were updated. This is no longer the case, because Google updates its index continuously as it crawls the web.

Search products


Google has created several new search engines, each of which covers a particular topic or medium.

Accessible Search

Accessible Search logo

Google Accessible Search[1] is a search engine for visually challenged people. It uses Google's PageRank technology and also prioritizes usable and accessible web content.

In addition to favoring accessible websites in its results, Accessible Search presents its own interface clearly and simply. For example, AdWords results have not been implemented, so they cannot distract the user from the main links.

Blog Search

File:Google Blog Search.gif

On September 14, 2005, Google launched Blog Search, which applies Google search technology to blogs. Results include all blogs, not just those published through Blogger. Blog Search's index is continually updated. Blogs written in English can be searched, as well as those written in other languages, including French, Italian, German, Spanish, Chinese, Korean, Japanese and Brazilian Portuguese.

Catalogs

Google Catalogs Logo

Through the use of character recognition, users can search for a text string in over 6,600 print catalogs, much as they would search for material on the general web.

Directory

Google Directory Logo

Google Directory[2] was launched in April 2000. The directory is a subset of the links in Google's database arranged into hierarchical subcategories, like an advanced Yellow Pages of the web. Both the directory's content and its categorization come from the Open Directory Project (ODP). The ODP publishes an easily parsed version of its database in Resource Description Framework (RDF) format for other sites, like Google, to use for derivative directories. The websites in the Google Directory are sorted by PageRank.

Image Search

In December 2001, Google announced Google Images[3], which allows users to search the web for image content. The keywords for the image search are based on the filename of the image, the link text pointing to the image, and text adjacent to the image. When searching for an image, a thumbnail of each matching image is displayed. When the user clicks on a thumbnail, the image is displayed in a frame at the top of the page and the website on which that image was found is displayed in a frame below it, making it easier to see where the image comes from.

Mobile

Google Mobile allows users to search using Google from wireless devices such as mobile phones and PDAs.

Personalization

Over the years, Google has also introduced features for its search engine that enable personalization. These include:

  • Bookmarks: Google Bookmarks[4] is a free online bookmark storage service available to Google Account holders. The service organizes bookmarks with tags, and bookmarks labeled "homepage" are displayed on the user's Personalized Homepage.
  • Personalized Homepage: In May 2005, Google introduced Personalized Homepage[5], which lets users customize the default Google home page. Customization can include modules on weather, Google search, maps, and Words of the Day. Customized RSS feeds can also be used.
  • Personalized Search: By making use of Google's Search History[6] feature, this service allows users to create a profile based on their prior search history. Future search results can then be prioritized for each individual user based on the information collected.

Special Searches

Google Linux Search Logo

Google Special Searches is a collection of search engines, each tailored to a particular topic.

Some of the special searches use the standard Google web search engine, but filter the results depending on the subject. These include U.S. Government Search, Linux Search, BSD Search, Apple Macintosh Search, and Microsoft Windows Search. There is also Google University Search, which lets users select a particular university and then search within its site, and Google Public Service Search, a service intended for non-commercial organizations only.

Suggest

Google Suggest Logo

Google Suggest[7] uses auto-complete to offer popular searches while the user is typing.
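The general idea of prefix-based auto-complete can be sketched as follows. This is a minimal illustration, not how Google Suggest is actually implemented. The `Suggester` class and the sample query counts are hypothetical.

```python
import bisect


class Suggester:
    """Suggests stored queries that start with a typed prefix,
    most popular first."""

    def __init__(self, query_counts):
        # query_counts maps a query string to how often it was searched.
        self._counts = dict(query_counts)
        self._sorted = sorted(self._counts)

    def suggest(self, prefix, limit=5):
        # Binary search finds the first query that is >= the prefix;
        # all queries sharing the prefix follow it contiguously.
        start = bisect.bisect_left(self._sorted, prefix)
        matches = []
        for query in self._sorted[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        # Order by descending popularity, breaking ties alphabetically.
        matches.sort(key=lambda query: (-self._counts[query], query))
        return matches[:limit]


# Hypothetical popularity data.
suggester = Suggester({
    "google maps": 50,
    "google news": 80,
    "gopher": 5,
    "yahoo": 70,
})
```

Here `suggester.suggest("goo")` would offer "google news" ahead of "google maps", because the former has the higher count in the sample data.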


Zeitgeist

Google Zeitgeist[8] is a service that compiles a list of rising trends and patterns by tracking the most frequent search queries. Google then summarizes the statistics in each country weekly, monthly and yearly. Google claims the data used to compile the reports stays completely anonymous.

Google jargon

To google
To search for something using Google (also, to seek information on someone by entering their full name or other information).
Googler
A person who uses Google's features very efficiently, often relying on the "I'm feeling lucky" button when searching; a fan of Google. 'Googler' is sometimes also used for "Expert Online Searcher". Also, a company term for a full-time Google employee.
Nigritude ultramarine, SERPs, Seraphim Proudleduck, Mangeur de cigogne
SEO competitions
Googledork
A person who accidentally exposes information to the web by placing it into a location spidered by Google.
Google-proof
A search phrase that delivers exactly the intended result when searched with Google.
Sandbox Effect
The name given to the phenomenon in which Google filters (from its results) websites created after March 2004.
Google bomb
An attempt to influence the ranking of a given site in results returned by the Google search engine. Also known as Google wash.
Googlewhack
A search using two dictionary-valid (underlined by Google) words that results in only one hit.

Google games

  • In Gwigle, you learn advanced Google search tricks as you go through the puzzles.
  • In Googlewhack you attempt to find two words that produce exactly one search result.
  • In Google Talk (not to be confused with Google Talk, Google's VoIP/IM service), Google searches are used to complete the beginning of a sentence with words, leading to amusing or interesting results.
  • In Googlefight, you pit two keywords against each other to find which one has more results.
  • In Guess The Google, you attempt to guess which search term resulted in the displayed images.
  • In Toogle, you can search images with the text of the search item making up the image. "The most comprehensive image buggery on the web"

See also

References

Further reading

  • Google Hacks from O'Reilly is a book containing tips about using Google effectively. Now in its second edition. ISBN 0596008570
  • Google: The Missing Manual by Sarah Milstein and Rael Dornfest (O'Reilly, 2004). ISBN 0596006136
  • How to Do Everything with Google by Fritz Schneider, Nancy Blachman, and Eric Fredricksen (McGraw-Hill Osborne Media, 2003). ISBN 0072231742
  • Google Power by Chris Sherman (McGraw-Hill Osborne Media, 2005). ISBN 0072257873

Major Competitors

External links