
Search Engine Help

3/1/04: Important changes

In order to be more consistent with the way major search engines work, MHC's search engine now assumes, unless you tell it otherwise, that all words you type must appear on the page in order for it to be returned in the search results. Think of this as an implicit "AND" of the words you type; previously, the search engine used an implicit "OR".

As a result of this change, it is no longer necessary to use the + symbol to imply "AND". Instead, you will have to use the word OR (in uppercase) whenever you want to specify that the word appearing after OR is optional. The word OR can also be abbreviated with a single vertical bar | character. See below for examples.
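The difference between the old and new behavior can be sketched in a few lines of Python. This is only an illustration: the pages dictionary and both function names are made up for the example, and the sketch compares whole words rather than word beginnings to keep it short.

```python
# Hypothetical mini-site: page path -> set of words found on that page.
pages = {
    "/dining.html": {"food", "menu", "potato"},
    "/news.html": {"food", "fight"},
    "/garden.html": {"potato", "soil"},
}

def search_implicit_or(words):
    """Old behavior: a page matches if it contains ANY of the words."""
    return sorted(path for path, content in pages.items()
                  if any(w in content for w in words))

def search_implicit_and(words):
    """New behavior: a page matches only if it contains ALL of the words."""
    return sorted(path for path, content in pages.items()
                  if all(w in content for w in words))

print(search_implicit_or(["food", "potato"]))
# ['/dining.html', '/garden.html', '/news.html']
print(search_implicit_and(["food", "potato"]))
# ['/dining.html']
```

The same two words return three pages under the old rule and only one under the new rule, which is why typing OR is now required to get the broader result.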


How the search engine works

The search engine consists of two main parts: a robot, and the search program that is activated by one of these pages. The robot starts late at night, when usage is low, and follows the links in each page on the system. Eventually, it covers every one of the many thousands of HTML files that are linked, directly or indirectly, to the main index.shtml page. It intentionally excludes pages that are part of a user's personal Web space rather than the main system. This method is very similar to that used by Web-wide search engines such as Google and Alta Vista.

From this process the robot generates a database of all the words contained in all of the pages. The site map is a second database that gets generated by a similar robot.
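As a rough sketch of what such a robot does, the Python below walks links from a start page and records which pages contain which words. Everything in it is an assumption made for illustration: the SITE dictionary stands in for real pages, the "/~" prefix is a guess at how personal Web space might be recognized, and the real robot's internals are not described here.

```python
import re
from collections import deque

# Hypothetical site: path -> (page text, outgoing links). A real robot
# would fetch and parse HTML; this stand-in keeps the sketch runnable.
SITE = {
    "/index.shtml": ("Welcome to the college", ["/dining.html", "/~jones/home.html"]),
    "/dining.html": ("Food and potato menu", ["/index.shtml", "/news.html"]),
    "/news.html": ("Food fight news", []),
    "/~jones/home.html": ("Personal food blog", []),
    "/orphan.html": ("Unlinked potato page", []),
}

def is_personal(path):
    # Assumption: personal Web space lives under "/~user/".
    return path.startswith("/~")

def crawl(start="/index.shtml"):
    """Follow links breadth-first from start; return word -> set of pages."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        text, links = SITE[path]
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(path)
        for link in links:
            if link in SITE and link not in seen and not is_personal(link):
                seen.add(link)
                queue.append(link)
    return index

index = crawl()
print(sorted(index["food"]))    # ['/dining.html', '/news.html']
print(sorted(index["potato"]))  # ['/dining.html']
```

Note that the personal page and the page nothing links to never enter the database, so neither can appear in search results.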

When you enter a search request, the search engine script looks in these two databases for all of the pages containing the words or phrases you asked for.

Because these two robots only operate once per day, any pages added to the system since the robots last ran will not appear in search results.


Simple searches

The most basic search is done by entering one or more words, separated by spaces. The search engine will give you a list of pages in which every word you entered is matched by at least one word beginning with those letters. For instance, if you enter food potato, the request would match a page having the words food and potato, or one with foodstuff and potatoes, somewhere in it.

Note that, in the case of entries from the site map, only the pages' titles are considered. This type of search is best when you think that the term you are searching for is contained in one of the major pages on the site. It helps to eliminate pages that may only deal with a topic in passing.

In order to prevent the search engine from considering words that only start with a word you are looking for, you can enclose the word in double-quotes. If you enter "food", only pages containing that exact word will match; pages with foodstuff will not.

If you want to limit the search to an exact phrase, you can enter it in double-quotes as well.
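The three kinds of matching described above (word beginnings, exact words, and exact phrases) can be sketched as follows. The function names are hypothetical, and this is a simplified model rather than the search engine's actual code.

```python
import re

def words_of(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_prefix(term, text):
    """Unquoted term: matches any word that begins with the term."""
    return any(w.startswith(term.lower()) for w in words_of(text))

def matches_exact(term, text):
    """Quoted single word: matches only that exact word."""
    return term.lower() in words_of(text)

def matches_phrase(phrase, text):
    """Quoted phrase: the words must appear next to each other, in order."""
    target = words_of(phrase)
    words = words_of(text)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

page = "Local foodstuff and potatoes"
print(matches_prefix("food", page))                   # True
print(matches_exact("food", page))                    # False
print(matches_phrase("mary lyon", "About Mary Lyon")) # True
print(matches_phrase("mary lyon", "Lyon, Mary"))      # False
```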

A couple of notes about what you enter:

  • Case does not matter. Asking for jones or JOnes will match Jones just fine.
  • All punctuation is ignored, so entering Mr. Jones is the same as entering Mr Jones.
  • Some words (like a, the, and, to, etc.) are so common they are always ignored, except when in double-quotes. The search engine will give you an error message if you have not entered any "unique" keywords for your search.
  • Even though most of the examples here use one or two words, you can actually use any number of them.
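The notes above amount to a small clean-up step applied to whatever you type. Here is a minimal sketch of that step, assuming a stop-word list made up only of the examples given (the real list is not shown on this page) and leaving out the special handling of double-quoted words.

```python
import re

# Assumption: a tiny stop-word list built from the examples in the text.
STOP_WORDS = {"a", "the", "and", "to"}

def keywords(query):
    """Return the searchable keywords in an unquoted query.

    Case is folded, punctuation is dropped, and stop words are removed.
    Raises ValueError when nothing searchable remains.
    """
    words = re.findall(r"[a-z0-9]+", query.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    if not kept:
        raise ValueError("no unique keywords in query")
    return kept

print(keywords("Mr. Jones"))                # ['mr', 'jones']
print(keywords("JOnes"))                    # ['jones']
print(keywords("the food and the potato"))  # ['food', 'potato']
```

A query such as to the leaves nothing behind after clean-up, which corresponds to the error message mentioned above.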


Search results

The results of a search are organized into two sections, each of which contains three columns:


Matches in the Site Map

Location within site map hierarchy
Match Gauge   Page Title                Link

Library, Information & Technology Services : MHC Archives and Special Collections : rare2.htm
100%          Dante Catalog Home Page   /lits/library/arch/dante.htm

Library, Information & Technology Services : MHC Archives and Special Collections : rare2.htm : Dante Catalog Home Page
100%          Dante Catalog             /lits/library/arch/dante.gia.htm
100%          Dante Illustrators        /lits/library/arch/danteill.htm


Matches in Pages

Match Gauge   Page Title                              Link
100%          Descriptions of images in the Inferno   /lits/library/arch/danteimgrt.html
64%           Dante Catalog                           /lits/library/arch/dantegia.htm
63%           Dante Catalog                           /lits/library/arch/test/dante.gia.htm

The first section shows matches from titles of entries in the site map, if this option is enabled. The list is organized by the pages' position in the site map hierarchy and by their relative scores. Above each grouping appears a "path" which details how each page fits into the hierarchy.

The first column in each section is a gauge which indicates how well the particular page matches your query. Since the results are sorted with the best matches first, the first entry will always be a completely red bar (text-based browsers show this as 100%). Other matches are expressed as a percentage of the best match.

The search engine calculates the worth of a match based on the number of times the term appears on the page and where the term is located. For instance, a term contained in a page's title or in a large text header is considered to be more important than one in the body of a page.
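To make the gauge concrete, here is one way such a score could be computed. The weights are invented for illustration: the only thing stated above is that a title or large header counts for more than body text, and the actual formula is not given. For brevity the sketch counts exact words only.

```python
# Assumption: illustrative weights, not the engine's real values.
WEIGHTS = {"title": 5, "header": 3, "body": 1}

def raw_score(term, page):
    """Sum weighted occurrence counts of term across the page's sections."""
    term = term.lower()
    return sum(WEIGHTS[section] * text.lower().split().count(term)
               for section, text in page.items())

def gauges(term, pages):
    """Return (percent, path) pairs, best first, scaled so the best is 100."""
    scores = {path: raw_score(term, page) for path, page in pages.items()}
    scores = {p: s for p, s in scores.items() if s > 0}
    if not scores:
        return []
    best = max(scores.values())
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(round(100 * s / best), p) for p, s in ranked]

# Hypothetical pages, each split into the three sections being weighted.
pages = {
    "/a.html": {"title": "Dante catalog", "header": "Dante", "body": "about dante"},
    "/b.html": {"title": "Archives", "header": "", "body": "dante dante dante"},
    "/c.html": {"title": "Dining", "header": "", "body": "food"},
}
print(gauges("dante", pages))
# [(100, '/a.html'), (33, '/b.html')]
```

Page /a.html mentions the term three times and so does /b.html, but the mentions in the title and header push /a.html to the top, and /b.html is shown as a percentage of that best score.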

The page's title, if any, is taken from the HTML <TITLE> tag. The link gives the full path of the matching page, and provides you with a link you can click on to go there.

Below the matches, a number of statistics are given, for example:

270 matches total, first 100 available, 1-15 shown.
FOOD=1351 FIGHT=1359

[ Next 15 matches ]

This display means that the word FOOD was found on 1351 pages, and the word FIGHT was found on 1359. The total number of pages containing both words is 270. In order to conserve system resources, the search engine will only show you the first 100 matches, so this line informs you that the first 15 are now being shown. To go to the next 15 best matches, click on the link provided.

The search engine normally shows 15 matches at a time. The Advanced Search page allows you to change this.
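The numbers in the statistics line follow from simple arithmetic, sketched below. The limits of 100 and 15 come from the description above; the function names are hypothetical, and the wording produced when there are fewer than 100 matches is a guess.

```python
MAX_AVAILABLE = 100   # at most the first 100 matches are made available
PAGE_SIZE = 15        # default number of matches shown at a time

def status_line(total, start=1, page_size=PAGE_SIZE):
    """Build the summary line for the page of results beginning at start."""
    available = min(total, MAX_AVAILABLE)
    end = min(start + page_size - 1, available)
    return f"{total} matches total, first {available} available, {start}-{end} shown."

def page_starts(total, page_size=PAGE_SIZE):
    """1-based starting position of each page of results."""
    available = min(total, MAX_AVAILABLE)
    return list(range(1, available + 1, page_size))

print(status_line(270))
# 270 matches total, first 100 available, 1-15 shown.
print(page_starts(270))
# [1, 16, 31, 46, 61, 76, 91]
print(status_line(270, start=91))
# 270 matches total, first 100 available, 91-100 shown.
```

With the default of 15 per page, the 100 available matches spread over seven pages, the last of which holds only ten.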


Advanced searches

There may be times when you want to look for pages that match either one word or a second. For this case, use the format word1 OR word2. For example, entering food OR potato will match pages which contain either of these words. It is important to type the word OR in uppercase.

By using a - you can exclude a word. Entering food -potato will match pages which contain food, but not those which contain both food and potato.

Using the modifier url: you can force the search to only match those pages whose location (URL) begins with a certain path. url: should be followed by the absolute path of interest, without http://www.mtholyoke.edu at the beginning. A search for food url:/offices/comm/csj looks for pages with the word food in issues of the College Street Journal.
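The three modifiers described in this section (OR, -, and url:) can be combined in one query. The sketch below shows one plausible way to interpret them; it is not the search engine's real code, the PAGES dictionary is invented, and quoted words and stop words are left out to keep it short.

```python
import re

def parse(query):
    """Split a query into required OR-groups, excluded terms, and a URL prefix."""
    groups, excluded, url_prefix = [], [], None
    join_next = False
    for token in query.split():
        if token in ("OR", "|"):          # uppercase OR or a vertical bar
            join_next = True
            continue
        if token.startswith("url:"):
            url_prefix = token[len("url:"):]
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        elif join_next and groups:
            groups[-1].append(token.lower())   # alternative to previous word
        else:
            groups.append([token.lower()])     # a new required word
        join_next = False
    return groups, excluded, url_prefix

def has_prefix(term, words):
    return any(w.startswith(term) for w in words)

def search(query, pages):
    """Return the paths of pages that satisfy the query, sorted by path."""
    groups, excluded, url_prefix = parse(query)
    hits = []
    for path, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        if url_prefix and not path.startswith(url_prefix):
            continue
        if any(has_prefix(t, words) for t in excluded):
            continue
        if all(any(has_prefix(t, words) for t in g) for g in groups):
            hits.append(path)
    return sorted(hits)

# Hypothetical pages: path -> text.
PAGES = {
    "/offices/comm/csj/issue1.html": "Food drive announced",
    "/dining/menu.html": "Food and potato bar",
    "/garden/crops.html": "Potatoes planted",
}
print(search("food OR potato", PAGES))
# ['/dining/menu.html', '/garden/crops.html', '/offices/comm/csj/issue1.html']
print(search("food -potato", PAGES))
# ['/offices/comm/csj/issue1.html']
print(search("food url:/offices/comm/csj", PAGES))
# ['/offices/comm/csj/issue1.html']
```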

There is also a separate Advanced Search page which allows you to choose from a list of locations on this site and to set the number of matches per page in the result.


Summary

  • Searches are not sensitive to capitalization. Punctuation is ignored.
  • word1 word2 Matches documents containing words beginning with both word1 and word2
  • "word" Matches documents containing the word exactly
  • "word1 word2" Matches documents containing the exact phrase
  • word1 OR word2 Matches documents containing a word beginning with word1 or word2
  • word1 -word2 Matches documents containing a word beginning with word1 and no words beginning with word2
  • word1 url:word2 Matches documents containing a word beginning with word1, but only when their URLs start with word2
Examples:
   mary lyon      Matches Mary Lyon; Lyons of Maryland; or Lyon, Mary
   "mary"         Matches Mary Lyon or Mary Smith
   "mary lyon"    Matches Mary Lyon
   mary OR lyon   Matches Maryland; Lyon, France; or Mary Smith
   "mary" -lyon   Matches Mary, but not on pages containing Lyon
   map url:/adm   Matches pages in the Admission Center which talk about the campus map




Copyright © 2004 Mount Holyoke College. This page created by the Web Design Team and maintained by Webmaster. Last modified on March 1, 2004.