From my Web Terminology
page, the definition of a search engine is: 'A search engine
archives the contents of web sites into a searchable database.
By entering a suitable keyword(s) or phrase into the search
engine, you can obtain a list of pages that contain the entered
word(s)'.
Search engines send out software robots, or spiders, to roam
around the Web. They note the words on each page and add them
to a searchable, indexed database.
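As a rough illustration of what "a searchable indexed database" means, here is a minimal sketch in Python. It is not how any particular engine works. The page names and page text are made up for the example, and the function name build_index is my own.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(address)
    return index

# Hypothetical pages, standing in for what a robot might have collected.
pages = {
    "page1.html": "A Smalltalk tutorial for M206 students",
    "page2.html": "An HTML tutorial for web design",
}

index = build_index(pages)
print(sorted(index["tutorial"]))   # pages that mention "tutorial"
print(sorted(index["smalltalk"]))  # pages that mention "smalltalk"
```

Looking up a keyword is then just a matter of reading the entry for that word, which is why a search over a very large database can still be quick.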
An alternative to the search engine described above is the directory,
where sites are indexed into a database of searchable categories
and subcategories.
As a webmaster you should be interested in 4 aspects of
search engines. Steps 3 and 4 are covered later in the course in
Tutorial 14, Site Promotion.
How to use a search engine to search efficiently for
information. This means learning some advanced search techniques,
which are very easy to do.
Integrating a search engine into your own site.
How to submit your site to engines.
Then, once listed, how to promote your site
by getting it ranked higher up that list.
Keeping an eye on sites similar to your own.
Each engine uses different methods to put together
its database of web pages. Not all sites are included, and
a site that appears in more than one search engine may have a
good ranking (position in the returned results) in one search
engine and a very poor ranking in others.
Therefore you should not be surprised if an
identical search on different engines returns vastly different
pages.
Once you have mastered "How to use a Search engine"
in the next section and "Advanced Searches"
in the following section, you should try a few searches using
identical, carefully selected keyword(s) or phrases in several
of the engines, and in the sites that combine the results of several
engines. These are listed below.
You should experiment with both simple and advanced searches.
See which engines return:
The most suitable sites.
The highest percentage of suitable sites.
The clearest site previews.
A speedy result.
Give a mark to each search engine. You will naturally use the
highest-marked engine the most.
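If you like to keep a tally, the marking can be sketched as below. The engine names and the marks are entirely hypothetical. They are there only to show the arithmetic, and you should substitute the marks from your own searches.

```python
# Hypothetical marks out of 5 for each of the four criteria listed above:
# (suitable sites, percentage suitable, clear preview, speed).
marks = {
    "Engine A": (4, 3, 5, 4),
    "Engine B": (3, 4, 2, 5),
    "Engine C": (5, 4, 4, 4),
}

def rank_engines(marks):
    """Return (engine, total) pairs with the highest total first."""
    totals = {name: sum(scores) for name, scores in marks.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

for name, total in rank_engines(marks):
    print(f"{name}: {total}")
```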
This is my favorite search
engine. It has over a billion pages indexed in its
database. Many pages are cached and can be retrieved even if
the site closes down.
This engine is different from
the others: it gives a star rating. It also lists the
position of the site in the following engines: AOL, EntireWeb,
Espotting, Go, LookSmart UK, Lycos UK, Mirago, Overture
UK, UKPlus, WISEnut.
There are a few programs available, from magazine
cover disks or as downloads from the web, that you install and run
from your hard drive. These prompt you for your search query,
check several large engines and then provide you with the
combined results.
Copernic
Copernic is regarded as the turbo charged web searching tool.
Downloaded from XXXXXX___________________________________XX
It queries multiple engines.
Customise options.
Validation of results to remove dead links.
Auto downloading of result pages, if required.
Skins
A very good search tool.
Free.
New users should look at the legend window (from 'Window',
'Legend') to make sense of the symbols against each result.
How to use a Search engine
To search for information, you enter one or more words or phrases
into the input box of the search engine. Phrases
are enclosed in "double quote marks".
These words are called keywords.
Exercise 1.
Step 1 If you are not on line, go on
line now. Normally you would open what I believe is the best
search engine available www.google.com, but that is not
required because for this exercise I have brought the search
engine to you, see below.
Step 2 Assumption.
Assume that you are studying the Open University course
M206 Computing: An
Object-oriented Approach, and
wish to find a tutorial on Smalltalk programming,
one of the subjects on this course, that is relevant
to the M206 course.
All the words in bold text in the
above sentence are possible suitable words or phrases to search
for. In Jan 2002 I searched for some of the above and the number
of returned matches are shown in the table below.
Step 3 The secret of successful searches
is the careful selection and combination of your keywords. From
what I have said in the assumption:
Assume that you are studying the Open University course
M206 Computing: An
Object-oriented Approach, and
wish to find a tutorial on Smalltalk programming,
one of the subjects on this course, that is relevant
to the M206 course.
Pick out the 2 main keywords.
Obviously Smalltalk is the main one and, because the assumption
said relevant to the M206 course,
I would say M206 is the other, with tutorial being a strong contender.
Step 4 For the rest of this exercise you are
going to use the search engine that I have installed below.
Ensure that the Search WWW option is selected.
Enter the following words into the input box: Smalltalk M206
then click the Google Search button.
This search engine will look through its database
for pages that have BOTH these words on the page.
Note that most other engines will present pages that have EITHER
of the 2 words on a page.
To find pages that have BOTH words in them, these
other engines would require you to enter either of the following:
+Smalltalk +M206
Smalltalk AND M206
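The difference between BOTH and EITHER can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of any real engine. The page texts are invented, and the function names are my own.

```python
def words_of(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())

def match_all(query, text):
    """True when every query word is on the page (BOTH / AND behaviour)."""
    return words_of(query) <= words_of(text)

def match_any(query, text):
    """True when at least one query word is on the page (EITHER / OR behaviour)."""
    return bool(words_of(query) & words_of(text))

# Hypothetical page texts, for illustration only.
pages = {
    "a": "Smalltalk tutorial for the M206 course",
    "b": "Smalltalk history and design",
    "c": "M206 exam dates",
    "d": "HTML tutorial",
}

query = "Smalltalk M206"
both = [name for name, text in pages.items() if match_all(query, text)]
either = [name for name, text in pages.items() if match_any(query, text)]
print(both)    # ['a']
print(either)  # ['a', 'b', 'c']
```

Notice how the EITHER search returns three pages where the BOTH search returns one. On the real Web the difference is far larger.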
Step 5 Check the results
Look at the result page produced. Near the top of the list, look
for the number of pages found; it was 457 when
I tried it.
Note how the search words you typed in are displayed in bold
within the extracts from different sections of the page.
Read a few of the entries and see how near they are to our
requirements.
Look at the top, or near the top of the list for my M206 site
with an entry similar to the following.
Clicking on either of the 2 links in the Google entry,
or in the above paragraph, will take you to the Smalltalk and
M206 site.
Webmaster tip:
Just under the entry, look for the word Cached and click it.
This is the page as cached by Google the last time its
robots visited the site. If you update your site frequently and
place the date of the last update on your homepage, the cached copy
gives you an indication of when the robot last visited your site
and the engine's database was updated. This does not work for many of
my sites because the date is generated by JavaScript, so the cached
page shows today's date instead.
You may find this useful when you get your site listed on
the search engines.
Step 1 Add the word tutorial
to the other 2 words and try again.
Step 2 Note how the search results
have now been cut down in number. By carefully selecting
your search criteria you should be able to obtain the best possible
sites for your purpose.
Many engines have an Advanced Search page, where you can enter additional
query details that should improve your chances of returning relevant sites.
Boolean Searches ( + - AND OR NEAR)
Phrases
Phrases should always be placed within a pair of quotation marks, e.g.
"turbocharged engine"
The engine will look for an exact match for the phrase, finding
turbocharged engine but not turbocharged car engine. The useful search query here is
turbocharged NEAR engine
which looks for the 2 words in close proximity.
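To see why the phrase search misses turbocharged car engine while NEAR finds it, consider this small sketch. It is an assumption-laden illustration. The function names are mine, and the max_gap value of 3 words is an arbitrary choice, since each engine decides for itself how close "near" is.

```python
def tokens(text):
    """Split text into a list of lowercase words, keeping their order."""
    return text.lower().split()

def has_phrase(phrase, text):
    """True when the words of the phrase appear next to each other, in order."""
    p, t = tokens(phrase), tokens(text)
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))

def near(word_a, word_b, text, max_gap=3):
    """True when the two words occur within max_gap positions of each other.
    max_gap is an assumed value; real engines choose their own distance."""
    t = tokens(text)
    pos_a = [i for i, w in enumerate(t) if w == word_a.lower()]
    pos_b = [i for i, w in enumerate(t) if w == word_b.lower()]
    return any(abs(a - b) <= max_gap for a in pos_a for b in pos_b)

text = "a turbocharged car engine"
print(has_phrase("turbocharged engine", text))   # False: "car" is in the way
print(near("turbocharged", "engine", text))      # True: only 2 words apart
```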
If you enter a search query
T171 HTML tutorial
into the Google search engine you will obtain results that contain ALL
three of the keywords. Enter the same query into most other engines and you
will obtain results that contain ANY of the 3 keywords. In these engines
you would need to use Boolean search techniques to obtain sites that contain
ALL the keywords, as follows:
T171 AND HTML AND tutorial
or you could use the following alternative method
+T171 +HTML +tutorial
Many words have alternative meanings or abbreviations, and you
may wish to accept any of the alternatives. The Boolean word to
use then is OR, e.g.
+T171 +HTML AND tutorial OR lesson OR Instruction
"Member of Parliament" OR MP
Try the above examples in both Google and other engines.
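If you build queries often, the three forms shown above can be produced mechanically. The sketch below only assembles the query text. It does not contact any search engine, and the helper names are my own invention. Whether a given engine accepts a particular form is something you should check in that engine's help pages.

```python
def quote(term):
    """Wrap a multi-word phrase in double quotes; leave single words alone."""
    return f'"{term}"' if " " in term else term

def all_of_plus(terms):
    """Build the +word form for engines that need it to require ALL words."""
    return " ".join("+" + quote(t) for t in terms)

def all_of_and(terms):
    """Build the AND form for engines that need it to require ALL words."""
    return " AND ".join(quote(t) for t in terms)

def any_of(terms):
    """Build the OR form, accepting any one of the alternatives."""
    return " OR ".join(quote(t) for t in terms)

print(all_of_plus(["T171", "HTML", "tutorial"]))  # +T171 +HTML +tutorial
print(all_of_and(["T171", "HTML", "tutorial"]))   # T171 AND HTML AND tutorial
print(any_of(["Member of Parliament", "MP"]))     # "Member of Parliament" OR MP
```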
As you use the different engines, notice which ones you prefer. You should
do this each time you use an engine, with the aim of finding the one that
suits you. The following questions may help you in making this decision.
How good are the top ranking results at answering your search criteria?
Is the short intro to the site helpful?
Does the engine highlight the keyword(s) or key phrase(s)?
Does the engine supply a lot of unhelpful "paid for sponsorship"
sites at the top of the list?
These Boolean expressions are used to reduce the number of results that
you obtain, by getting rid of many of the unwanted sites. For this exercise
you are going to try to find all of the sites that I have written under
the name of John McGuinn, see table below. Your search will also bring
up some other sites that link to my sites and have used my name.
I have used my name for this exercise because it is not very common,
and it is not a high ranking keyword on my sites. This should therefore
give you a good insight into this very useful technique.
The 3 sites of mine are:
Site                                       Some of the Main Keywords
http://www.tutorials4u.com/html/           T171, T170, HTML, tutorial, Web design
or http://www.tutorials4u.com/
http://members.aol.com/M206ou/m206/        M206, Smalltalk, tutorial
http://members.aol.com/freetutorials/c/    T223, C, tutorial
There are more John McGuinns in the world, and it is your task to remove
them from the search results by using the - or NOT Boolean
expressions.
Step 1. For this exercise use Google. Search
for "John McGuinn" and note the number of results.
The key to reducing the number of results is to try to spot keywords
in the other John McGuinns' sites that are not relevant to my sites as
given in the table above. I do not want to give too many clues to doing
this exercise. But let's assume that entertainer crops up
in a lot of the results. This is certainly not relevant to my sites,
but you should look for a better keyword or keywords to use.
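The effect of excluding a word can be sketched as follows. The page texts are invented purely for the illustration, entertainer is the assumed keyword from the paragraph above, and the search function is my own simplified stand-in for what an engine does with - or NOT.

```python
def search(pages, phrase, exclude=()):
    """Return the pages that contain the phrase but none of the excluded words."""
    results = []
    for name, text in pages.items():
        lowered = text.lower()
        if phrase.lower() not in lowered:
            continue  # the phrase is missing, so the page is not a match
        words = set(lowered.split())
        if any(word.lower() in words for word in exclude):
            continue  # an unwanted keyword is present, so drop the page
        results.append(name)
    return results

# Hypothetical page texts, for illustration only.
pages = {
    "site1": "HTML tutorial by John McGuinn",
    "site2": "John McGuinn entertainer tour dates",
    "site3": "Smalltalk tutorial written by John McGuinn",
}

print(search(pages, "John McGuinn"))
print(search(pages, "John McGuinn", exclude=["entertainer"]))
```

The first search returns all three pages. The second drops the page that mentions entertainer, leaving only the two tutorial pages.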
Stemming is a process of finding additional words with different endings,
e.g. if you typed in help as your search query, stemming would find help,
helping, helped, helper etc. Some engines use stemming automatically,
but the same result can be achieved by the use of wildcards in
the non-stemming engines.
Wildcards can be used to include:
Word plurals.
Stemmed words.
Words you are not sure how to spell.
The wildcard characters are * and ?, and these normally stand for a single
character or group of characters, and a single character, respectively. Not all search
engines work the same way, and some engines reverse these uses. A quick
look in the engine's help section or a quick search experiment should verify
the situation.
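You can get a feel for the usual meaning of * and ? without going on line. Python's standard fnmatch module happens to follow the same convention (* for any group of characters, including none, and ? for exactly one character), so the sketch below uses it as a stand-in. The word list is made up, and a real engine may behave differently.

```python
from fnmatch import fnmatchcase

# A made-up word list, standing in for the words in an engine's database.
words = ["help", "helps", "helped", "helper", "helping", "held"]

def wildcard_matches(pattern, words):
    """Return the words that match a pattern containing * or ? wildcards."""
    return [w for w in words if fnmatchcase(w, pattern)]

print(wildcard_matches("help*", words))  # help plus every longer form
print(wildcard_matches("help?", words))  # only helps (exactly one extra letter)
print(wildcard_matches("hel?", words))   # help and held
```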
Exercise 5.
Try the following and note what words are brought up for the keywords.
More Tutorials
Home page of a tutorial on programming in Smalltalk, an object-oriented programming language.
This is an ideal tutorial for anybody learning Smalltalk and of particular interest
to students on courses M206 at the OU (Open University) and CSC517 at NCSU (North
Carolina State University).
Home page of a tutorial on programming in C. This is an ideal tutorial for anybody
learning the C programming language, and of particular interest to students on course
T223 at the OU (Open University).