Search Engine Optimisation - Gconnect - The Business ISP
Introduction to Search Engine Optimisation
Search engine
optimisation is the process of constructing (or
reconstructing) a web page so that it retains or
improves its usability for the user while
becoming a little more usable for a search
engine robot. The content and linking of web
sites are discussed in other pages; this page
covers the technical side, from a web page
design and build point of view.
Search Engine Spiders or Robots
As discussed in the
Search Engine Submission page, the web is
crawled by spiders or robots from the major
search engines. These robots gather data and
report back to base, where it is compiled into
an index. The index is what allows users to
search the web. Note that when you type
‘business isp’ into the search box on
Google.com, it does not search the web; it
searches the index that Google built in its last
update. Update frequency differs from engine to
engine, but Google updates about once a month.
As search engine optimisation practitioners, we
need to create sites that Googlebot (or other
robots) can navigate easily and will return to
frequently.
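To make the difference between searching the web and searching an index concrete, here is a minimal Python sketch. The page URLs and text are made up for illustration, and real engines are far more elaborate; the only point is that a query is answered from data gathered at the last crawl, not from the live pages.

```python
from collections import defaultdict

# Hypothetical pages a robot gathered on its last crawl (made-up URLs and text).
crawled_pages = {
    "http://www.example.com/": "the business isp for small companies",
    "http://www.example.com/seo": "search engine optimisation for business sites",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return URLs containing every word of the query, using only the index."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(crawled_pages)
print(sorted(search(index, "business isp")))
```

A page changed or added after the crawl would not appear in the results until the index is rebuilt, which is why a site that robots revisit often has an advantage.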
About HTML and Search Engine Optimisation
Nearly all web pages are made up of HTML;
even ASP or PHP pages deliver their content in
plain old HTML. HTML is made up of two main
parts: tags and content. The content is the
text plus links to the graphics, and that’s
about it. The tags are everything else: they
control the look of the website and may contain
JavaScript and other data.
Imagine that you are given a report with 1000
words on it. It starts with a title and has
subtitles and content. You remember the title,
then the subtitles and then the text. The
document is set out so as to prioritise the
important parts. It is also fair to say that we
remember the first part better than the
remaining parts. For example, we all know
the phrase:
“It was the best of times, it was the worst of
times,”
but do you remember:
“it was the age of wisdom, it was the age of
foolishness, it was the epoch of belief, it was
the epoch of incredulity, it was the season of
Light, it was the season of Darkness, it was the
spring of hope, it was the winter of despair, we
had everything before us, we had nothing before
us, we were all going direct to Heaven, we were
all going direct the other way- in short, the
period was so far like the present period, that
some of its noisiest authorities insisted on its
being received, for good or for evil, in the
superlative degree of comparison only.” (A Tale
of Two Cities)
Probably not, and a search engine is the same.
Although it will remember all of the text, it
gives greater priority to the content at the
beginning. This brings the thread round to the
importance of HTML structure.
For the purposes of the argument I have created
two basic web pages from the quotation above.
The first was made using the Microsoft Word
"Save as HTML" command, which produced a file of
2.27KB. The second was created in Microsoft
FrontPage with all the extra tags stripped out,
and its file size was 801 bytes (0.8KB). All of
the extra lines of HTML in the Word file came
before the actual content. Word therefore makes
a very inefficient HTML editor, and the part of
the file that Googlebot will prioritise is all
useless. Search engine optimisation is about
correcting this sort of problem.
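One rough way to see this problem in your own pages is to measure how much of the file comes before the first words of real content. The Python sketch below does that for two made-up pages; they are illustrations only, not the Word and FrontPage files measured above, and the function name is my own.

```python
def bytes_before_content(html, first_words):
    """Return how many bytes of the file come before the first visible words."""
    data = html.encode("utf-8")
    position = data.find(first_words.encode("utf-8"))
    if position == -1:
        raise ValueError("content not found in page")
    return position


# Two made-up pages carrying the same content (hypothetical markup).
bloated = (
    "<html><head><style>p.MsoNormal{margin:0cm;font-size:12pt}</style>"
    "<meta name=Generator content=\"Some Word Processor\"></head>"
    "<body><p class=MsoNormal>It was the best of times</p></body></html>"
)
lean = "<html><body><p>It was the best of times</p></body></html>"

for name, page in (("bloated", bloated), ("lean", lean)):
    offset = bytes_before_content(page, "It was the best")
    total = len(page.encode("utf-8"))
    print(name, total, "bytes in total,", offset, "before the content")
```

The smaller the offset, the sooner a robot reaches the text you actually want it to read.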
Spiders follow links round a web site. If you
have a really nice Flash menu that makes great
sound effects and nice visual morphing effects,
the spider will usually be unable to read the
links inside it and will only go as far as the
first page. This sort of navigation has to be
altered to enhance the performance of the site.
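As a rough illustration of why this matters, the Python sketch below collects links the way a very simple spider might, by reading the href of each anchor tag in the HTML. Both sample pages and their file names are made up, and real robots are more sophisticated than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, as a simple spider would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_links(html):
    """Return the list of link targets found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Made-up pages: one with a Flash-only menu, one with plain text links.
flash_menu_page = (
    '<body><object data="menu.swf" '
    'type="application/x-shockwave-flash"></object></body>'
)
text_menu_page = (
    '<body><a href="products.html">Products</a> '
    '<a href="contact.html">Contact</a></body>'
)

print(find_links(flash_menu_page))
print(find_links(text_menu_page))
```

The Flash page yields no links at all, so a spider like this has nowhere to go after the first page, while the plain text menu gives it two pages to visit next.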
Titles, Description and Keywords
If you don’t already know,
these are what we call meta tags. In the early
days of the net, these were more important than
they are now. It took no time at all for
webmasters to realise that typing "Pamela
Anderson" into the keywords or description
brought great results. The search engines have
since wised up to this. In order of priority,
you should put a sensible and page-relevant
title into each page and then a short
description. Keywords are optional, but use no
more than 5 or 6. Most engines just ignore the
keywords tag now.
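The advice above can be turned into a quick check. The Python sketch below reads a page's title, description and keywords and lists anything that looks wrong. The class and function names, the sample page and the default limit of 6 keywords are my own choices for illustration, following the guideline above.

```python
from html.parser import HTMLParser


class HeadInfo(HTMLParser):
    """Pick out the title plus the named meta tags (description, keywords)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_head(html, max_keywords=6):
    """Return a list of warnings about the title, description and keywords."""
    info = HeadInfo()
    info.feed(html)
    warnings = []
    if not info.title.strip():
        warnings.append("missing title")
    if not info.meta.get("description", "").strip():
        warnings.append("missing description")
    keywords = [k for k in info.meta.get("keywords", "").split(",") if k.strip()]
    if len(keywords) > max_keywords:
        warnings.append("too many keywords")
    return warnings


# A made-up page that follows the advice above.
page = (
    "<html><head><title>Search Engine Optimisation</title>"
    '<meta name="description" content="The technical side of optimising a page.">'
    '<meta name="keywords" content="seo, search engines, web design">'
    "</head><body></body></html>"
)
print(check_head(page))
```

An empty list means the page has a title, a description and a sensible number of keywords; it says nothing about whether they are relevant to the page, which still needs a human eye.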
Frames
Some websites are constructed with frames.
Although this does not prevent indexing by
search engines, it does make the job a little
harder. The problem is not so much the frameset
itself, as most spiders can now follow an SRC
link; it is the structure of the child pages.
A child page normally has no outgoing links and
so does not encourage the travelling spider on
to the next page.
Databases and Search Engines
Database-driven sites have always been
controversial in the world of search engine
optimisation. However, web bots get cleverer
every day and will now follow links like
http://www.abc.com/products.asp?id=75 and
read the page that the server builds from
the database.
The type of URL shown here contains what is
called a query string. A query string contains
one or more key and value pairs. In this case
the key is id and the value is 75. The number of
key-value pairs should be kept to a minimum so
that a spider can follow them. Rumour has it
that two pairs is the maximum that a robot can
follow.
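To show what the key and value pairs look like in practice, here is a small Python sketch that splits a URL's query string and counts the pairs. The limit of two is only the rumour mentioned above, not a documented rule, and the second URL is a made-up example.

```python
from urllib.parse import parse_qsl, urlsplit


def query_pairs(url):
    """Return the key/value pairs in the URL's query string, in order."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


def within_limit(url, max_pairs=2):
    """True if the URL has no more pairs than the rumoured limit."""
    return len(query_pairs(url)) <= max_pairs


simple_url = "http://www.abc.com/products.asp?id=75"
# Hypothetical URL with three pairs, for comparison.
busy_url = "http://www.abc.com/products.asp?id=75&cat=3&sort=price"

print(query_pairs(simple_url), within_limit(simple_url))
print(query_pairs(busy_url), within_limit(busy_url))
```

The first URL carries a single pair, id and 75, and passes the check; the second carries three pairs and would be worth simplifying if the rumour is true.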
Spam
There are many ways to try and fool a search
engine, but the search engine companies employ
bigger and brainier people than the spammers,
and the penalty is a pretty hefty ban on your
domain. Some companies accept this as an
occupational hazard in exchange for short-term
results.
However, it doesn’t work in the long run. If
you are serious about using the web as a
business tool then don’t bother cheating.
Search engine optimisation takes a long time
and a lot of effort, and it’s not worth the
risk of throwing it all away for marginal
gain. For the record, here is a list of
activities that are classified as spam:
Invisible Text - using the same coloured text as the background to repeat the same word over and over again. Guess what? Search engines can detect it!
Invisible Text - on a coloured wallpaper background
Cloaking - changing the content of the page programmatically when a robot visits
Duplicate content - copying the same content to duplicate sites
Keyword stuffing - adding repetitive pointless text to HTML tags