Sure, there are multiple roads that lead to Rome, and likewise there are ample ways to drive traffic to your website. At the end of the day, however, nothing beats organic traffic (organic traffic consists of visitors who arrive at your website through a search engine).
Why?
Because it doesn’t cost you a penny!
Chances are the above fact is nothing new to you, yet most people fail to act on it from the get-go. They spend a lot of time and money building their website, but because they did not properly optimize it for the 2008 LSI era, their website and business struggle while they scratch their heads and ask themselves:
“Why did Google, Yahoo!, and MSN/Live not rank me high at all?”
To answer this question, let’s look at niches, take a quick trip down memory lane, and then see how LSI works.
Topics and Niches
As we all know, the Internet is huge and there are multiple websites on every imaginable topic.
Most topics have dozens, hundreds, thousands, or even millions of websites with some or all of their content related to that specific topic.
Note that a topic or niche can be:
- very broad - e.g. “health insurance”
- specific - e.g. “radio controlled cars”
- very specific - e.g. “radio controlled cars by kyosho”
- localized - e.g. “baby shower decorator kansas city”
The general rule of thumb is:
“the more specific a topic, the fewer the competitors.”
Likewise, a specific topic also means less traffic than a broad topic.
A specific topic with little competition and no mass-traffic is also called a “niche”.
Dictionary.com: niche - a distinct segment of a market.
Do note that niches can vary in size and a lot of niches are seasonal as well.
Whether your website has content on a broad or specific topic, you’re very likely to find fierce competition.
Unless you’re an established marketer with a lot of clout, you’ll want to stay away from overly broad topics as it will be extremely hard to rank on the first page.
This was very different only several years ago. “Back then” there wasn’t anything particularly hard about ranking high in the major search engines for any keyword of your choice.
Let’s find out why.
SEO Memory Lane
Search Engine Optimization (SEO) used to consist of identifying one or two keywords or keyphrases related to your topic plus possibly the main product (or the only one) you were selling on your website.
You would then take these 3 keywords or key phrases and litter them throughout your site in your title tag, meta tags, navigation menu, table summary tags, img alt tags, and most of all: in your actual content or articles.
The result was a website with a so-called high “keyword density” for those 3 keywords or keyphrases. This practice is also called “keyword stuffing”.
Keyword density, measured as a percentage, is the number of times a keyword or keyphrase appears compared to the total number of words on a page (or in an article).
In the context of search engine optimization, keyword density can be used as a factor in determining whether a web page is relevant to a specific keyword or keyphrase.
Because keyword density is so easy to manipulate, search engines usually implement other measures of relevancy to prevent unscrupulous webmasters from creating search spam through practices such as keyword stuffing.
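To make the measure concrete, here is a minimal Python sketch. It assumes density is simply the number of occurrences divided by the total word count, times 100, which is one common way to count. The function name keyword_density and the tokenizing rule are illustrative assumptions, not how any search engine actually computes it.

```python
import re

def keyword_density(text, phrase):
    """Return how often `phrase` occurs in `text`, as a percentage of all words.

    Assumption: occurrences / total words * 100. Other formulas exist
    (for example, weighting a multi-word phrase by its length).
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    if not words or not target:
        return 0.0
    size = len(target)
    # Slide a window over the words and count exact matches of the phrase.
    hits = sum(
        1 for i in range(len(words) - size + 1) if words[i:i + size] == target
    )
    return 100.0 * hits / len(words)

STUFFED = "The dog walked to the car. The dog was purple. The dog was a german shepherd."
NATURAL = "The purple german shepherd walked to the car."

print(keyword_density(STUFFED, "dog"))
print(keyword_density(NATURAL, "dog"))
```

The first text repeats “dog” three times in sixteen words, so its density for that keyword is far higher than that of the second text, which never uses the word at all.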
Let me illustrate with an example how keyword stuffing leads to junk (unnatural content) being indexed in a search engine:
“The dog walked to the car. The dog was purple. The dog was a german shepherd.”
This is keyword-stuffed text, optimized for the keyphrase: “purple german shepherd dog”
The following sentence would be much more natural:
“The purple german shepherd walked to the car.”
As you can see, this sentence is a natural sentence and not a stuffed and spammed sentence.
Furthermore, note how the word dog doesn’t even occur in the sentence.
It’s not necessary because an LSI algorithm will know that a “german shepherd” is most likely to mean a dog (as opposed to a person from Germany whose job is that of a sheep herder).
The above example illustrates the problem that ranking websites on keyword density created.
The resulting “spam sites” were sites that included lists of keywords simply to get ranked while the actual content on the site was not relevant at all to what the search engine user was looking for.
In due time, the results a search engine provided were becoming less relevant and eventually a complete waste of time for the search engine users.
As a result, Google (and when the big G does something, the other search engines eventually follow suit) started lowering the importance of keyword density (also called keyword weight) with regard to how high a website ranks.
Backlinks
The first improvement the search engines rolled out to combat spam sites was assigning a high value to the number of “backlinks” (back-links) a website has.
A backlink is an incoming link to a website from another website (also referred to as an “inbound external link”).
Generally, the more backlinks a website had, the higher it ranked.
After all, the “keyword-stuffed” websites and/or articles that unscrupulous webmasters used to put online were not very pleasant to read for humans (whereas a search engine algorithm wasn’t able to tell the difference at the time).
This means that an actual webmaster who “reviews” someone else’s site before deciding whether or not to link to it would often recognize the site as a “spam site” and decide against linking to the site.
Although basing ranking primarily on the number of backlinks was conceptually great, it was fundamentally flawed.
While still very popular and to some extent effective, the backlink factor was ruined in turn as webmasters raced to get as many backlinks as possible. As is usually the case, the bad guys win the battle (but not the war!)
Unscrupulous webmasters used so-called “black hat” (think of it as shady) techniques to generate massive numbers of backlinks and outrank the “natural websites”.
Scripts were used to mass-submit comments or posts loaded with links to blogs and forums with little or no moderation, each one adding to the number of backlinks for the spammer’s site.
Webmasters responded by adding a “nofollow” value to the rel attribute of links on their website (rel="nofollow"), thereby telling the search engine not to follow the link, or at least not to count it as a backlink or pass PageRank.
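As a rough illustration of what this looks like in the markup, the following Python sketch separates the links on a page into followed and nofollowed ones using the standard library’s html.parser module. The class name LinkCollector, the helper split_links, and the example.com URLs are made up for this example. How a search engine actually treats each link is up to that search engine.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect href values, split by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel can hold several space-separated tokens, e.g. rel="nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)

def split_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.followed, collector.nofollowed

SAMPLE = (
    '<p><a href="http://example.com/dogs">A normal link</a> and '
    '<a href="http://example.com/pills" rel="nofollow">a comment link</a></p>'
)
followed, nofollowed = split_links(SAMPLE)
print("followed:", followed)
print("nofollowed:", nofollowed)
```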
Google, Yahoo!, and MSN/Live didn’t sit still either and changed a number of things.
One of the things the search engines did was assigning values to individual backlinks.
For example, maximum value is given to a link from a site about dogs to another site about dogs.
A fair value is also given to a link from a dog site to a cat site (though different, they’re both about pets).
Meanwhile, a link from a site about dogs to a site about cars will receive a substantially lower value, as the topics are completely unrelated and the link is thus likely to be a “spam link”.
There are other factors at play in this mechanism, such as popularity, age, PageRank, etc., but since this is a blog about LSI, we’re not going to get into that.
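The idea of weighting each backlink by topical relatedness can be sketched as a toy model. The three weights and the list of related topics below are invented purely for illustration; the real values and the way search engines judge relatedness are not public.

```python
# Hypothetical weights, invented for illustration only.
SAME_TOPIC = 1.0
RELATED_TOPIC = 0.5
UNRELATED_TOPIC = 0.1

# Hypothetical pairs of topics treated as related (both are about pets).
RELATED_PAIRS = {frozenset({"dogs", "cats"})}

def link_value(source_topic, target_topic):
    """Value of one backlink, based only on how related the two topics are."""
    if source_topic == target_topic:
        return SAME_TOPIC
    if frozenset({source_topic, target_topic}) in RELATED_PAIRS:
        return RELATED_TOPIC
    return UNRELATED_TOPIC

def backlink_score(target_topic, source_topics):
    """Sum the values of all backlinks pointing at a site about target_topic."""
    return sum(link_value(source, target_topic) for source in source_topics)

# One link each from a dog site, a cat site, and a car site to a dog site.
print(backlink_score("dogs", ["dogs", "cats", "cars"]))
```

Under this toy model, a handful of links from closely related sites outweighs a much larger pile of links from unrelated ones.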
For clarity I would like to note that generating backlinks through social bookmarking is still very effective for a short time after the links were first discovered by Google on the social bookmark sites.
This technique can get you ranked in Google within several hours, but the effect generally doesn’t last longer than a few weeks, as the search engines consider social bookmarks a time-sensitive thing, much like news.
So although backlink-spam was combated by both webmasters and the search engines themselves, the problem wasn’t going to go away.
Much like the search engines moved on from “keyword density” to “backlinks” as their most important factor, it was now time to move on to the next big thing, and so the search engines introduced Latent Semantic Indexing (LSI) into their so-called algorithm (the secret and complex “formula” that determines how well a website ranks in the search engine).
Search Engines introduce Latent Semantic Indexing (LSI)
In late 2006 / early 2007 Google actively started using a brand-new method to assign value to a website when it comes to ranking it in their result pages.
As usual, when Google does something, other search engines generally follow suit.
Latent Semantic Indexing, or LSI for short, was introduced to once again provide better and more relevant search results.
Latent Semantic Analysis (LSA) is the technique in natural language processing of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.
That sounds rather vague, right? No worries, I will soon explain LSA / LSI more clearly using actual examples.
First, let’s simplify the concept:
LSI is an algorithm designed to make search engines reason like humans do.
In theory, for a search engine to consider your content optimally matched to the search query the user enters in the search engine, your content will have to be naturally written and worded.
While naturally writing an article, we impulsively include suitable keywords in our text.
If you remember your grammar lessons (no matter what language they were in), you’ll remember that in order to avoid repetition we would use similar words to these keywords with the same or very similar meaning.
Furthermore, we’d use singular, plural, as well as the different tenses of that keyword or related keywords.
So instead of looking at a handful of keywords or phrases, an LSI-powered search engine will look at the overall theme (topic) of a web site in determining how high to rank the web site.
For a search engine to determine the theme or topic of your website, it will look at the words found within your content and measure the relevancy of these words and word groups within your website in relation to each other.
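To give a feel for how relationships between words can be measured, here is a small, self-contained sketch of latent semantic analysis in pure Python. It builds a term-document count matrix from five made-up mini documents, then keeps only the two strongest latent dimensions. Textbook LSA does this with a truncated singular value decomposition. The sketch gets the same document coordinates from the eigenvectors of the document-by-document product matrix, found by power iteration, to avoid needing a math library. The documents, function names, and the choice of two dimensions are all assumptions for illustration, and real systems weight the counts and work at a vastly larger scale.

```python
import math

# Five tiny "documents". The first and third share no words at all;
# they are only connected through the second one.
DOCS = [
    "dog leash walk",
    "dog german shepherd",
    "german shepherd puppy",
    "car engine wheel",
    "car wheel tire",
]

def term_document_matrix(docs):
    """Rows are terms, columns are documents, cells are raw counts."""
    vocabulary = sorted({word for doc in docs for word in doc.split()})
    return [[doc.split().count(term) for doc in docs] for term in vocabulary]

def top_eigenpairs(symmetric, k, iterations=500):
    """Largest k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    n = len(symmetric)
    work = [[float(value) for value in row] for row in symmetric]
    pairs = []
    for _ in range(k):
        vector = [1.0] * n
        for _ in range(iterations):
            product = [sum(work[i][j] * vector[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in product))
            vector = [x / norm for x in product]
        product = [sum(work[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(vector[i] * product[i] for i in range(n))
        pairs.append((eigenvalue, vector))
        # Deflation: remove the component just found so the next pass finds a new one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= eigenvalue * vector[i] * vector[j]
    return pairs

def latent_document_vectors(matrix, k=2):
    """Coordinates of every document in a k-dimensional latent space."""
    n = len(matrix[0])
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    # Each coordinate is a singular value times the matching eigenvector entry.
    return [[math.sqrt(max(value, 0.0)) * vector[d] for value, vector in pairs]
            for d in range(n)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

matrix = term_document_matrix(DOCS)
raw_vectors = [list(column) for column in zip(*matrix)]
latent_vectors = latent_document_vectors(matrix, k=2)

print("raw similarity, doc 1 vs doc 3:   ", cosine(raw_vectors[0], raw_vectors[2]))
print("latent similarity, doc 1 vs doc 3:", round(cosine(latent_vectors[0], latent_vectors[2]), 3))
print("latent similarity, doc 1 vs doc 4:", round(cosine(latent_vectors[0], latent_vectors[3]), 3))
```

On raw word counts, the first and third documents look completely unrelated because they share no words. In the latent space they end up pointing in almost the same direction, because both are tied to the second document, while the car documents stay apart. That is the sense in which a page about a “german shepherd” can be recognized as a page about dogs without ever using the word.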
Search engines apply complex matching patterns to detect the different forms and tenses of your words and to determine how naturally your sentences are constructed.
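A very crude way to picture the matching of different word forms is suffix stripping, sketched below. The three rules and the function names are assumptions chosen for brevity. Real stemmers and lemmatizers use far more rules and exceptions, and this sketch will get plenty of English words wrong.

```python
def crude_stem(word):
    """Strip a few common English endings. A deliberately naive sketch."""
    word = word.lower()
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("ed") and len(word) > 4:
        return word[:-2]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def group_by_stem(words):
    """Map each crude stem to the word forms that reduce to it."""
    groups = {}
    for word in words:
        groups.setdefault(crude_stem(word), []).append(word)
    return groups

print(group_by_stem(["walk", "walks", "walked", "walking", "car", "cars"]))
```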
Luckily for those whose primary language is not English, or who do not write perfect English, the English language is one of the hardest languages for a computer to understand. A 10-word sentence can have 10 different meanings depending on which of the 10 words the emphasis falls on.
The study of machine interpretation of language is called Natural Language Processing, and you can see it as a stepping stone to artificial intelligence.
Although the field is improving, we’re still several years away from computers being fully able to interpret an English article. When a breakthrough is reached, though, you can count on Google being the first to buy (a license to use) the technique, much like they did in 2002 with regard to LSI. In fact, Google has probably been researching NLP internally for years, much like Microsoft has a whole team assigned to Natural Language Processing research.
For more information on NLP, check out the Wikipedia entry.
After a search engine determines your theme/topic, the search engine will determine how relevant YOUR website is in relation to OTHER websites it has ranked already.
A search engine does this by grouping words and phrases together and comparing them both to its own database of relationships between the words and phrases it finds and to the content of established authority sites (e.g. your top-ranked competitors).
Super Tip: The success of your competitors usually leaves a track for you to follow. This is why it’s important to analyze the currently top-ranked websites for your keywords in order to improve your own website.
Although the entire algorithm of latent semantic analysis / indexing is extremely complex (see also “A brief look at Natural Language Processing” above), the best way to improve your website’s LSI-friendliness is by looking at the current top-ranked websites for the keyword or keyphrase you’re trying to rank on.
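One simple way to act on this advice is to list the terms that most of the top-ranked pages use but your own page does not. The sketch below assumes you already have the page texts as plain strings. The function names, the 4-letter minimum used as a crude noise filter, and the 50% threshold are illustrative choices rather than rules any search engine publishes.

```python
import re
from collections import Counter

def content_terms(text, min_length=4):
    """Distinct lowercase words of at least min_length letters (crude noise filter)."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= min_length}

def missing_common_terms(my_text, competitor_texts, min_share=0.5):
    """Terms used by at least min_share of competitor pages but absent from my page."""
    page_counts = Counter()
    for text in competitor_texts:
        page_counts.update(content_terms(text))
    threshold = min_share * len(competitor_texts)
    mine = content_terms(my_text)
    return sorted(
        term for term, pages in page_counts.items()
        if pages >= threshold and term not in mine
    )

COMPETITORS = [
    "insulin and glucose levels in diabetes",
    "diabetes diet, insulin therapy",
    "glucose monitoring for diabetes",
]
MY_PAGE = "diabetes symptoms and diet"

print(missing_common_terms(MY_PAGE, COMPETITORS))
```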
Latent Semantic Indexing Examples
First off, I’ll give you a short example:
If you do a search on the Associated Press (AP) or Reuters database for “Iraq” it will look, among others, for the following related keywords (something I call “horizontal keywords”):
- saddam hussein
- war on terrorism
- first / second gulf war
- weapons of mass destruction
and yes, even
As you know, all these terms RELATE to Iraq although they do not contain the word Iraq and are not exclusively linked to Iraq (for example “first gulf war” is also related to Kuwait, Desert Shield, Desert Storm, USA, Kurds, Chemical Attacks, Patriot Missiles, etc.)
Now let’s look at an example that you as a webmaster can better relate to.
We’ll look at the topic of “diabetes”.
If you were to make a website about diabetes and wish to rank well for just the term diabetes, it is imperative that your website addresses most of or ALL the subtopics of diabetes.
The most OBVIOUS subtopics that you’ll be able to come up with off the top of your head are probably:
- Symptoms of Diabetes
- Causes of Diabetes
- Types of Diabetes
If you have these concepts covered on your website, you have your foot in the door but are still a long way from page 1 in the search results, even if you have thousands of backlinks and a 5+ PageRank.
To really rank high, especially on an extremely competitive keyword like “diabetes”, you’re going to have to dig much deeper.
Not only will your website need to have a large variety of content related to diabetes, you must also categorize and structure your content properly, preferably in so-called “silo pages” (more on that in future articles).
Using the Free Trial of the #1 tool for LSI KeyWord Research reveals a wealth of horizontal keywords your main content HAS to include in addition to the above 3 subtopics if you wish to rank on page 1, especially with few backlinks and a low PageRank.
We find the most important horizontal keywords and keyphrases:
- Blood Sugar
- Insulin
- Health
- Glucose
- Disease
- National Risk
- Pancreas
- Hypoglycemia
- Diabetic Recipes
- High Blood Pressure
- Obesity
- Hypertension
- Diabetes Research
- Diabetes Treatment
- Diabetes Diet
- American Diabetes Association
But also:
and even
For your site to rank well for “diabetes”, it is imperative that your front page contains ALL these words, and roughly 75% of them need to be linked to sub/silo pages that directly or indirectly go deeper into the word / concept / term in question (we’ll talk about content funnels in a future article).
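A checklist like this is easy to automate. The following sketch reports which keywords from a list are present on a page, which are missing, and what share of the list is linked to a deeper page. It assumes you supply the page text and the list of linked keywords yourself. The function names and the short sample page are hypothetical.

```python
import re

def normalize(text):
    """Lowercase and collapse everything that is not a letter or digit to one space."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_coverage(page_text, keywords, linked_keywords=()):
    """Return (present, missing, linked_share) for a list of keywords."""
    page = f" {normalize(page_text)} "
    # Padding with spaces makes sure only whole words and phrases match.
    present = [k for k in keywords if f" {normalize(k)} " in page]
    missing = [k for k in keywords if k not in present]
    linked = {normalize(k) for k in linked_keywords}
    linked_count = sum(1 for k in keywords if normalize(k) in linked)
    linked_share = linked_count / len(keywords) if keywords else 0.0
    return present, missing, linked_share

KEYWORDS = ["Blood Sugar", "Insulin", "Glucose", "Pancreas"]
PAGE = "Insulin is made in the pancreas and controls blood sugar."
LINKED = ["Insulin", "Blood Sugar", "Pancreas"]

present, missing, linked_share = keyword_coverage(PAGE, KEYWORDS, LINKED)
print("present:", present)
print("missing:", missing)
print("linked share:", linked_share)
```

A linked share at or above 0.75 would meet the rough 75% guideline mentioned above.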
Furthermore, we also uncover various long-tail keywords that users are likely to enter into a search engine and thus should appear in our content or articles in one form or another (as long as they occur in a sentence in a natural manner):
- How Can You Prevent Diabetes
- What Causes Diabetes
- List of Foods for Diabetic to Eat
- Diabetes Symptoms Children
- Type 1 Diabetes
- Warning Signs Diabetes
- Beginning Signs of Diabetes
- Early Symptoms of Adult Onset Diabetes
- Cause of Diabetes Type 2
- New Treatment for Diabetes
- Diabetes Type 2 And Treatments
- Gestational Diabetes during Pregnancy
and the list of long-tail keywords found by KeyWord Excavator goes on…
For a more in-depth look into how the above results were obtained, I encourage you to read my mini case-study and to also do a query for Diabetes or any other keyword or keyphrase yourself using my Professional KeyWord Excavator tool.
How Search Engines “read” your Content Semantically
To demonstrate roughly what a search engine sees when it comes to LSI, I’ve created a tool that will strip your article of all words that do not carry any semantic meaning leaving only (LSI) content words.
Check it out by visiting the above link and you’ll see what I mean.
For example, my tool will turn this sentence:
“Going cold turkey may help to get excessive nicotine out of your system quickly, using within a few days or so, however, the withdrawal symptoms are usually pretty severe and intense, as is the physical discomfort that goes with this.”
into:
“cold turkey excessive nicotine system days withdrawal symptoms severe intense physical discomfort”
As you can see, my tool (as will a search engine) filters out the “noise”. What remains is a bunch of keywords and terms the search engine will use to rate your content from an LSI point of view.
These are also known as “content words” or “semantic words”.
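The filtering step can be approximated with a plain stop-word list. The sketch below is not the tool described here; it is a minimal stand-in whose stop-word list was hand-picked so that it reproduces the example above. That list deliberately includes words such as “going”, “help”, and “quickly”, because the sample output drops them too. A real list would be much longer.

```python
import re

# Hand-picked for this one example; a real stop-word list would be far larger.
STOP_WORDS = {
    "a", "and", "are", "as", "few", "get", "goes", "going", "help", "however",
    "is", "may", "of", "or", "out", "pretty", "quickly", "so", "that", "the",
    "this", "to", "using", "usually", "with", "within", "your",
}

def content_words(text):
    """Drop stop words and punctuation, keep the remaining words in order."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(word for word in words if word not in STOP_WORDS)

SENTENCE = (
    "Going cold turkey may help to get excessive nicotine out of your system "
    "quickly, using within a few days or so, however, the withdrawal symptoms "
    "are usually pretty severe and intense, as is the physical discomfort that "
    "goes with this."
)

print(content_words(SENTENCE))
```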
My Free Semantic Article Cleaner Tool can be used to:
- Determine if the article you wrote is focused enough and relevant to the overall theme of your website
- Generate a list of Tags to be used for your blog post
- Generate a list of PPC KeyWords laser-targeted to your article
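As an illustration of the tag idea, one simple approach is to count the remaining content words and take the most frequent ones. This is only a sketch of that approach with a tiny, made-up stop-word list and a hypothetical function name, not a description of how the tool works internally.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to"}

def suggest_tags(text, limit=5):
    """Return the most frequent content words, ties broken alphabetically."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOP_WORDS)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]

POST = "Nicotine withdrawal is hard. Nicotine cravings fade and withdrawal symptoms ease."

print(suggest_tags(POST, limit=3))
```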
I also encourage you to try out my KeyWord Research Software Tool which is used by SEO professionals as well as webmasters big and small.
It’s free to try and doesn’t require you to sign up for, or even download anything so why not give it a whirl?
Keep in mind that if your optimization is done properly, a search engine will recognize the theme of your website, value it above established / top-ranked websites, and as such return it ahead of those competing sites.
The Bottom Line
LSI helps search engines identify those websites that have unique content relevant to the keyword a search engine user entered and, when LSI optimization is done right, give them a favorable rating.
If you wish to outrank your competition in the Search Engines in 2008 and beyond, you will have to design your future websites and redesign your current websites with Latent Semantic Indexing in mind.
If you’re ranked in the top 3 already, you had best redesign your current site as well, as it is only a matter of time before your competitors optimize their sites with LSI in mind and outrank you as a result.
Although you’ve probably learned much in this long article to start improving your websites, designing a website with LSI in mind entails much more than just including horizontal and long-tail keywords. This article merely scratches the surface of what LSI is all about and how you can use it to improve your site’s visibility and rankability to the Search Engines.
In future article posts and video posts on this blog as well as exclusive non-web content sent to my mailing list I will focus on:
- How to properly structure your website with LSI in mind
- How to build “silo pages” the search engines are simply going to love
- How to utilize LSI and KeyWord Excavator’s results to generate dirt cheap traffic through PPC advertising
To stay ahead in the ever-changing world of Search Engine Optimization, stay tuned for the above and many more articles.
Sign up for my Newsletter and I will inform you whenever I post new content, as well as send you exclusive tips and tricks to improve your website that you won’t find anywhere else on the internet, not even on my own blog!
Disclaimer: Note that many of the SEO techniques discussed in this or other articles on this blog have many books and studies dedicated to them and some can still be very effective if applied correctly.
Due to the nature of this blog and the fact that I specialize in LSI (Latent Semantic Indexing) and NLP (Natural Language Processing) this blog will mainly focus on LSI and explore techniques to use LSI to your benefit.