Wow, I suppose I don’t really update this blog anymore, do I? Well, I have a good reason for it, and this post will explain why.

I’m going to try and keep this relatively short because as some of you know from my support ticket replies, I can be quite lengthy and detailed, but I suppose that’s not always a bad thing.

 

You see, the reason I haven’t even looked at this blog lately is because I’ve been incredibly active in what I call Blog Network Syndication.

What this means is that I submit articles to large networks of niche-specific blogs.

I’m sure you’ve all heard of article marketing and have submitted content to article directories before.

Look at it this way; article directories are old and dead, and blog network syndication is the new ‘it’ if you want to get massive targeted backlinks (and traffic).

You see, article directories are generally… general and cover every single topic under the sun but focus on none.

Most blogs on the other hand are focused on one specific topic, or niche.

This year Google really started caring more about niche-related links by giving them more ‘link juice’, which increases your rankings.

So right now, a link from a site about ‘Everything’ to a site about ‘dog toys’ is worth little. A link from a general site about pets to a site about ‘dog toys’ is worth more. A link from a site about dogs to a site about dog toys is worth a lot, and if you really want to hit the jackpot, you’ll get a link from a dog toy site to your dog toy site!

 

Why am I telling you this?

Well, I’ve recently joined a network of over 4,000 niche-specific blogs. This means that I can get extremely targeted backlinks in a huge range of topics straight to my site!

What’s more, I don’t just get a lame resource box link but I am allowed several links within the article itself! It’s really quite amazing.

Oh btw, if you’re a blog owner you can also participate by receiving this unique content!

So you either submit content and receive massive quality backlinks or you receive content and increase the size of your blog with quality content.

Or you can use the system both ways! It’s up to you!

For more information and to learn how to sign up, go here:

http://www.keywordexcavator.com/sk.php

Have a Wonderful Day!

 

Jos Jongejan
KeyWord Excavator Developer


Google has just surprised a lot of people with the release of their browser called Chrome.
Although the browser is the best thing that has been released to the market in a long time (check the awesome features), there’s something else that I wanted to address.

I’ve expressed my “paranoia” about Google for years now. I’ve always suspected Google to use data from services such as Analytics and Webmaster Tools in their search algorithms. If site A links to site B but both A and B are found in the same analytics or webmaster tools account then the link is less valuable, that sort of thing.
Far-fetched as that may be, here is something that isn’t.

Quite recently, Microsoft announced they were “looking into” utilizing “search intent” data in their ranking algorithms. In English, this means using their Internet Explorer browser to determine how long you’re visiting a page and other data on how you are interacting with the site.

We’ve all seen expired-domain sites that are nothing but a bunch of advertising links wrapped in a nice design. A computer cannot (yet) distinguish these from ‘good’ sites, at least not with 100% certainty.
The average user, however, and especially the more advanced user (who, as it happens, is also the most likely to download Google Chrome ASAP), can easily tell when he or she has landed on such a site and will click away in a matter of seconds.

You probably figured where I’m going with this but first let me point out this sentence in Google Chrome’s EULA:

Optional: Help make Chrome better by automatically sending usage statistics and crash reports to Google.

In fact, it’s not even a sentence in the text itself, it’s a separate line below the EULA and you have to check it.
At least it’s optional!

Now in my spare time I make websites with high quality content (for money). I am all for using behavioral patterns in search engine ranking algorithms as it’s one of the best metrics to determine the validity of a website.
It’s only the logical next step in the ever-lasting fight against black-hat marketing.

Obviously, Google has no obligation to release any information on their algorithms, so for all we know they’ve already been using behavioral data gathered from their toolbar to a certain extent. Now, though, there’s an “official” and legit way to gather this data and more. Just please be open about it!

So instead of letting us check this box:

“Help make Chrome better by automatically sending usage statistics and crash reports to Google.”

Just make it

“Help make our Search Results better by automatically sending usage statistics to Google.”

What do you think? Behavioral patterns tossed into search algorithms, good or bad?


One of Google’s Engineers posted an entry on the Official Google Blog yesterday explaining some of the technologies behind Google’s ranking.

It’s an interesting post in general but I wanted to highlight one paragraph in particular:

Another technology we use in our ranking system is concept identification. Identifying critical concepts in the query allows us to return much more relevant results. For example, our algorithms understand that in the query [new york times square church] the user is looking for the well-known church in Times Square and not for articles from the ‘New York Times’. We don’t just stop at identifying concepts; we further enhance the query with the right concepts when, for instance, someone looking for [PC and its impact on people] is in fact looking for impact of computers on society, or someone who searches for [rainforest instructional activities for vocabulary] is really looking for rain forest lesson plans. Our query analysis algorithms have many such state-of-the-art techniques built into them, and once again, we do this internationally in almost every language we serve.

As you can read, Google talks about “concept identification”. This is a less fancy term for Latent Semantic Analysis (LSA), also known as Latent Semantic Indexing (LSI).
The examples Google provides are wonderful examples of LSI at work.
As you’ll probably know by now, if you want your pages to rank well you should have most of the keywords in [rainforest instructional activities for vocabulary] on your page but also have “rain forest lesson plans” as an anchor link for example to one of your subpages (or as inbound links from external pages).

Remember you can identify 90% of these LSI-keywords through my KeyWord Excavator tool. If you’re not familiar with my tool yet, be sure to check it out as there’s no sign-up and no download required. Oh it’s free to try too! KeyWord Excavator: LSI KeyWord Research Software Tool


As a long-time ClickBank Publisher and Affiliate I’ve often been frustrated by the fact that you simply have no control over where a “ClickBank Hop Link” points to.
Although there are scripts out today that do what I’m about to show you as part of a larger product (such as a complete affiliate tracker like EasyClickMate - which by the way is wonderful), why purchase a full-blown product if you only want to control your hoplinks?
So let’s tame those out-of-control hoplinks, shall we?

The following can be done with ASP just as easily but in this article we’ll focus on doing this through PHP.

So what am I going to do? I’m going to point visitors to any page on my site with just a few lines of code. The code only has to be added in one spot, so it should be relatively easy for anyone to implement this technique.
The kicker here is that before they land on the desired page, they will first be taken through the standard ClickBank Hoplink (http://AFFILIATE.PUBLISHER.hop.clickbank.net) so that the visitor is still tracked properly in order for the affiliate to get credit when a sale is made.

There are actually two ways to do this depending on how much control you want over where people can point your hoplink to.

  1. Give full control of what URL can be pointed to by putting the file and path directly in the url like so:
    http://www.keywordexcavator.com/?goto=/lsi-seo-blog/2008/how-an-lsi-search-engine-sees-your-articles-or-website/&aff=omalainet&tid=kwx_sac
    This would redirect the user to http://www.keywordexcavator.com/lsi-seo-blog/2008/how-an-lsi-search-engine-sees-your-articles-or-website/
  2. Or I can predefine the locations that can be directed to in my code (more on that later in the article) and the URL would look like:
    http://www.keywordexcavator.com/?goto=blog&aff=omalainet2&tid=kwx_sac
    Which takes people to my blog located at http://www.keywordexcavator.com/lsi-seo-blog/

Before we get started, here are a few practical applications for this technique, including what I’ve used it for so far:

  • As an affiliate I can recall many instances where a publisher put up an awesome video promoting their product but the ClickBank affiliate link would only send my visitors to their sales page, not their video! Had the publisher used a simple script like mine I would have been able to link directly to the video while getting credit for any possible sales.
  • I want affiliates promoting my KeyWord Excavator product to be able to point people directly to my Free Online Trial which is a major selling point. To achieve this, an affiliate would simply use this link: http://www.keywordexcavator.com/?goto=trial&aff=johndoe&tid=email1 - Go ahead and try it. As you can see you can even pass ClickBank’s own tracking ID (tid) as well so the affiliate can see where a sale originated from.
  • I created a viral tool called Semantic Article Cleaner. I’ve made it so people can host my tool on their own website and make money by incorporating a clickbank affiliate link that promotes my KeyWord Excavator product, however, there’s also a link in there that points to my blog post containing additional information on the free tool.
    If I were to link directly to the blog post, and a person who clicks that link from my tool hosted on someone else’s blog ends up buying my product, the affiliate would not get any credit, since the visitor didn’t come in through an affiliate link!
    We can resolve this by using my script to credit the affiliate while still pointing the visitor to the page we want him to land on, which is my blog post.
    The result is this link: http://www.keywordexcavator.com/?goto=blog_sac&aff=omalainet2&tid=kwx_sacb (again, try it!)

Note that the tid at the end of the link is optional and needs to comply with ClickBank’s 8-character limit. It is purely meant for tracking, and the affiliate can change its value at will.
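To make the link format concrete, here is a minimal sketch of a helper an affiliate (or publisher) could use to build such links. The function name buildDeepLink and the choice to throw an exception for an over-long tid are my own assumptions for illustration; only the goto, aff and tid parameters and the 8-character limit come from the article.

```php
<?php
// Hypothetical helper: builds a deep link in the ?goto=...&aff=...&tid=... format
// described in this article. Name and error handling are illustrative assumptions.
function buildDeepLink(string $landingUrl, string $goto, string $aff, string $tid = ""): string
{
    // The article states that the tid must respect an 8-character limit.
    if (strlen($tid) > 8) {
        throw new InvalidArgumentException("tid must be 8 characters or fewer");
    }

    $params = array("goto" => $goto, "aff" => $aff);
    if ($tid !== "") {
        $params["tid"] = $tid; // optional, used for tracking only
    }

    // http_build_query() URL-encodes the values for us.
    return $landingUrl . "?" . http_build_query($params);
}

echo buildDeepLink("http://www.keywordexcavator.com/", "trial", "johndoe", "email1"), "\n";
```

Note that http_build_query() percent-encodes slashes, so a path such as /your-blog/ shows up as %2Fyour-blog%2F in the link. PHP decodes it again when it fills $_GET, so the receiving script sees the original path.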
So let me recap before we get down and dirty with the simple and short php code:

  1. Visitor clicks http://www.keywordexcavator.com/?goto=blog&aff=omalainet2&tid=whatever
  2. The index.php on www.keywordexcavator.com sets a cookie or session that will expire in 15 seconds
  3. The index.php then redirects to the standard ClickBank hoplink using the affiliate ID and the tid: http://omalainet2.omalainet.hop.clickbank.net/?tid=whatever
  4. ClickBank sets a cookie on their domain and stores the visitor’s IP in their database to properly assign a sale to the affiliate if and when a sale is made.
  5. The clickbank hoplink sends the user back to my www.keywordexcavator.com/index.php (landing) page
  6. My tiny script sees there’s a cookie set that tells it to redirect to my blog (goto=blog). The cookie is deleted and the user is redirected.

Result: The person that clicked the link is cookied through the regular clickbank hoplink process yet does not end up on my landing page but lands on any page of my choosing.
The Code - Both Types of Redirects
This is the exact code I have on www.keywordexcavator.com (which is where my ClickBank account points to) at the time of writing:
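<?php
session_start();

// Step 1: a visitor arrives with ?goto=...&aff=...
// Remember where they want to go, then send them through the ClickBank hoplink.
if (!empty($_GET["goto"]) && !empty($_GET["aff"])
    && is_string($_GET["goto"]) && is_string($_GET["aff"])) {
    $_SESSION["goto"] = $_GET["goto"];
    // Strip anything but letters and digits (assuming that is all an affiliate ID contains).
    $aff = preg_replace("/[^a-z0-9]/i", "", $_GET["aff"]);
    // The tid is optional; encode it so it is safe to put in a URL.
    $tid = (isset($_GET["tid"]) && is_string($_GET["tid"])) ? urlencode($_GET["tid"]) : "";
    header("Location: http://" . $aff . ".omalainet.hop.clickbank.net/?tid=" . $tid);
    exit;
}

// Step 2: ClickBank has sent the visitor back to this landing page.
if (!empty($_SESSION["goto"])) {
    $goto = $_SESSION["goto"];
    session_destroy();
    $site = "http://www.keywordexcavator.com";

    // Predefined tags and the pages they point to.
    $tags = array(
        "blog"      => "/lsi-seo-blog/",
        "trial"     => "/trial/",
        "tools_sac" => "/lsi-seo-blog/semantic-article-cleaner/",
        "blog_sac"  => "/lsi-seo-blog/2008/how-an-lsi-search-engine-sees-your-articles-or-website/",
    );

    if (isset($tags[$goto])) {
        header("Location: " . $site . $tags[$goto]);
        exit;
    }
    else {
        // No predefined tag: treat 'goto' as a path on this site.
        // Requiring a leading slash keeps the redirect on this domain.
        if (strncmp($goto, "/", 1) === 0) {
            header("Location: " . $site . $goto);
            exit;
        }
    }
}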

There you have it.

This is the code that allows you to predefine a few ‘tags’ and point them to a certain page within your site (or even an external page if you wanted).

If a predefined ‘tag’ is not found the script will try to read the ‘goto’ variable directly and redirect to that subpage.

If you don’t like the idea of your affiliates having 100% control of what page their link points to you can remove the last bit of code that reads else { … }
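As a sketch of that choice, the lookup can be pulled into a small function with a switch for the free-form mode. The function name resolveTarget and the $allowPaths flag are hypothetical and not part of my live script; the leading-slash check is an extra safeguard I am assuming you would want, so that a value like ".example.com" cannot be glued onto your domain name.

```php
<?php
// Hypothetical helper: maps a 'goto' value to the URL to redirect to.
// Returns null when the value is neither a known tag nor an allowed path.
function resolveTarget(string $goto, array $tags, string $site, bool $allowPaths): ?string
{
    // Predefined tags always win.
    if (isset($tags[$goto])) {
        return $site . $tags[$goto];
    }

    // Free-form mode: only accept values that look like a local path ("/...").
    if ($allowPaths && strncmp($goto, "/", 1) === 0) {
        return $site . $goto;
    }

    return null; // caller should simply show the normal landing page
}

$site = "http://www.yoursite.com";
$tags = array("blog" => "/your-blog/", "trial" => "/trial/");

// With $allowPaths set to false, affiliates can only use the predefined tags.
var_dump(resolveTarget("blog", $tags, $site, false));
var_dump(resolveTarget("/some/other/page/", $tags, $site, false));
var_dump(resolveTarget("/some/other/page/", $tags, $site, true));
```

Passing false for $allowPaths has the same effect as removing the else block: unknown values simply fall through to the regular landing page.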

So this code allows you to use:
http://www.yoursite.com/clickbank-landing-page.php?goto=/your-blog/&aff=johndoe&tid=blog
to link to:
http://www.yoursite.com/your-blog/
While still sending the visitor through the ClickBank affiliate hoplink.

And also a pre-defined link in the code:
http://www.yoursite.com/clickbank-landing-page.php?goto=blog&aff=johndoe&tid=blog
to link to:
http://www.yoursite.com/what-ever-subpage-you-predefined-in-code
That’s all there is to it folks!

Of course there are ways to expand on this code and if there’s enough demand I might do so in the near future. Think of adding tracking, a database, etc.
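One small expansion, as an example: step 2 of the recap mentions a cookie or session that expires in 15 seconds, while the listing relies on the session's normal lifetime. A possible way to enforce the short window is to store a timestamp next to the target. This is only a sketch; rememberTarget and takeTarget are hypothetical names, and the $store array stands in for $_SESSION so the logic can be tried outside a web request.

```php
<?php
// Hypothetical helpers; $store stands in for $_SESSION.

// Remember the target together with the time it was stored.
function rememberTarget(array &$store, string $goto, int $now): void
{
    $store["goto"] = $goto;
    $store["goto_time"] = $now;
}

// Return the stored target if it is at most $ttl seconds old, otherwise null.
// The stored value is removed either way, so it can only be used once.
function takeTarget(array &$store, int $now, int $ttl = 15): ?string
{
    if (!isset($store["goto"], $store["goto_time"])) {
        return null;
    }
    $goto = $store["goto"];
    $fresh = ($now - $store["goto_time"]) <= $ttl;
    unset($store["goto"], $store["goto_time"]);
    return $fresh ? $goto : null;
}

$session = array();
rememberTarget($session, "blog", time());
var_dump(takeTarget($session, time())); // taken right away, so still fresh
```

In the real script you would pass $_SESSION instead of $session and time() for $now on both sides of the ClickBank round trip.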

Please don’t hesitate to post any questions you have in the comment section below. I will read them, respond, and help you in any way I can.


Today I’d like to discuss the workings of Latent Semantic Indexing a bit further. It will help you understand the importance of LSI even better and it will also help explain why I wrote a free mini-tool that reduces “noise” out of an article and leaves you with only semantically-relevant words (or “semantic words” as I’ll be referring to them from here on).

To recap my introduction to LSI article and refresh your memory: old-school search engines approached keyword searches with something of an accountant’s mentality. A word is either found in an article or on a web page, or it is not; there is no middle ground.
In addition to indexing the sites that DO contain a specific keyword, Latent Semantic Indexing looks at an entire collection of documents as a whole, that means a subsection of your website, your entire website, and sometimes even your website and several other websites that have multiple links to yours.
While attempting to assign a value / rank to your specific page in question, LSI looks at all the other documents for the same or related words. LSI tries to simulate a human being when it comes to judging relevancy among a set of documents.
In part due to the complex nature of the English language, LSI does not understand what the words mean, although the patterns LSI picks up on can make it seem incredibly intelligent while in fact it’s still a ‘dumb computer’.


What does an LSI algorithm look at?
An LSI algorithm will index a set of documents (anything from a handful to thousands) and calculate similarity values for every semantic word (more on this later). Obviously, the formulas used to determine similarity and relevancy are extremely complex and, above all, kept secret by the search engines.

When comparing one document or page to another, LSI algorithms are not looking for an exact match of a specific keyword to determine relevancy. The two documents or pages therefore do not have to contain the same keywords in order to be relevant in the eyes of an LSI algorithm. This makes an LSI-powered keyword search much better than a plain keyword search (or old-school keyword search as I called it earlier).

To use an example, let’s say a collection of diabetes-related articles is indexed by an LSI search engine. If the words diabetes, insulin, and glucose appear together in enough articles, the LSI search algorithm will figure out that the three terms are ‘semantically close’ or, in plain English, that the terms are related to each other.
As such, a search for ‘diabetes’ will return a set of articles containing that word BUT also articles that contain just the word insulin (and not diabetes). Think of an article that explains what insulin is and what it does to your body but does not mention the word diabetes even once. An LSI algorithm will still consider it relevant to ‘diabetes’, even though the search engine doesn’t know anything about diabetes the way a human being does.
So by examining enough documents, an LSI algorithm teaches itself that these three terms are related. It uses this information to provide more sophisticated and natural search results.
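To get a feel for the idea, here is a minimal sketch that simply counts in how many documents each pair of terms appears together. To be clear, this is plain co-occurrence counting, not LSI itself (real LSI applies a singular value decomposition to a term-document matrix, and the search engines' formulas are not public). The function name and the three sample sentences are made up for illustration.

```php
<?php
// Counts, for every pair of watched terms, how many documents contain both.
// A simplified illustration of co-occurrence, not an LSI implementation.
function cooccurrenceCounts(array $documents, array $terms): array
{
    $counts = array();
    foreach ($documents as $text) {
        // Lowercase the text and split it into a set of distinct words.
        $words = array_flip(preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY));

        // Which of the watched terms occur in this document?
        $present = array();
        foreach ($terms as $term) {
            if (isset($words[$term])) {
                $present[] = $term;
            }
        }

        // Add one to the counter of every pair found together.
        for ($i = 0; $i < count($present); $i++) {
            for ($j = $i + 1; $j < count($present); $j++) {
                $pair = $present[$i] . "+" . $present[$j];
                $counts[$pair] = ($counts[$pair] ?? 0) + 1;
            }
        }
    }
    return $counts;
}

$documents = array(
    "Diabetes affects how the body uses glucose and insulin.",
    "Insulin lowers blood glucose.",
    "Type 2 diabetes often requires insulin.",
);
print_r(cooccurrenceCounts($documents, array("diabetes", "insulin", "glucose")));
```

In this tiny sample the second document never mentions diabetes, yet it shares insulin and glucose with the first one. That overlap is the kind of signal that lets an algorithm treat it as related to a ‘diabetes’ search.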


So an LSI Search Engine Bot visits my site, what does it do?
Let’s assume the following scenario. You have a website that covers the topic of say… diabetes! On that site you have a mere 10 articles with related content. You have an article that explains the symptoms, another article discussing treatment, an article explaining the different types of diabetes, etc.
An LSI-powered search engine spider like Google’s Bot will visit your website, index your frontpage and your 10 subpages, and consider this a ‘set of related documents’. The algorithm will, however, determine how well-related your documents are.

Let’s have a closer look at how it looks at your site.


Noise-reduction: Tossing words that don’t carry Semantic Meaning
As you learned earlier, an LSI algorithm will search your set of documents for patterns of word distribution and the co-occurrence of words.
To make its job easier, an LSI algorithm will start by filtering out words that don’t carry any semantic meaning. Natural language is full of redundancies, so not every word that appears in a document carries semantic meaning.
Think about the most frequently used words in English (the, of, to, and, or, etc.) and consider that they don’t really mean anything on their own. As you probably know, search engines even discard these types of words when you enter them in a keyword search.

An LSI algorithm has a huge set of words it filters from a document, leaving only “content words” or “semantic words”.

Though not a huge issue, a slight pitfall of this approach is that depending on the context, a word can be a semantic word or a junk word. For example, consider an article or advertisement for a car that contains “rolls royce phantom in good condition” - this is where the word good is not junk as it is a relatively important aspect of the content. On the other hand, consider an article that mentions “the good news is that…” - this is where “good” is just another junk word and can be safely tossed.
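Here is a minimal sketch of this kind of filtering, using the two “good” examples from above. The five-word stop list is a tiny stand-in I made up for illustration; it is not the list a search engine (or my tool) actually uses, and the function name is hypothetical.

```php
<?php
// Removes every word found in the stop list and returns the remaining words.
function removeNoiseWords(string $text, array $stopWords): array
{
    // Lowercase, then split on anything that is not a letter or an apostrophe.
    $words = preg_split("/[^a-z']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $stop = array_flip($stopWords); // flipped so lookups are by key

    $kept = array_filter($words, function ($word) use ($stop) {
        return !isset($stop[$word]);
    });
    return array_values($kept);
}

// Illustrative stop list only.
$stopWords = array("the", "is", "that", "in", "good");

print_r(removeNoiseWords("the good news is that", $stopWords));
print_r(removeNoiseWords("rolls royce phantom in good condition", $stopWords));
```

Because the list is context-blind, “good” disappears from the car advertisement too, leaving rolls, royce, phantom and condition. That is exactly the pitfall described above.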


LSI Noise Reduction in action
As you might have noticed by now, when I explain things I prefer to explain them with a detailed example. Better yet, I take a ‘see for yourself’ approach wherever I can. With that in mind, I recently spent an entire week collecting and analyzing the type of words an LSI algorithm ignores and created a tool that shows you exactly that.

My Mr. LSI’s Semantic Article Cleaner, much like an LSI-algorithm, will filter the following words:

  • Common Adjectives (big, small, low)
  • Common Numerals (two, tenth, millions)
  • Common Verbs (do, be, see)
  • (Compound) Prepositions (after, near, with)
  • Conjunctions (and, for, than, where)
  • Conjunctive Adverbs (also, consequently, nevertheless)
  • Contractions (can’t, i’ll, wasn’t, wouldn’t)
  • Pronouns (i, his, whomever, yourselves)
  • Interjections (awesome! whoops! good grief! etc.)
  • Articles (a, an, the)
  • Frilly Words (albeit, however, moreover, moreso, therefore, thus)

You can try out Mr. LSI’s Semantic Article Cleaner here.

Try out my tool above and be amazed at how ‘clean’ your article looks after it’s been filtered. Note that you can also pre-populate the fields in case you don’t have an article of your own handy to test right now.

Back to the LSI algorithm as it sees your set of 10 diabetes articles. To determine the Top XX relevant words or word phrases, the algorithm will discard any words or sentences that appear in every article or page (mainly useful to ignore navigation menus, footers, etc.)
Additionally, it will discard any words that appear in only one document across your entire set of articles.
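The two rules above can be sketched as a simple document-frequency filter. This is only my illustration of the rule as described, with a made-up function name and a made-up three-article sample in which the noise words have already been removed; it makes no claim about how any search engine implements it.

```php
<?php
// Keeps only words that appear in more than one document but not in all of them.
// $docs maps a document name to its list of (already cleaned) words.
function filterByDocumentFrequency(array $docs): array
{
    // Count in how many documents each word appears.
    $documentFrequency = array();
    foreach ($docs as $words) {
        foreach (array_unique($words) as $word) {
            $documentFrequency[$word] = ($documentFrequency[$word] ?? 0) + 1;
        }
    }

    $total = count($docs);
    $filtered = array();
    foreach ($docs as $name => $words) {
        $kept = array_filter($words, function ($word) use ($documentFrequency, $total) {
            return $documentFrequency[$word] > 1 && $documentFrequency[$word] < $total;
        });
        $filtered[$name] = array_values($kept);
    }
    return $filtered;
}

$docs = array(
    "symptoms"  => array("home", "contact", "glucose", "thirst"),
    "treatment" => array("home", "contact", "insulin", "glucose", "injection"),
    "types"     => array("home", "contact", "insulin", "pancreas"),
);
print_r(filterByDocumentFrequency($docs));
```

In this sample, “home” and “contact” (think navigation menu) appear everywhere and are dropped, as are the words that occur in just one article. Only “glucose” and “insulin” survive.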

This process condenses your articles into sets of semantic words that the search engine will now use to index your collection.


So what does that mean for Me as a Webmaster?
LSI is here to stay and will only improve. I personally expect Google to either make internal breakthroughs or license a third-party patent/company with regards to Artificially Intelligent Search Indexing within the next 6 years.
As you read earlier, the only problem with LSI is that computers still cannot understand the context of the words and articles they index. As research is being done into Natural Language Processing and Artificial Intelligence, it is only a matter of time before Latent Semantic Indexing moves on to the next level.

The bottom-line is that Google will see right through “unnatural” content more and more, making today’s article spinners completely worthless.
Moreover, it is now more important than ever to structure your website properly with LSI in mind. This means your website should be populated with the most relevant content and structured logically as a human being would expect your site to be structured.

You have the greatest tool of all to structure your website, and that is your brain. When it comes to a topic you’re not an expert on, however, it is easy to miss important sub-topics of the niche you’re building a website in. This is where my “KeyWord Excavator” tool comes in. You can try this tool for free, and there’s no sign-up or download required. Try Mr. LSI’s KeyWord Excavator here.
