Thursday, January 15, 2009

Gag Me with a Spoon! - Latent Semantic Indexing?


For years it has been thought that Google uses word-relationship technologies, one of which has been dubbed “latent semantic indexing,” or LSI. When you think about it, this concept is really something out of a sci-fi movie. Not only does Google index certain words that appear in a document, but it also examines the document collection as a whole, comparing each document to the others in order to determine which ones contain similar word choices. The really amazing thing is how well it correlates these semantically similar pages, in a way that is strikingly close to how a human would classify the same information.

We recently discovered an excellent example of this, one so relevant that it almost seems as if an actual human being had adjusted the search result by hand. We did a search for “gag me with a spoon.” We all remember that phrase, right? Well, some of the younger folk these days have no idea what it means, so a simple Google search is the logical solution. The sixth result for this query was a Wikipedia entry about “Valspeak.”



It just so happens that Valspeak is the term used to describe the kind of speech, or sociolect, associated with the phrase “gag me with a spoon.” In other words, it is the language of “valley girls.” That doesn’t seem unusual so far, because you might expect the page to list some examples of Valspeak, one of which would be “gag me with a spoon.” But the amazing thing is that this phrase does not appear even once on the entire page. Scour it as much as you like, but the phrase we queried is nowhere to be found. This is simply an excellent example of latent semantic indexing, in which Google has taken terms that do appear, such as “valley girls,” “surfer slang,” “Southern California,” or even “Clueless,” and compared them to pages containing the phrase “gag me with a spoon.” As you might guess, these pages probably have a large number of terms in common, and thus Google succeeds in returning a search result that is actually quite relevant to the query but does not even contain the queried phrase.
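For readers who want to see the idea in miniature, here is a rough sketch of textbook latent semantic indexing in plain Python. It says nothing about what Google actually runs; the five-document corpus, the function names, and the choice of two latent dimensions are all assumptions made up for illustration. The sketch builds a term-document matrix, approximates its two strongest singular directions with power iteration, and then compares the query to each document in that reduced space.

```python
import math

# Toy corpus, invented for illustration. Document 1 stands in for the
# Valspeak page: it never contains the words "gag" or "spoon".
DOCS = [
    "gag spoon valley girl slang",
    "valley girl slang california clueless",
    "gag spoon valley girl",
    "search engine marketing links",
    "search engine marketing optimization",
]
QUERY = "gag spoon"


def keyword_overlap(doc, query):
    """Number of distinct query words that literally appear in the document."""
    return len(set(doc.split()) & set(query.split()))


def term_doc_matrix(docs):
    """Return (vocab, A) where A[t][j] counts term t in document j."""
    vocab = sorted({w for d in docs for w in d.split()})
    return vocab, [[d.split().count(t) for d in docs] for t in vocab]


def top_eigenpairs(G, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(G)
    G = [row[:] for row in G]  # work on a copy
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]  # fixed, non-symmetric start vector
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs


def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def lsi_scores(docs, query, k=2):
    """Cosine similarity of the query to each document in a k-dim latent space."""
    vocab, A = term_doc_matrix(docs)
    n, m = len(docs), len(vocab)
    # Gram matrix G = A^T A; its eigenvectors are the right singular vectors of A.
    G = [[sum(A[t][i] * A[t][j] for t in range(m)) for j in range(n)] for i in range(n)]
    q = [query.split().count(t) for t in vocab]
    q_vec, doc_vecs = [], [[] for _ in range(n)]
    for lam, v in top_eigenpairs(G, k):
        sigma = math.sqrt(lam)
        # Left singular vector u = A v / sigma (one latent "topic" over terms).
        u = [sum(A[t][j] * v[j] for j in range(n)) / sigma for t in range(m)]
        q_vec.append(sum(u[t] * q[t] for t in range(m)))  # project the query
        for j in range(n):
            doc_vecs[j].append(sigma * v[j])  # document j's coordinate on this topic
    return [cosine(q_vec, d) for d in doc_vecs]


if __name__ == "__main__":
    for doc, score in zip(DOCS, lsi_scores(DOCS, QUERY)):
        print(f"{score:6.3f}  overlap={keyword_overlap(doc, QUERY)}  {doc}")
```

In this toy setup, the second document shares no words with the query yet lands right next to it in the latent space, because it shares “valley,” “girl,” and “slang” with the pages that do contain the query words. The search marketing documents score near zero. Real collections are far messier, so treat this as a picture of the concept rather than a ranking recipe.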

So what does this mean for search marketers? Anyone can easily do a search to find the terms that Google considers relevant to certain keyphrases. Simply do a search for ~search marketing. The ~ character causes semantically related terms to appear in bold in the search results so that terms like online marketing and Internet marketing appear. It might be a good idea to include some of this terminology along with target keyphrases in order to take full advantage of latent semantic indexing and increase the relevancy of your pages.


About the Author: Peter Hamilton is the Project Manager in charge of the Seattle office of ArteWorks SEO. His interest and experience in search engine optimization are largely focused on social media optimization and the multimedia facets of exposure, specifically video SEO. To learn more about this search engine optimization company, visit www.arteworks.biz.



6 Comments:

At January 16, 2009 12:31 PM ,
Blogger theGypsy said...

Hi gang, Dave here... I wouldn't be throwing around LSA/I too quickly; it is how the myth gets perpetuated. Google originally purchased Applied Semantics in 2003 for the ad serving stuff (AdSense/AdWords), and there are many legitimate arguments why it never made it to the regular organic side of things.

I find it best to talk about it in terms of semantic systems, not LSI per se. You might want to look into PLSI, HTMM and even phrase based indexing and retrieval as well. They are all things that relate and have also been looked at by Google.

Of note is the phrase based stuff as Google also purchased related technology (from Anna Patterson) shortly after the Applied purchase... so we could infer that is being used (equally dangerous assumptions).

In the end it's best to simply talk about semantic relations and how they apply in SEO... not misleading peeps with that which we DON'T know.

... have a great weekend

  At January 16, 2009 3:40 PM ,
Blogger Peter Hamilton - Arteworks SEO said...

Thanks so much for the insight Dave! After hearing your argument, I would agree that we "don't know," exactly how Google handles semantic relations. I would also agree that considering how semantic relations impact SEO is definitely the moral of the story. I don't know much about Hidden Topic Markov Models, so thanks for mentioning them. I'll check 'em out!

  At January 16, 2009 4:47 PM ,
Blogger theGypsy said...

Not a problem at all - feel free to get in touch and talk shop, compare notes and the like.

I sent you a lead via Twitter and I would also check out some phrase based indexing and retrieval... and Microsoft has some interesting semantic papers and patents. The main thing to express is the related concepts, more than us guessing at specifics. Besides, simply understanding these concepts can be some heavy lifting, one tries to make it more malleable for the general SEO public.

It's always great to see some more technical topics getting tackled, so kudos on that.

Talk soon... Dave

  At January 16, 2009 5:45 PM ,
Anonymous Anonymous said...

Did you ever think to look at the site's link profile? There are quite a few sites linking to that page using the term "gag me with a spoon" as the anchor text.

  At January 16, 2009 9:54 PM ,
Anonymous Online Internet Faxing said...

At least he who has the most links will generally be at the top before all the related phrases. I doubt it would be that hard at all to beat out that wiki page for the phrase with that kind of keyword density.

  At January 17, 2009 12:07 AM ,
Blogger Matt Foster, CEO, ArteWorks SEO said...

But isn't that the point, Anonymous? That LSI uses such things as inbound linkage, et cetera, to determine related terms?

 


Monday, March 3, 2008

Returning from SMX West

This past week, ArteWorks attended the Search Marketing Expo (SMX) in Santa Clara, California, and I must say the experience is well worth it for anyone interested in any type of search engine marketing. Representatives from all of the major search engines, as well as the most knowledgeable people in our field, were in attendance. The seminars were focused, the keynotes were inspirational, but most of all, the networking was tremendous. This was the first conference I attended as a representative for ArteWorks, and getting to know so many brilliant and fun people in the industry was certainly a highlight.

The conference started Monday evening with a networking bash (with drinks provided), which was a great way to break the ice with people I had only known in the blogosphere. The next morning the keynote started early, led by Danny Sullivan of Search Engine Land, focusing mainly on blended, personalized, and social search. After a break, the seminars began. The next three days would be a solid chunk of focused info sessions ranging from social media, to video, to reputation management, to Q & A with search engine engineers, and beyond. Each seminar was led by a panel of experts in a particular field and was open to email questions at the end. I concentrated on the seminars oriented toward social media and blended search, as this is what I am most interested in, and I found the presentations to be quite useful. Though I am already familiar with most of the concepts and strategies addressed, I felt like I was going down the laundry list of factors that make up a successful Internet marketing campaign and reevaluating my approach. For someone new to the industry, these types of seminars would be absolutely invaluable.

Now on to the parties! Each night there was an organized “networking” function that encouraged chatting with new people in the industry while letting loose a little bit. I felt that the atmosphere was extremely open and friendly, and a fantastic avenue for getting to know some of the most knowledgeable people in Search as well as those just getting started. Not only did I make some of the most valuable business connections ever, I genuinely had a great time doing it. We are not alone, all you Search Marketers out there!

If you have ever considered attending one of these events, I hope you will take my recommendation and make it happen. SMX offers so many great advantages in this industry, and it would be a shame to miss out.


About the Author: Peter Hamilton is the Project Manager in charge of the Seattle office of ArteWorks SEO. He has a Bachelor's degree in radio, television and film and extensive experience in social media marketing. Mr. Hamilton also heads up the ArteWorks SEO educational video series on topics related to Internet marketing and search engine optimization.


