What does Latent Semantic Analysis mean in SEO?
Latent Semantic Analysis (LSA) is a process used by all major search engines to achieve two goals:
- Understand what a web page is about
- Identify web pages with in-depth content versus those that contain thin articles - regardless of how many words are used
After carrying out LSA, search engines know what a page is relevant to and can then, along with other factors such as authority and load speed, decide where to place it in their index. This is known as Latent Semantic Indexing. Be aware that in SEO, Latent Semantic Analysis and Latent Semantic Indexing are often used to mean the same thing, although this isn't strictly true.
Let's take an example.
I want to write a blog post about 'The best American cars of all time'. A poorly written article would repeat the word 'cars' regularly. A well-written article would use alternative words and phrases with a similar meaning, and we would expect to see both the singular and the plural in use. Search engines know these alternatives all have a similar meaning to 'cars'. How? Well, any thesaurus will tell you, and search engines have their own.
But if it is a really good article, search engines would also expect to see certain other words, such as brand names (Ford, Chrysler, General Motors, etc.) and words related to cars (wheels, steering, seats, dashboard, etc.).
These secondary related words are clear signals of the depth of the content, and their absence could suggest that the content is not actually what it claims to be.
If, instead of the secondary related words, a search engine found words like casino, winner, bet, etc., these are signs that the content is masquerading as one thing but is really about another.
I stress the word 'instead' because obviously your article might contain these words, like this:
"You wouldn't be out of place driving a 1964 Cadillac to a casino. With one in your front drive, would your neighbors have seen you as a winner? You bet."
That's not going to get you dinged by a search engine if those words are used sparingly while the secondary related keywords and phrases dominate.
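To make the idea concrete, here is a toy Python sketch of that kind of check. It is only an illustration under my own assumptions, not how any search engine actually works: the word lists ON_TOPIC and OFF_TOPIC and both function names are made up for this example, and a real system would weigh far more than a simple count.

```python
import re

# Hypothetical word lists, invented for this illustration only.
ON_TOPIC = {"ford", "chrysler", "cadillac", "wheels", "steering", "seats", "dashboard"}
OFF_TOPIC = {"casino", "winner", "bet"}


def topic_signal(text):
    """Return (on_topic_hits, off_topic_hits) for the words in text."""
    words = re.findall(r"[a-z]+", text.lower())
    on_hits = sum(1 for w in words if w in ON_TOPIC)
    off_hits = sum(1 for w in words if w in OFF_TOPIC)
    return on_hits, off_hits


def looks_on_topic(text):
    """True when on-topic words outnumber off-topic ones."""
    on_hits, off_hits = topic_signal(text)
    return on_hits > off_hits
```

Taken on its own, the Cadillac sentence above would fail this check (one car word against three gambling words). Dropped into a full article packed with car-related words it would pass, which is exactly the point.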
As you are probably coming to realize, Latent Semantic Analysis is a complex business, and it's usually done by complex computer-driven algorithms.
LSA also has its part to play in spotting spun content. This is when the same article has been rewritten several times with each rewrite simply replacing a few words with alternatives like this:
"I grew up in Alabama" -> "I was raised in Alabama" -> "I spent my childhood in Alabama"
Search engines know all three phrases mean the same thing. So if one found three different pages on your site where entire articles were essentially the same, it might decide now is the time to blow your site out of the water by penalizing it for duplicating content in an attempt to rank higher.
If you are interested in knowing more about how search engines do this sort of thing, have a read up on shingles (no, not that).
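If you want a feel for shingles, here is a minimal Python sketch. A shingle is just a short run of consecutive words, and two pages that share most of their shingles are probably near-duplicates. Note that this measures word overlap rather than meaning: spun articles get caught because most of the text around the swapped words stays the same. The function names, the shingle size of 3 and the longer example sentences are my own choices for illustration, not anything a search engine has published.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up sentences that extend the Alabama example with some shared text.
original = "I grew up in Alabama and moved to Texas for work"
spun = "I was raised in Alabama and moved to Texas for work"

similarity = jaccard(shingles(original), shingles(spun))
print(similarity)  # 0.5
```

Swapping just two words still leaves half of the shingles identical. Over a full-length article with only a few words changed, the score would sit much closer to 1.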
Good content, bad SEO
LSA also has a role to play in figuring out what something is about even if the SEO is not up to scratch. Let's say I wrote an article called "What to do this weekend" and in it I mentioned Covent Garden, River Thames, Richmond Park, South Bank, Borough Market, Big Ben and Hammersmith Apollo.
Search engines know these are all place names related to London, so I must mean "What to do in London this weekend". They can then get on with assessing the quality of the piece by looking for secondary keywords like tickets, opening times, queue, price, etc.
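As a rough illustration of that place-name reasoning, here is a small Python sketch that guesses a city from the places an article mentions. The PLACE_TO_CITY lookup and the guess_city function are hypothetical and tiny. A real search engine would draw on a vastly larger knowledge base and far smarter matching than a plain substring count.

```python
from collections import Counter

# Hypothetical mini-gazetteer, invented for this illustration.
PLACE_TO_CITY = {
    "covent garden": "London",
    "river thames": "London",
    "richmond park": "London",
    "south bank": "London",
    "borough market": "London",
    "big ben": "London",
    "central park": "New York",
    "times square": "New York",
}


def guess_city(text):
    """Return the city whose known places appear most often in text, or None."""
    lowered = text.lower()
    votes = Counter()
    for place, city in PLACE_TO_CITY.items():
        votes[city] += lowered.count(place)
    city, hits = votes.most_common(1)[0]
    return city if hits > 0 else None
```

Feed it a weekend guide that mentions Borough Market, the South Bank and Big Ben and it returns "London", even though the word London never appears in the text.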