What does Natural Language Processing (NLP) mean in SEO?
Natural Language Processing (NLP) is set to become a key part of how search engines rank web pages, with Google already leading the field. So what is it, and what should you be doing about it?
Early attempts to get computers to answer questions from humans assumed that humans asked carefully worded questions and used correct grammar and spelling. Of course, they don't, and languages (especially English) can evolve rapidly, with thousands of new words coming into use every year, not to mention new ways of using them. NLP uses Machine Learning to gather data about how people really talk and then tries to figure out what they really mean.
Google is well placed to do this because users carry out billions of queries every month on the search engine, and by watching which websites seem to meet their needs it gets to know those questions better.
As humans we find understanding vague questions fairly easy because we usually have context. If someone says to you "I want to buy a coat", you know the general weather conditions. If it's high summer they probably want a light coat or jacket. You may have some historical context - yesterday they told you that they were going to a wedding, so perhaps they need something elegant. You may know something about their tastes - color, cut, etc.
Traditionally, search engines had to cope without any of this background, although Google already considers our location and the websites we have visited in the past. So it's already starting to build context, but what happens when someone asks a question it has never seen before?
Natural Language Processing aims to overcome this without requiring a human to adjust bits of code. Rather than being told the grammar rules, it learns them. Instead of considering each word in a question individually, it considers the whole sentence and the way the words are used in that sentence.
"I want to buy a book", "I want to book a flight" - here 'book' has very different meanings depending on where it is used in the string of words.
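To make the 'book' example concrete, here is a minimal sketch that guesses whether a word is being used as a noun or a verb by looking at the word before it. The cue word lists and the `sense_of` function are my own illustrative assumptions. A real NLP system learns patterns like these from data rather than having them written by hand, so treat this as a toy, not as how any search engine does it.

```python
import re

# Hypothetical, hand-written cue words. A real NLP system would learn
# these patterns from large amounts of text instead.
NOUN_CUES = {"a", "an", "the", "this", "that", "my", "your"}
VERB_CUES = {"to", "will", "can", "must", "should"}

def sense_of(word, sentence):
    """Guess whether `word` is a noun or a verb from the word before it."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if word not in tokens:
        return "absent"
    i = tokens.index(word)  # first occurrence only
    previous = tokens[i - 1] if i > 0 else ""
    if previous in NOUN_CUES:
        return "noun"
    if previous in VERB_CUES:
        return "verb"
    return "unknown"

print(sense_of("book", "I want to buy a book"))     # noun
print(sense_of("book", "I want to book a flight"))  # verb
```

The same word gets a different label purely because of its neighbours, which is the point of considering the sentence rather than each word on its own.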
This is just the tip of the iceberg, with all sorts of fun things like stemming, segmentation, lemmatization, etc. If you really want to discover what goes into NLP, here's the Wikipedia page.
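For a rough feel of two of those terms, the sketch below does a very naive segmentation (splitting text into sentences and words) and a very naive stemming (chopping off a common suffix). The `segment` and `naive_stem` functions and the suffix list are assumptions made up for illustration. Real stemmers use far more rules, and lemmatization goes further by looking words up in a dictionary.

```python
import re

def segment(text):
    """Split text into sentences, then each sentence into lowercase words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [re.findall(r"[a-z]+", s.lower()) for s in sentences]

# Toy suffix list, checked in this order. Real stemmers have many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    """Strip one common suffix, keeping at least three letters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

tokens = segment("She books flights. He booked a flight!")
stems = [[naive_stem(w) for w in sentence] for sentence in tokens]
print(stems)
# [['she', 'book', 'flight'], ['he', 'book', 'a', 'flight']]
```

After stemming, "books" and "booked" both reduce to "book", which is how different forms of a word can be treated as the same term.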
My focus here is what this means for Search Engine Optimization.
Why search engines want to master NLP
Search engines are primarily focused on answering "What is the best website page to answer this?" for any search that a user carries out. Traditionally this has come down to two steps:
- Relevancy - which website pages contain keywords or phrases similar to the ones in the search, and use these keywords or phrases in particular ways (regularly - but not too much - in the text, in the title tags, in the URLs, etc.)?
- Authority - which of the relevant website pages have links from other respectable websites with a similar theme?
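The two steps above can be sketched as a toy ranking function: first filter pages by a crude keyword relevance score, then order what is left by the number of inbound links. Everything here (the `relevance` formula, the `min_relevance` threshold, the page data and the link counts) is a made-up assumption for illustration, not how Google or any other search engine actually scores pages.

```python
import re

def relevance(query, title, body):
    """Toy relevance: share of body words matching the query, plus a title bonus."""
    terms = set(re.findall(r"[a-z]+", query.lower()))
    body_words = re.findall(r"[a-z]+", body.lower())
    title_words = set(re.findall(r"[a-z]+", title.lower()))
    if not body_words or not terms:
        return 0.0
    density = sum(w in terms for w in body_words) / len(body_words)
    title_bonus = len(terms & title_words) / len(terms)
    return density + title_bonus

def rank(query, pages, min_relevance=0.1):
    """Step 1: keep relevant pages. Step 2: order them by inbound links."""
    relevant = [p for p in pages
                if relevance(query, p["title"], p["body"]) >= min_relevance]
    return sorted(relevant, key=lambda p: p["links"], reverse=True)

# Hypothetical pages and link counts.
pages = [
    {"url": "/old-guide", "title": "Winter coats guide",
     "body": "Our guide to winter coats and jackets.", "links": 120},
    {"url": "/new-guide", "title": "Choosing winter coats",
     "body": "How to choose winter coats for cold, wet weather.", "links": 2},
    {"url": "/umbrellas", "title": "Umbrellas",
     "body": "Cheap umbrellas break quickly.", "links": 300},
]
print([p["url"] for p in rank("winter coats", pages)])
# ['/old-guide', '/new-guide']
```

The umbrella page has the most links but is filtered out as irrelevant, and among the relevant pages the order is decided entirely by link count.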
Using links was a great way to sort the quality from the rubbish in the early days of the Internet, but it has a fundamental flaw. Websites which rank highly for a particular search term tend to gain more links over time. If I want to make a point, I go to Google, find the first article which supports my point and link to it. But that does not mean it is the 'best' page on the Internet.
The 'best' page may have been added to the web last month, but Google is ignoring it because it has no links. The author of the 'best' content would need to hire link builders, or have access to other online marketing resources such as a large social media following, in order to make large numbers of people aware of it, some of whom would then link to it. But even via this route it may take weeks or months to get that content ranking well enough for it to earn more links naturally and eventually knock the current number one result from its position.
That's not a very good way of deciding what content is 'best' - and it's not going to help Google maintain its position as the world's number one search engine.
A human, on the other hand, can be shown two articles and, without being told which is more popular, quickly conclude which is 'best'. Search engines want to be able to do the same and make popularity a secondary factor - and yes, secondary, because I don't think links will be discounted completely any time soon. To me 'popularity' is a natural part of 'best'.
What we will see is a tipping of the balance because Google will become better at understanding relevancy and it will be harder to game it.
How to SEO a page for NLP
So what should you be doing to search engine optimize with natural language processing in mind?
- Test your content using a tool like Google's natural language text analysis to see your Entity Salience (what Google thinks your page is about)
- Look at Google's current top 20 for a keyword you are trying to rank higher for and see what words they are using and what related topics they are covering - there is a good chance Google expects to see these on 'quality' pages.
- Use Google Search Console to find related phrases that a page on your website already ranks for, but poorly - these are phrases with a lot of impressions but few or no clicks. Could you add more comprehensive content to address these, i.e. to increase the salience of those entities?
- Check spelling and grammar - yes, NLP is about going beyond these things, but why make it harder for search engines to figure out what you mean?
- Remove waffle. Sentences which mean nothing dilute your content. "Cheap umbrellas break quickly. You wouldn't want that happening to you, right?" The second sentence is padding and adds no value. Google can see it. It just dilutes your entity salience.
- Remove ambiguity. "This product is now reduced in price, it's unbeatable". What is unbeatable, the product or the price? Better: "This product is now reduced in price, it's unbeatable value for money".
- Use headings (h tags) to help search engines understand that different parts of your content have different salient entities (subject areas).
- Provide transcripts of video or audio - Google is getting better at understanding these, but get ahead of the game. For video uploaded to YouTube, upload the transcript as well.
- Understand Latent Semantic Analysis - which words mean the same thing, and are you using them to show that your content is rich, easy-to-read text?
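Two of the checks above lend themselves to a quick script: comparing your wording with the pages that already rank, and spotting phrases with lots of impressions but few clicks. The sketch below is only a starting point. The function names, the stopword list, the thresholds and the row format (a list of dicts with `query`, `impressions` and `clicks`) are all assumptions of mine, so adapt them to however you export your own data.

```python
import re
from collections import Counter

# Small, hypothetical stopword list. Extend it for real use.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in",
             "for", "is", "are", "it", "on", "with"}

def words(text):
    """Lowercase words in the text, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def missing_terms(your_text, competitor_texts, min_pages=2):
    """Terms used on at least `min_pages` competitor pages but absent from yours."""
    page_counts = Counter()
    for text in competitor_texts:
        page_counts.update(set(words(text)))  # count each term once per page
    yours = set(words(your_text))
    return sorted(t for t, n in page_counts.items()
                  if n >= min_pages and t not in yours)

def weak_queries(rows, min_impressions=500, max_ctr=0.01):
    """Queries with many impressions but few clicks, highest impressions first."""
    weak = [r for r in rows
            if r["impressions"] >= min_impressions
            and r["clicks"] / r["impressions"] <= max_ctr]
    return sorted(weak, key=lambda r: r["impressions"], reverse=True)

# Made-up example data.
competitors = [
    "Waterproof winter coats with a warm lining",
    "Warm waterproof coats for winter walks",
    "Coats with a hood and warm lining",
]
print(missing_terms("Winter coats for sale", competitors))
# ['lining', 'warm', 'waterproof']

rows = [
    {"query": "waterproof winter coat", "impressions": 1200, "clicks": 3},
    {"query": "winter coats", "impressions": 900, "clicks": 80},
    {"query": "coat hood repair", "impressions": 40, "clicks": 0},
]
print([r["query"] for r in weak_queries(rows)])
# ['waterproof winter coat']
```

In this made-up data the two results point the same way: the page never mentions "waterproof", and "waterproof winter coat" is the phrase getting impressions without clicks.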