What does Second Wave Indexing mean in SEO?
Second Wave Indexing is how search engines cope with websites that are difficult to read. What exactly is it, and are you holding back your Search Engine Optimization because your website demands it?
To understand Second Wave Indexing, you first need to understand how your device shows you a page on a website. So ....
- You decide to open howtoseo.link2light.com
- Your browser requests the source code of that web page from my hosting company and reads it line by line
- Early on in the code it will come across a reference to another file (the CSS file), so it will request this from the server too. The CSS file defines what things should look like: how big the text should be, the background colors of various areas on the page, how wide certain elements should be, and so on. There might be multiple CSS files that it needs to request.
- Then it will come across references to Javascript files. Each of these also needs to be requested from my hosting company and downloaded. Javascript files generally tell the browser how to handle certain user interactions. For example, as you scroll the page, the red box on the right-hand side which says 'Get Help Ranking' scrolls with you.
- Your browser puts all these things together and then shows you the page.
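As a rough sketch, the source code your browser receives might look something like this (the file names and content here are invented for illustration):

```html
<!DOCTYPE html>
<html>
<head>
  <title>How To SEO</title>
  <!-- One extra request: the CSS file that defines how everything should look -->
  <link rel="stylesheet" href="/css/main.css">
</head>
<body>
  <h1>What does Second Wave Indexing mean in SEO?</h1>
  <p>This text sits directly in the source code, so it arrives with the very first request.</p>

  <!-- More extra requests: Javascript files that handle user interactions,
       like keeping the 'Get Help Ranking' box on screen as you scroll -->
  <script src="/js/sticky-sidebar.js"></script>
</body>
</html>
```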
Google's crawler robot wanders around the web searching for new websites it doesn't know about or collecting changes that have occurred on the ones it already has in its database. The crawler doesn't open the website in the same way as you do.
Here's how it works:
- The robot only reads the source code; it ignores the other files referenced in that code, such as the CSS file which dictates how the page should look (text sizes, background colors and so on) or the JS files (Javascript, which usually drives the interactive parts of the page)
- Later - and this could mean days, weeks or months - Google looks at the page as a human would, i.e. it renders the page with all the external files taken into account.
The first step is often referred to as Headless Browsing: the robot behaves like a browser without a 'head', meaning it never visually renders the page or fetches the files needed to do so. Don't take the term too literally, though - it simply describes a crawler that reads code rather than looking at the finished page.
The rendering which happens later is referred to as Second Wave Indexing. Google is checking whether the first crawl missed anything that can only be seen once the external files are loaded and the page is fully rendered, just as it would be in a browser in front of a person.
Second Wave Indexing has a fundamental impact on SEO because, if you are not showing all your content in the basic source code, Google isn't going to rank you anywhere until it has done its full rendering. If links to other parts of your website are only displayed after full rendering, Google will only become aware of them once it does its full render, which means it can take months for it to discover all your pages.
In fact, sometimes it never does, because at some point it will go back to the first page it discovered and start over (just to check nothing has changed). It won't keep going deeper and wider, because Google has a crawl budget for each domain - a finite amount of resource it can spend on your website. The quieter your site, the smaller the crawl budget.
Those with websites that use heavy Javascript to display their main content suffer most from this. Here's an example:
- You decide to open samplewebsite.com
- Your browser requests the source code of that web page from the hosting company and reads it line by line.
- An external Javascript file tells the browser to display different content depending on your screen size (sketched below).
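A rough sketch of that setup, with invented file and element names, might look like this:

```html
<!-- samplewebsite.com's source code: the page body is essentially empty,
     and the real content only arrives via the external Javascript file -->
<!DOCTYPE html>
<html>
<body>
  <div id="main-content"></div>
  <script src="/js/content-loader.js"></script>
</body>
</html>
```

The external file, /js/content-loader.js, would then contain something along these lines:

```javascript
// The browser picks which content to inject based on the screen width.
var target = document.getElementById('main-content');
if (window.innerWidth < 768) {
  target.innerHTML = '<p>Short mobile version of the article ...</p>';
} else {
  target.innerHTML = '<p>Full desktop version of the article ...</p>';
}
```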
Now, the Google crawler never fires this Javascript on the first visit, so to Google the page would look virtually empty ... nothing of interest here. That's why for all crucial content I do one of the following:
- Avoid using Javascript to load crucial content - this is my approach on howtoseo.link2light.com
- Make sure your Javascript code is not in an external file - this is not guaranteed to work, though. In my example the Javascript decided what content to load depending on screen size, but a crawler has no screen size.
- Have comprehensive content in the basic code which then gets replaced depending on the user's screen size - this way the crawler always sees at least one version of the content (sketched below).
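That third option might look something like this (again, the names and the breakpoint are invented for illustration):

```html
<!-- The full default content sits in the source code, so the first-wave
     crawler (and anyone whose browser doesn't run the script) still gets it -->
<div id="main-content">
  <p>Full default version of the article, visible in the raw source code.</p>
</div>
<script>
  // Only swap in the shorter version on small screens, inside the browser
  if (window.innerWidth < 768) {
    document.getElementById('main-content').innerHTML =
      '<p>Short mobile version of the article ...</p>';
  }
</script>
```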
To see if your website is hiding anything from Google, open the site, find a blank space, right-click and select 'View Source'. The option might be worded slightly differently depending on the browser, but it should be quite obvious.
You'll usually find hundreds or thousands of lines of code here. Press Ctrl and F at the same time to open a search box and then try searching for the text of key elements of your content - the title of one of your menu options, a sentence from the main body, etc. If anything you can see on the website is missing here, it is probably because it is hidden away in something like Javascript.
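If you would rather automate that check, a small script along these lines does the same job as View Source plus Ctrl+F (it assumes Node.js 18 or later; the URL and phrase are placeholders for your own page and content):

```javascript
// Fetch the raw source code - roughly what the first-wave crawler sees -
// and check whether a phrase from the visible page is actually in it.
const url = 'https://howtoseo.link2light.com/';
const phrase = 'Second Wave Indexing';

fetch(url)
  .then((response) => response.text())
  .then((html) => {
    if (html.includes(phrase)) {
      console.log('Found in the source code - the first crawl can see it.');
    } else {
      console.log('Not in the source code - it probably only appears after rendering.');
    }
  })
  .catch((err) => console.error('Request failed:', err));
```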
If you want to see an example of the issue look at the source code of https://growth.org/discuss/should-neil-patel-take-down-this-ad-is-it-sexist-downright-offensive
None of the main content or the hundreds of comments can be seen in the source code. Only when a search engine carries out its second wave indexing will it get some idea of what the page is about.
Web Developers love using Javascript to make sexy websites, and who can blame them - you can do all sorts of fun things that make for a better user experience ... like showing more relevant content depending on screen size. But Web Developers don't always understand SEO, and so they can end up making life very difficult for search engine robots. There are ways you can do both, and any good Web Developer will know how.
Summing up
Second Wave Indexing is a way search engines make sure they get all of a website's content, but it takes time. If you can't see your key content in the source code, then your site needs second wave indexing, which means it's going to take longer for your work to appear in the search results - and longer for updates to be seen.
I've never seen a case where this needs to be so. If achieving high levels of visitors via the search results is your aim, then you need to make sure you aren't on a Javascript-heavy platform, or move off one if you are.
Perhaps one day someone will show me a website where it is absolutely essential. In that case, just make sure that some default content is added before the Javascript call so the search engines get something they can crawl straight away.