The Bazaar Voice
Strategies, research, industry trends — your pulse on the marketplace

If you have ever been around someone doing a technical SEO audit, you are probably familiar with the View Source feature in a web browser, which shows the initial HTML markup used to create a page. Since the beginning of SEO time (about 1995), it has been well accepted that search engines read this version of web pages, commonly referred to as server-side markup. In many cases, server-side markup includes only a small portion of the total content seen by humans; client-side technologies like JavaScript and AJAX are used to finish the page. Because of this limitation, search engines have been blind to a significant percentage of the content on the Internet for nearly 20 years – ever since Excite, the first modern search engine, launched.
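To make that gap concrete, here is a minimal, hypothetical product page (not taken from any real client): the server-side markup contains only an empty container, and the review content a shopper sees is injected by JavaScript after the page loads, so it never appears in View Source.

<!-- Hypothetical example: this is all that View Source would show. -->
<html>
  <head><title>Example Product</title></head>
  <body>
    <h1>Example Product</h1>
    <!-- Empty container: the reviews a shopper sees are not in the server-side markup. -->
    <div id="reviews"></div>
    <script>
      // Client-side code fills in the content after the page loads.
      document.getElementById('reviews').innerHTML =
        '<p>4.5 out of 5 stars based on 120 reviews</p>';
    </script>
  </body>
</html>

A traditional crawler reading only the server-side markup would find an empty reviews container and miss the rating entirely.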

In 2011, Google announced plans to solve this problem. Most SEOs were intrigued but doubtful, despite a Matt Cutts tweet about the growing capabilities, and skepticism only deepened as very little change appeared in the years that followed.

On October 27th, 2014, Google posted a seemingly innocuous update to their Technical Webmaster Guidelines that went mostly unnoticed by the SEO community. It instructed webmasters to stop blocking JavaScript and CSS files in robots.txt and warned of suboptimal rankings if the crawler was unable to read these files. The gravity of the change, however, was not evident in the blog post.
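As a simple illustration (the directory paths here are hypothetical), the guideline means removing robots.txt rules like the following, which keep Googlebot from fetching the scripts and stylesheets it needs to render a page the way a browser does:

# Hypothetical robots.txt rules that now put rankings at risk:
# blocking script and stylesheet directories prevents Googlebot
# from rendering the page as a browser would.
User-agent: *
Disallow: /js/
Disallow: /css/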

For the SEO team at Bazaarvoice, Google’s announcement confirmed and explained unusual Googlebot crawl behavior that we had noticed a few weeks earlier. The changes were so unusual that we began monitoring Google search results daily for a number of client websites. Site after site, we witnessed the emergence of the new Googlebot, which was clearly interpreting web pages in an entirely new way. The most visible impact? The appearance or disappearance of rich snippet stars in search results.

We then experimented with a few client sites and confirmed that Google no longer indexes the version of web pages that we see when clicking View Source.

For anyone familiar with SEO audits, this is groundbreaking. A core, 20-year-old principle of SEO is no more.

Now, the Google algorithm analyzes the HTML markup that is available when using Inspect Element, a feature of the developer toolkit in popular browsers like Chrome and Safari. When it comes to SEO for Google, it’s time to stop looking at View Source. You must audit the Inspect Element code to fully understand the content and markup hierarchy that Google is interpreting.
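As a rough sketch (our own illustration, not something from Google’s post), the two views can also be compared programmatically: fetch the raw server-side HTML, then load the same URL in a headless browser and capture the rendered DOM. This assumes Python with the requests and selenium packages and a local ChromeDriver; the URL is a placeholder.

# Sketch: compare server-side HTML ("View Source") with rendered HTML
# ("Inspect Element"). Assumes requests, selenium, and ChromeDriver are installed.
import requests
from selenium import webdriver

url = "https://www.example.com/product/123"  # placeholder URL

# 1. Server-side markup: what View Source (and a traditional crawler) sees.
raw_html = requests.get(url, timeout=30).text

# 2. Rendered markup: what Inspect Element (and the new Googlebot) sees
#    after JavaScript has executed.
options = webdriver.ChromeOptions()
options.add_argument("--headless")
driver = webdriver.Chrome(options=options)
try:
    driver.get(url)
    rendered_html = driver.page_source
finally:
    driver.quit()

# A large difference in size or content is a strong hint that important
# material (reviews, ratings, navigation) exists only in the rendered version.
print(f"Server-side HTML: {len(raw_html):,} characters")
print(f"Rendered HTML:    {len(rendered_html):,} characters")

Auditing both outputs side by side is the closest programmatic equivalent of comparing View Source with Inspect Element.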

Of course, Google isn’t the only search engine. We are also monitoring the behavior of Bing, Yahoo, Yandex, and other search engines. We have evidence that Bing, Yahoo, and Yandex are experimenting with JavaScript-enabled bots, but our data indicates that these engines still use traditional bots more than 95% of the time. We are optimistic that they will catch up before 2016. Until then, SEO professionals must perform technical audits on both the View Source HTML and the rendered HTML (Inspect Element).

For the SEO community, this is the beginning of a new era. Our understanding of how search engines crawl websites has not changed significantly since the beginning. While algorithms have changed many times, the bots that crawl and collect the data have been mostly consistent.

Google’s October 27th blog post marks a pivotal moment in search engine history – 20 years of assumptions, habits, and tools must now be challenged. Note, too, that many SEO auditing tools, including the Google Structured Data Testing Tool, have not yet been updated to reflect this new paradigm.

As we wrap up the first blog post in this series, take note that this massive Google change was not named. What does that mean? Is it significant? I believe it is. Updates such as Panda, Penguin, Hummingbird, Freshness, and Page Layout all affected the Google algorithm, and therefore should affect content strategies. This one is different. As you seek to gain clarity about this change, understand that the Google Algorithm is separate from the Googlebot. The Googlebot’s job is to crawl, collect, and place content in the Google Index. The Google Algorithm’s job is to interpret what is in the index. This update did not affect the Algorithm and therefore does not impact content strategies. It is about the technologies used to build a complete web page, and about the Googlebot’s ability to read that content.

This blog is the first in an ongoing series on SEO and the impact of consumer-generated content on search engines. The next post will address updates to Bazaarvoice’s Technical Best Practices related to this change and to schema.org markup.
