How to Use Search Analytics in Google Sheets for Better SEO Insights

Posted by mihai.aperghis

This post was originally in YouMoz, and was promoted to the main blog because it provides great value and interest to our community. The author’s views are entirely his or her own and may not reflect the views of Moz, Inc.

As an SEO, whether you’re working in-house or handling many clients in an agency, you’ve likely been using Google Search Console’s Search Analytics for a bunch of reasons. Whether it’s diagnosing traffic and position changes or finding opportunities for optimizations and content ideas, Search Analytics has been at the core of most SEOs’ toolsets.

The scope of this small guide is to give you a few ideas on how to use Search Analytics together with Google Sheets to help you in your SEO work. As with the guide on how to do competitive analysis in Excel, this one is also focused on a tool that I’ve built to help me get the most out of Search Analytics: Search Analytics for Sheets.

The problem with the Search Analytics UI

Sorting out and managing data in the Google Search Console Search Analytics web UI in order to get meaningful insights is often difficult to do, and even the CSV downloads don’t make it much easier.

The main problem with the Search Analytics UI is grouping.

If you’d like to see a list of all the keywords in Search Analytics and, at the same time, get their corresponding landing pages, you can’t do that. You instead need to filter query-by-query (to see their associated landing pages), or page-by-page (to see their associated queries). And this is just one example.

Search Analytics Grouping

Basically, with the Search Analytics UI, you can’t do any sort of grouping on a large scale. You have to filter by each keyword, each landing page, each country, etc. in order to get the data you need, which would take a LOT of time (and possibly a part of your sanity as well).

In comes the API for the save

Almost one year ago (and after quite a bit of pressure from webmasters), Google launched the official API for Search Analytics.

Official Google Webmaster Central Blog Search Analytics API

With it, you can do pretty much anything you can do with the web UI, with the added benefit of applying any sort of grouping and/or filtering.

Excited yet?

Imagine you can now have one column filled with keywords, the next column with their corresponding landing pages, then maybe the next one with their corresponding countries or devices, and have impressions, clicks, CTR, and positions for each combination.

Everything in one API call

Query       Page   Country   Device    Clicks   Impressions   CTR      Position
keyword 1   –      usa       DESKTOP   92       2,565         3.59%    7.3
keyword 1   –      usa       MOBILE    51       1,122         4.55%    6.2
keyword 2   –      gbr       DESKTOP   39       342           11.4%    3.8
keyword 1   –      aus       DESKTOP   21       55            38.18%   1.7
keyword 3   –      usa       MOBILE    20       122           16.39%   3.6
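
If you ever want to skip the UI entirely, the same request can be made directly against the API. Below is a minimal sketch using Google’s API client library for Python; the site URL, date range, and service-account key file are placeholders, and the service account would need to be added as a user on the Search Console property first.

    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    # Placeholder credentials and property.
    SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
    creds = service_account.Credentials.from_service_account_file(
        "service-account.json", scopes=SCOPES)
    service = build("webmasters", "v3", credentials=creds)

    response = service.searchanalytics().query(
        siteUrl="https://www.example.com/",
        body={
            "startDate": "2016-08-01",
            "endDate": "2016-08-31",
            "dimensions": ["query", "page", "country", "device"],
            "rowLimit": 5000,
        },
    ).execute()

    # Each row's "keys" list follows the order of the dimensions above.
    for row in response.get("rows", []):
        query, page, country, device = row["keys"]
        print(query, page, country, device,
              row["clicks"], row["impressions"], row["ctr"], row["position"])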

Getting the data into Google Sheets

I have traditionally enjoyed using Excel but have since migrated over to Google Sheets due to its cloud nature (which means easier sharing with my co-workers) and expandability via scripts, libraries, and add-ons.

After being heavily inspired by Seer Interactive’s SEO Toolbox (an open-source Google Sheets library that offers some very nice functions for daily SEO tasks), I decided to build a Sheets script that would use the Search Analytics API.

I liked the idea of speeding up and improving my daily monitoring and diagnosing for traffic and ranking changes.

Also, using the API gave me a pretty useful feature: automatically backing up GSC data once a month. (Before, you needed to do this manually, use a paid Sheets add-on, or run a Python script.)

Once things started to take shape with the script, I realized I could take this public by publishing it into an add-on.

What is Search Analytics for Sheets?

Simply put, Search Analytics for Sheets is a (completely free) Google Sheets add-on that allows you to fetch data from GSC (via its API), grouped and filtered to your liking, and create automated monthly backups.

If your interest is piqued, installing the add-on is fairly simple. Either install it from the Chrome Web Store, or:

  • Open a Google spreadsheet
  • Go to Add-ons -> Get add-ons
  • Search for Search Analytics for Sheets
  • Install it (it’ll ask you to authorize a bunch of stuff, but you can sleep safe: the add-on has been reviewed by Google, and no data is saved, monitored, or used in any way other than grabbing it and putting it into your spreadsheets).

Once that’s done, open a spreadsheet where you’d like to use the add-on and:

Search Analytics for Sheets Install

  • Go to Add-ons -> Search Analytics for Sheets -> Open Sidebar
  • Authorize it with your GSC account (make sure you’re logged in to Sheets with your GSC account, then close the window once it says the authorization was successful)

You’ll only have to do this once per user account, so once you install it, the add-on will be available for all your spreadsheets.

PS: You’ll get an error if you don’t have any websites verified on the account you’re logged in with.

How Search Analytics for Sheets can help you

Next, I’ll give you some examples of what you can use the add-on for, based on how I mainly use it.

Grab information on queries and their associated landing pages

Whether it is to diagnose traffic changes, find content optimization opportunities, or check for appropriate landing pages, getting data on both queries and landing pages at the same time can usually provide instant insights. Other than automated backups, this is by far the feature that I use the most, especially since it’s fairly hard to replicate the process using the standard web UI.

Best of all, it’s quite straightforward to do this and requires only a few clicks:

  • Select the website
  • Select your preferred date interval (by default it will grab the minimum and maximum dates available in GSC)
  • In the Group field, select “Query,” then “Page”
  • Click “Request Data”

That’s it.

You’ll now have a new sheet containing a list of queries, their associated landing pages, and information about impressions, clicks, CTR, and position for each query-page pair.

Search Analytics for Sheets Example 1

What you do with the data is up to you:

  • Check keyword opportunities

Use a Sheets filter to only show rows with positions between 10 and 21 (usually second-page results) and see whether those landing pages can be further optimized to push the queries to the first page. Maybe work a bit on the title tag, content, and internal linking to those pages.

  • Diagnose landing page performance

Check position 20+ rows to see whether there’s a mismatch between the query and its landing page. Perhaps you should create more landing pages, or there are pages that target those queries but aren’t accessible by Google.

  • Improve CTR

Look closely at position and CTR. Check low-CTR rows with associated high position values and see if there’s any way to improve titles and meta descriptions for those pages (a call-to-action might help), or maybe even add some rich snippets (they’re pretty effective in raising CTR without much work).

  • Find out why your traffic dropped
    • Had significant changes in traffic? Do two requests (for example, one for the last 30 days and one for the previous 30 days), then use VLOOKUP to compare the data (or see the pandas sketch after this list).
    • Positions dropped across the board? Time to check GSC for increased 4xx/5xx errors, manual actions, or faulty site or protocol migrations.
    • Positions haven’t dropped, but clicks and impressions did? Might be seasonality, time to check year-over-year analytics, Google Trends, Keyword Planner.
    • Impressions and positions haven’t dropped, but clicks/CTR did? Manually check those queries, see whether the Google UI has changed (more top ads, featured snippet, AMP carousel, “In the news” box, etc.)
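
If you’d rather do that period-over-period comparison outside of Sheets, here’s a rough pandas equivalent of the VLOOKUP approach mentioned above. The CSV file names and column headers are assumptions based on a typical export of the query + page data.

    import pandas as pd

    # Hypothetical exports of two requests: last 30 days and the 30 days before that.
    current = pd.read_csv("last_30_days.csv")        # Query, Page, Clicks, Impressions, CTR, Position
    previous = pd.read_csv("previous_30_days.csv")

    merged = current.merge(previous, on=["Query", "Page"],
                           suffixes=("_current", "_previous"), how="outer")
    merged["clicks_delta"] = merged["Clicks_current"].fillna(0) - merged["Clicks_previous"].fillna(0)
    merged["position_delta"] = merged["Position_current"] - merged["Position_previous"]

    # Biggest traffic losers first.
    print(merged.sort_values("clicks_delta").head(20))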

I could go on, but I should probably leave this for a separate post.

Get higher granularity with further grouping and filtering options

Even though I don’t use them as much, the date, country and device groupings let you dive deep into the data, while filtering allows you to fetch specific data to one or more dimensions.

Search Analytics for Sheets Grouping

Date grouping creates a new column with the actual day when the impressions, clicks, CTR, and position were recorded. This is particularly useful together with a filter for a specific query, so you can basically have your own rank tracker.
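
As a sketch of that rank-tracker idea against the API (reusing the service object from the earlier snippet; the query string, dates, and site URL are placeholders):

    response = service.searchanalytics().query(
        siteUrl="https://www.example.com/",
        body={
            "startDate": "2016-07-01",
            "endDate": "2016-09-19",
            "dimensions": ["date"],
            "dimensionFilterGroups": [{
                "filters": [{
                    "dimension": "query",
                    "operator": "equals",
                    "expression": "keyword 1",
                }]
            }],
        },
    ).execute()

    # One row per day: average position, clicks, and impressions for that query.
    for row in response.get("rows", []):
        print(row["keys"][0], round(row["position"], 1),
              row["clicks"], row["impressions"])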

Grouping by country and device lets you understand where your audience is.

Using country grouping will let you know how your site fares internationally, which is of course highly useful if you target users in more than one country.

However, device grouping is probably something you’ll play more with, given the rise in mobile traffic everywhere. Together with query and/or page grouping, this is useful to know how Google ranks your site on desktop and mobile, and where you might need to improve (generally speaking you’ll probably be more interested in mobile rankings here rather than desktop, since those can pinpoint problems with certain pages on your site and their mobile usability).

Search Analytics for Sheets Grouping Example

Filtering is exactly what it sounds like.

Choose between query, page, country and/or device to select specific information to be retrieved. You can add any number of filters; just remember that, for the time being, multiple filters are added cumulatively (all conditions must be met).

Search Analytics for Sheets Grouping Example

Other than the rank tracking example mentioned earlier, filtering can be useful in other situations as well.

If you’re doing a lot of content marketing, perhaps you’ll use the page filter to only retrieve URLs that contain /blog/ (or whatever subdirectory your content is under), while filtering by country is great for international sites, as you might expect.

Just remember one thing: Search Analytics offers a lot of data, but not all the data. Google tends to leave out data that is too granular (that is, results aggregated from very few users, such as long-tail queries).

This also means that, the more you group/filter, the less aggregated the data is, and certain information will not be available. That doesn’t mean you shouldn’t use groups and filters; it’s just something to keep in mind when you’re adding up the numbers.

Saving the best for last: Automated Search Analytics backups

This is the feature that got me into building this add-on.

I use GSC data quite a bit, from client reports to comparing data from multiple time periods. If you’ve used GSC/WMT before, you almost certainly know that the data available in Search Analytics only spans roughly the last 90 days.

While the guys at Google have mentioned that they’re looking into expanding this window, most SEOs have had to rely on various ways of backing up data in order to access it later.

This usually requires either remembering to manually download the data each month, or using a more complicated (but automated) method such as a Python script.
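
For reference, here’s a hedged sketch of what such a Python script might look like: pull the previous month’s query + page data, dump it to a CSV, and schedule the script with cron (or similar). It reuses the service object from the earlier API snippet; the site URL and file name pattern are placeholders.

    import csv
    import datetime

    # Work out the previous calendar month.
    first_of_this_month = datetime.date.today().replace(day=1)
    month_end = first_of_this_month - datetime.timedelta(days=1)
    month_start = month_end.replace(day=1)

    response = service.searchanalytics().query(
        siteUrl="https://www.example.com/",
        body={
            "startDate": month_start.isoformat(),
            "endDate": month_end.isoformat(),
            "dimensions": ["query", "page"],
            "rowLimit": 5000,
        },
    ).execute()

    # Write one CSV per month, e.g. gsc_backup_2016_08.csv (name is arbitrary).
    with open("gsc_backup_%s.csv" % month_start.strftime("%Y_%m"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["query", "page", "clicks", "impressions", "ctr", "position"])
        for row in response.get("rows", []):
            writer.writerow(row["keys"] + [row["clicks"], row["impressions"],
                                           row["ctr"], row["position"]])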

The Search Analytics for Sheets add-on allows you to do this effortlessly.

Just like when requesting data, select the site and set up any grouping and filtering that you’d like to use. I highly recommend using query and page grouping, and maybe country filtering to cut some of the noise.

Then simply enable the backup.

That’s it. The current spreadsheet will host that backup from now on, until you decide to disable it.

Search Analytics for Sheets Example 2

What happens now is that once per month (typically on the 3rd day of the month) the backup will run automatically and fetch the data for the previous month into the spreadsheet (each month will have its own sheet).

In case there are delays (sometimes Search Analytics data can be delayed even up to a week), the add-on will re-attempt to run the backup every day until it succeeds.

It’ll even keep a log with all backup attempts, and send you an email if you’d like.

Search Analytics for Sheets Backup Log

It’ll also create a separate sheet for monthly aggregated data (the total number of impressions and clicks plus CTR and position data, without any grouping or filtering), so that way you’ll be sure you’re ‘saving’ the real overview information as well.

If you’d like more than one backup (either another backup for the same site but with different grouping/filtering options or a new backup for a different site), simply open a new spreadsheet and enable the backup there. You’ll always be able to see a list with all the backups within the “About” tab.

For the moment, only monthly backups are available, though I’m thinking about including a weekly and/or daily option as well. However, that might be more complicated, especially in cases where GSC data is delayed.

Going further

I hope you’ll find the tool as useful as I think it is.

There may be some bugs, even though I tried squashing them all (thanks to Russ Jones and Tori Cushing, Barry Schwartz from Search Engine Roundtable, and Cosmin Negrescu from SEOmonitor for helping me test and debug it).

If you do find anything else or have any feature requests, please let me know via the add-on feedback function in Google Sheets or via the form on the official site.

If not, I hope the tool will help you in your day-to-day SEO work as much as it helps me. Looking forward to seeing more use cases for it in the comments.

PS: The tool doesn’t support more than 5,000 rows at the moment; working on getting that improved!


SEO Trek: The Search for Google RankBrain* [New Data]

Posted by

Rand Fishkin posted another brilliant Whiteboard Friday last week on the topic of optimizing for RankBrain. In it, he explained how RankBrain helps Google select and prioritize signals it uses for ranking.

One of the most important signals Google takes into account is user engagement. As Rand noted, engagement is a “very, very important signal.”

Engagement is a huge but often ignored opportunity. That’s why I’ve been a bit obsessed with improving engagement metrics.

My theory has been that RankBrain *and/or other machine learning elements within Google’s core algorithm are increasingly rewarding pages with high user engagement. Not always, but it’s happening often enough that it’s kind of a huge deal.

Google is looking for unicorns – and I think that machine learning is Google’s ultimate Unicorn Detector.

Now, when I say unicorns, I mean those pages that have magical engagement rates that elevate them above the other donkey pages Google could show for a given query. Like if your page has a 5 percent click-through rate (CTR) when everyone else has a 1 percent CTR.

What is Google’s mission? To provide the best results to searchers. One way Google does this is by looking at engagement data.

If most people are clicking on a particular search result – and then also engaging with that page – these are clear signals to Google that people think this page is fascinating. That it’s a unicorn.


RankBrain: Into Darkness

RankBrain, much like Google’s algorithm, is a great mystery. Since Google revealed (in a Bloomberg article just under a year ago) the important role of machine learning and artificial intelligence in its algorithm, RankBrain has been a surprisingly controversial topic, generating speculation and debate within the search industry.

Then, we found out in June that Google RankBrain was no longer just for long-tail queries. It was “involved in every query.”

We learned quite a few things about RankBrain. We were told by Google that you can’t optimize for it. Yet we also learned that Google’s engineers don’t really understand what RankBrain does or how it works.

Some people have even argued that there is absolutely nothing you can do to see Google’s machine learning systems at work.

Give me a break! It’s an algorithm. Granted, a more complex algorithm thanks to machine learning, but an algorithm nonetheless. All algorithms have rules and patterns.

When Google tweaked Panda and Penguin, we saw it. When Google tweaked its exact-match domain algorithm, we saw it. When Google tweaked its mobile algorithm, we saw it.

If you carefully set up an experiment, you should be able to isolate some aspect of what Google is proclaiming as the third most important ranking factor. You should be able to find evidence – a digital fingerprint.

Well, I say it’s time to boldly go where no SEO has gone before. That’s what I’ve attempted to do in this post. Let’s look at some new data.

The search for RankBrain [New Data]

What you’re about to look at is organic search click-through rate vs. the average organic search position for three separate 30-day periods ending April 30, July 12, and September 19 of this year. This data, obtained from the Google Search Console, tracked the same keywords in the Internet marketing niche.

I see some of the most compelling evidence of RankBrain (and/or other machine learning search algorithms!) at work.

The shape of the CTR vs. ranking curve is changing every month. For the 30 days ending:

  • April 30, 2016, the average CTR for top position was about 22 percent.
  • July 12, 2016, the average CTR rose to about 24 percent.
  • By September 19, 2016, the average CTR increased to about 27 percent.

The top, most prominent positions are getting even more clicks. Obviously, they were already getting a lot of clicks. But now they’re getting more clicks than they have in recent history.

This is the winner-take-all nature of Google’s organic SERPs today. It’s coming at the expense of Positions 4–10, which are being clicked on much less over time.
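
If you want to sanity-check this pattern against your own data, a CTR-vs-position curve can be rebuilt from a query-level Search Console export with a few lines of pandas. The file name and column headers below are assumptions; the idea is simply to bucket queries by rounded position and compute the weighted average CTR per bucket.

    import pandas as pd

    rows = pd.read_csv("gsc_queries_30_days.csv")    # hypothetical export: Query, Clicks, Impressions, Position
    rows["position_bucket"] = rows["Position"].round().astype(int)

    curve = rows.groupby("position_bucket").agg(clicks=("Clicks", "sum"),
                                                impressions=("Impressions", "sum"))
    curve["ctr"] = curve["clicks"] / curve["impressions"]

    # Weighted average CTR for position buckets 1 through 10.
    print(curve["ctr"].head(10))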

Results that are more likely to attract engagement are pushed further up the SERP, while results with lower engagement get pushed further down. That’s what we believe RankBrain is doing.

Going beyond the data

This data is showing us something very interesting. A couple thoughts:

  • This is exactly the fingerprint you would expect to see for a machine learning-based algorithm doing query interpretation that impacts rank based on user engagement metrics, such as CTR.
  • Essentially, machine learning systems move away from serving up 10 blue links and asking a user to choose one of them and toward providing the actual correct answers, further eliminating the need for lower positions.

Could anything else be causing this shift to the click curve? Could it have been the elimination of right rail ads?

No, that happened in February. I was careful to use date ranges that were after the right rail apocalypse.

Could it be more Knowledge Graph elements creeping into the SERPs? If that were the case, it would look like everything got pushed down by one position (e.g., Position 1 becomes Position 2, Position 2 becomes Position 3, and so on).

The data didn’t show that happening. We see a bending of the click curve, not a shifting of the curve.

Behold the awesome power of CTR optimization!

OK, so we’ve looked at the big picture. Now let’s look at the little picture to illustrate the remarkable power of CTR optimization.

Let’s talk about guerrilla marketing. Here are two headlines. Which headline do you think has the higher CTR?

  • Guerrilla Marketing: 20+ Examples and Strategies to Stand Out

This was the original headline for an article published on the WordStream blog in 2014.

  • 20+ Jaw-Dropping Guerrilla Marketing Examples

This is the updated headline, which we changed just a few months ago, in the hopes of increasing the CTR. And yep, we sure did!

Before we updated the headline, the article had a CTR of 1 percent and was ranking in position 8. Nothing awesome.

Since we updated the headline, the article has had a CTR of 4.19 percent and is ranking in position 5. Pretty awesome, no?

Increasingly, we’ve been trying to move away from “SEO titles” that look like the original headline, where you have the primary keyword followed by a colon and the rest of your headline. They aren’t catchy enough.

Yes, you still need to include keywords in your headline. But you don’t have to use this tired format, which will deliver (at best) solid but unspectacular results.

To be clear: we only changed the title tag. No other optimization tactics were used.

We didn’t point any links (internal or external) at it. We didn’t add any images or anything else to the post. Nothing.

Changing the title tag changed the CTR, which gave the page “magical points” that resulted in 97 percent more organic traffic.

What does it all mean?

This example illustrates that if you increase your CTR, you’ll see a nice boost in traffic. Ranking in a better position means more traffic, which means a higher CTR, which also means more traffic.

What’s so remarkable is that this is on-page SEO. No link building was required! Besides, pointing new links to a page wouldn’t result in a higher click-through rate – a catchier headline, however, would result in a higher CTR.

What’s also interesting about this is that RankBrain isn’t like other algorithms, say Panda or Penguin, where it was obvious when you got hit. You lost half your traffic!

If RankBrain or a machine learning algorithm impacts your site due to engagement metrics (positive or negative), it’s a much more subtle shift. All your best pages do better. All your “upper class donkey” pages do slightly worse. Ultimately, the two forces cancel each other out, to some extent, so that the SEO alarms don’t go off.

The final frontier

When it comes to SEO, your mission is to seek out every advantage. It’s my belief that organic CTR and website engagement rates impact organic rankings.

So boldly go where many SEOs are failing to go now. Hop aboard the USS Unicorn, make the jump to warp speed, and discover the wonders of those magical creatures.

Oh, and…

Are you optimizing your click-through rates? If not, why not? If so, what have you been seeing in your analytics?


How to Fix Crawl Errors in Google Search Console

Posted by Joe.Robison

A lot has changed in the five years since I first wrote about what was Google Webmaster Tools, now named Google Search Console. Google has unleashed significantly more data that promises to be extremely useful for SEOs. Since we long ago lost sufficient keyword data in Google Analytics, we’ve come to rely on Search Console more than ever. The “Search Analytics” and “Links to Your Site” sections are two of the top features that did not exist in the old Webmaster Tools.

While we may never be completely satisfied with Google’s tools and may occasionally call their bluffs, they do release some helpful information (from time to time). To their credit, Google has developed more help docs and support resources to aid Search Console users in locating and fixing errors.

Despite the fact that some of this isn’t as fun as creating 10x content or watching which of your keywords have jumped in the rankings, this category of SEO is still extremely important.

Looking at it through Portent’s epic visualization of how Internet marketing pieces fit together, fixing crawl errors in Search Console fits squarely into the “infrastructure” piece.

If you can develop good habits and practice preventative maintenance, weekly spot checks on crawl errors will be perfectly adequate to keep them under control. However, if you fully ignore these (pesky) errors, things can quickly go from bad to worse.

Crawl Errors layout

One change that has evolved over the last few years is the layout of the Crawl Errors view within Search Console. The Crawl Errors report is divided into two main sections: Site Errors and URL Errors.

Categorizing errors in this way is pretty helpful because there’s a distinct difference between errors at the site level and errors at the page level. Site-level issues can be more catastrophic, with the potential to damage your site’s overall usability. URL errors, on the other hand, are specific to individual pages, and are therefore less urgent.

The quickest way to access Crawl Errors is from the dashboard. The main dashboard gives you a quick preview of your site, showing you three of the most important management tools: Crawl Errors, Search Analytics, and Sitemaps.

You can get a quick look at your crawl errors from here. Even if you just glance at it daily, you’ll be much further ahead than most site managers.

1. Site Errors

The Site Errors section shows you errors from your website as a whole. These are the high-level errors that affect your site in its entirety, so don’t skip these.

In the Crawl Errors dashboard, Google will show you these errors for the last 90 days.

If you have some type of activity from the last 90 days, your snippet will look like this:

If you’ve been 100% error-free for the last 90 days with nothing to show, it will look like this:

That’s the goal — to get a “Nice!” from Google. As SEOs we don’t often get any validation from Google, so relish this rare moment of love.

How often should you check for site errors?

In an ideal world you would log in daily to make sure there are no problems here. It may get monotonous since most days everything is fine, but wouldn’t you kick yourself if you missed some critical site errors?

At the extreme minimum, you should check at least every 90 days to look for previous errors so you can keep an eye out for them in the future — but frequent, regular checks are best.

We’ll talk about setting up alerts and automating this part later, but just know that this section is critical and you should be 100% error-free in this section every day. There’s no gray area here.

A) DNS Errors

What they mean

DNS errors are important — and the implications for your website are huge if you have severe versions of these errors.

DNS (Domain Name System) errors are the first and most prominent error because if the Googlebot is having DNS issues, it means it can’t connect with your domain via a DNS timeout issue or DNS lookup issue.

Your domain is likely hosted with a common domain company, like Namecheap or GoDaddy, or with your web hosting company. Sometimes your domain is hosted separately from your website hosting company, but other times the same company handles both.

Are they important?

While Google states that many DNS issues still allow Google to connect to your site, if you’re getting a severe DNS issue you should act immediately.

There may be high latency issues that do allow Google to crawl the site, but provide a poor user experience.

A DNS issue is extremely important, as it’s the first step in accessing your website. You should take swift and violent action if you’re running into DNS issues that prevent Google from connecting to your site in the first place.

How to fix

  1. First and foremost, Google recommends using their Fetch as Google tool to view how Googlebot crawls your page. Fetch as Google lives right in Search Console.

    If you’re only looking for the DNS connection status and are trying to act quickly, you can fetch without rendering. The slower process of Fetch and Render is useful, however, to get a side-by-side comparison of how Google sees your site compared to a user.

  2. Check with your DNS provider. If Google can’t fetch and render your page properly, you’ll want to take further action. Check with your DNS provider to see where the issue is. There could be issues on the DNS provider’s end, or it could be worse.
  3. Ensure your server displays a 404 or 500 error code. Instead of having a failed connection, your server should display a 404 (not found) code or a 500 (server error) code. These codes are more accurate than having a DNS error.

Other tools

  • – Lets you know instantly if your site is down for everyone, or just on your end.
  • – shows you the current HTTP(s) request and response header. Useful for point #3 above.
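
As a rough, scriptable stand-in for those checks, the sketch below confirms that the domain resolves and that a request for a missing page comes back with a proper status code instead of a failed connection (the domain is a placeholder):

    import socket
    import requests

    domain = "www.example.com"   # placeholder

    # 1) Does the domain resolve at all?
    try:
        ips = {info[4][0] for info in socket.getaddrinfo(domain, 443)}
        print("DNS OK:", ips)
    except socket.gaierror as err:
        print("DNS lookup failed:", err)

    # 2) Does a non-existent page return a 404/410 rather than a dropped connection?
    resp = requests.get("https://" + domain + "/this-page-does-not-exist",
                        timeout=10, allow_redirects=False)
    print("Status for a missing page:", resp.status_code)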

B) Server Errors

What they mean

A server error most often means that your server is taking too long to respond, and the request times out. The Googlebot that’s trying to crawl your site can only wait a certain amount of time to load your website before it gives up. If it takes too long, the Googlebot will stop trying.

Server errors are different from DNS errors. A DNS error means the Googlebot can’t even look up your URL because of DNS issues, while server errors mean that although the Googlebot can connect to your site, it can’t load the page because of server errors.

Server errors may happen if your website gets overloaded with too much traffic for the server to handle. To avoid this, make sure your hosting provider can scale up to accommodate sudden bursts of website traffic. Everybody wants their website to go viral, but not everybody is ready!

Are they important?

Like DNS errors, a server error is extremely urgent. It’s a fundamental error, and harms your site overall. You should take immediate action if you see server errors in Search Console for your site.

Making sure the Googlebot can connect to the DNS is an important first step, but you won’t get much further if your website doesn’t actually show up. If you’re running into server errors, the Googlebot won’t be able to find anything to crawl and it will give up after a certain amount of time.

How to fix

In the event that your website is running fine at the time you encounter this error, that may mean there were server errors in the past. Though the error may have been resolved for now, you should still make some changes to prevent it from happening again.

This is Google’s official direction for fixing server errors:

“Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.”

Before you can fix your server errors issue, you need to diagnose specifically which type of server error you’re getting, since there are many types:

  • Timeout
  • Truncated headers
  • Connection reset
  • Truncated response
  • Connection refused
  • Connect failed
  • Connect timeout
  • No response

Addressing how to fix each of these is beyond the scope of this article, but you should reference Google Search Console help to diagnose specific errors.
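
As a first-pass check before digging into each type, a hedged sketch like this can surface the timeout- and connection-style failures Googlebot would also hit (the URL is a placeholder):

    import time
    import requests

    url = "https://www.example.com/"   # placeholder

    start = time.time()
    try:
        resp = requests.get(url, timeout=30)
        print("Status %s in %.1f seconds" % (resp.status_code, time.time() - start))
    except requests.exceptions.Timeout:
        print("Request timed out -- likely to surface as a server error in GSC")
    except requests.exceptions.ConnectionError as err:
        print("Connection refused/reset:", err)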

C) Robots failure

A Robots failure means that the Googlebot cannot retrieve your robots.txt file, located at [your domain]/robots.txt.

What they mean

One of the most surprising things about a robots.txt file is that it’s only necessary if you don’t want Google to crawl certain pages.

From Search Console help, Google states:

“You need a robots.txt file only if your site includes content that you don’t want search engines to index. If you want search engines to index everything in your site, you don’t need a robots.txt file — not even an empty one. If you don’t have a robots.txt file, your server will return a 404 when Googlebot requests it, and we will continue to crawl your site. No problem.”

Are they important?

This is a fairly important issue. For smaller, more static websites without many recent changes or new pages, it’s not particularly urgent. But the issue should still be fixed.

If your site is publishing or changing new content daily, however, this is an urgent issue. If the Googlebot cannot load your robots.txt, it’s not crawling your website, and it’s not indexing your new pages and changes.

How to fix

Ensure that your robots.txt file is properly configured. Double-check which pages you’re instructing the Googlebot to not crawl, as all others will be crawled by default. Triple-check the all-powerful line of “Disallow: /” and ensure that line DOES NOT exist unless for some reason you do not want your website to appear in Google search results.

If your file seems to be in order and you’re still receiving errors, use a server header checker tool to see if your file is returning a 200 or 404 error.
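
Both checks can be scripted in a few lines. The sketch below fetches robots.txt, prints its status code, and uses Python’s built-in robots.txt parser to confirm Googlebot isn’t blocked from the homepage (the domain is a placeholder):

    import requests
    from urllib import robotparser

    site = "https://www.example.com"   # placeholder

    resp = requests.get(site + "/robots.txt", timeout=10)
    print("robots.txt status:", resp.status_code)   # 200 or 404 are fine; 5xx is not

    rp = robotparser.RobotFileParser()
    rp.parse(resp.text.splitlines())
    if not rp.can_fetch("Googlebot", site + "/"):
        print("Warning: Googlebot is blocked from the homepage -- check for 'Disallow: /'")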

What’s interesting about this issue is that it’s better to have no robots.txt at all than to have one that’s improperly configured. If you have none at all, Google will crawl your site as usual. If you have one returning errors, Google will stop crawling until you fix this file.

For being only a few lines of text, the robots.txt file can have catastrophic consequences for your website. Make sure you’re checking it early and often.

2. URL Errors

URL errors are different from site errors because they only affect specific pages on your site, not your website as a whole.

Google Search Console will show you the top URL errors per category — desktop, smartphone, and feature phone. For large sites, this may not be enough data to show all the errors, but for the majority of sites this will capture all known problems.

Tip: Going crazy with the amount of errors? Mark all as fixed.

Many site owners have run into the issue of seeing a large number of URL errors and getting freaked out. The important thing to remember is a) Google ranks the most important errors first and b) some of these errors may already be resolved.

If you’ve made some drastic changes to your site to fix errors, or believe a lot of the URL errors are no longer happening, one tactic to employ is marking all errors as fixed and checking back up on them in a few days.

When you do this, your errors will be cleared out of the dashboard for now, but Google will bring the errors back the next time it crawls your site over the next few days. If you had truly fixed these errors in the past, they won’t show up again. If the errors still exist, you’ll know that these are still affecting your site.

A) Soft 404

A soft 404 error is when a page displays as 200 (found) when it should display as 404 (not found).

What they mean

Just because your 404 page looks like a 404 page doesn’t mean it actually is one. The user-visible aspect of a 404 page is the content of the page. The visible message should let users know the page they requested is gone. Often, site owners will have a helpful list of related links the users should visit or a funny 404 response.

The flipside of a 404 page is the crawler-visible response. The header HTTP response code should be 404 (not found) or 410 (gone).

A quick refresher on how HTTP requests and responses look:

Image source: Tuts Plus

If you’re returning a 404 page and it’s listed as a Soft 404, it means that the header HTTP response code does not return the 404 (not found) response code. Google recommends “that you always return a 404 (not found) or a 410 (gone) response code in response to a request for a non-existing page.”
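
A quick way to spot-check this on a handful of URLs is to request them and flag anything that answers 200 but reads like a “not found” page. The URLs and phrases below are placeholders:

    import requests

    urls = [
        "https://www.example.com/old-product",
        "https://www.example.com/discontinued-category",
    ]

    for url in urls:
        resp = requests.get(url, timeout=10, allow_redirects=False)
        body = resp.text.lower()
        looks_like_404 = "not found" in body or "no longer exists" in body
        if resp.status_code == 200 and looks_like_404:
            print("Possible soft 404:", url)
        else:
            print(url, "->", resp.status_code)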

Another situation in which soft 404 errors may show up is if you have pages that are 301 redirecting to non-related pages, such as the home page. Google doesn’t seem to explicitly state where the line is drawn on this, only making mention of it in vague terms.

Officially, Google says this about soft 404s:

“Returning a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404) can be problematic.”

Although this gives us some direction, it’s unclear when it’s appropriate to redirect an expired page to the home page and when it’s not.

In practice, from my own experience, if you’re redirecting large amounts of pages to the home page, Google can interpret those redirected URLs as soft 404s rather than true 301 redirects.

Conversely, if you were to redirect an old page to a closely related page instead, it’s unlikely that you’d trigger the soft 404 warning in the same way.

Are they important?

If the pages listed as soft 404 errors aren’t critical pages and you’re not eating up your crawl budget by having some soft 404 errors, these aren’t an urgent item to fix.

If you have crucial pages on your site listed as soft 404s, you’ll want to take action to fix those. Important product, category, or lead gen pages shouldn’t be listed as soft 404s if they’re live pages. Pay special attention to pages critical to your site’s moneymaking ability.

If you have a large amount of soft 404 errors relative to the total number of pages on your site, you should take swift action. You can be eating up your (precious?) Googlebot crawl budget by allowing these soft 404 errors to exist.

How to fix

For pages that no longer exist:

  • Allow to 404 or 410 if the page is gone and receives no significant traffic or links. Ensure that the server header response is 404 or 410, not 200.
  • 301 redirect each old page to a relevant, related page on your site.
  • Do not redirect broad amounts of dead pages to your home page. They should 404 or be redirected to appropriate similar pages.

For pages that are live pages, and are not supposed to be a soft 404:

  • Ensure there is an appropriate amount of content on the page, as thin content may trigger a soft 404 error.
  • Ensure the content on your page doesn’t appear to represent a 404 page while serving a 200 response code.

Soft 404s are strange errors. They lead to a lot of confusion because they tend to be a strange hybrid of 404 and normal pages, and what is causing them isn’t always clear. Ensure the most critical pages on your site aren’t throwing soft 404 errors, and you’re off to a good start!

B) 404

A 404 error means that the Googlebot tried to crawl a page that doesn’t exist on your site. Googlebot finds 404 pages when other sites or pages link to that non-existent page.

What they mean

404 errors are probably the most misunderstood crawl error. Whether it’s an intermediate SEO or the company CEO, the most common reaction is fear and loathing of 404 errors.

Google clearly states in their guidelines:

“Generally, 404 errors don’t affect your site’s ranking in Google, so you can safely ignore them.”

I’ll be the first to admit that “you can safely ignore them” is a pretty misleading statement for beginners. No — you cannot ignore them if they are 404 errors for crucial pages on your site.

(Google does practice what it preaches, in this regard — going to returns a 404 instead of a helpful redirect to

Distinguishing between times when you can ignore an error and when you’ll need to stay late at the office to fix something comes from deep review and experience, but Rand offered some timeless advice on 404s back in 2009:

“When faced with 404s, my thinking is that unless the page:

A) Receives important links to it from external sources (Google Webmaster Tools is great for this),
B) Is receiving a substantive quantity of visitor traffic,
C) Has an obvious URL that visitors/links intended to reach

It’s OK to let it 404.”

The hard work comes in deciding what qualifies as important external links and substantive quantity of traffic for your particular URL on your particular site.

Annie Cushing also prefers Rand’s method, and recommends:

“Two of the most important metrics to look at are backlinks to make sure you don’t lose the most valuable links and total landing page visits in your analytics software. You may have others, like looking at social metrics. Whatever you decide those metrics to be, you want to export them all from your tools du jour and wed them in Excel.”

One other thing to consider not mentioned above is offline marketing campaigns, podcasts, and other media that use memorable tracking URLs. It could be that your new magazine ad doesn’t come out until next month, and the marketing department forgot to tell you about an unimportant-looking URL that’s about to be plastered in tens of thousands of magazines. Another reason for cross-department synergy.

Are they important?

This is probably one of the trickiest and simplest problems of all errors. The vast quantity of 404s that many medium to large sites accumulate is enough to deter action.

404 errors are very urgent if important pages on your site are showing up as 404s. Conversely, like Google says, if a page is long gone and doesn’t meet our quality criteria above, let it be.

As painful as it might be to see hundreds of errors in your Search Console, you just have to ignore them. Unless you get to the root of the problem, they’ll continue showing up.

How to fix 404 errors

If your important page is showing up as a 404 and you don’t want it to be, take these steps:

  1. Ensure the page is published from your content management system and not in draft mode or deleted.
  2. Ensure the 404 error URL is the correct page and not another variation.
  3. Check whether this error shows up on the www vs non-www version of your site and the http vs https version of your site. See Moz canonicalization for more details.
  4. If you don’t want to revive the page, but want to redirect it to another page, make sure you 301 redirect it to the most appropriate related page.

In short, if your page is dead, make the page live again. If you don’t want that page live, 301 redirect it to the correct page.

How to stop old 404s from showing up in your crawl errors report

If your 404 error URL is meant to be long gone, let it die. Just ignore it, as Google recommends. But to prevent it from showing up in your crawl errors report, you’ll need to do a few more things.

As yet another indication of the power of links, Google will only show the 404 errors in the first place if your site or an external website is linking to the 404 page.

In other words, if I simply type in a non-existent URL on your site, it won’t show up in your crawl errors dashboard unless I also link to it from my website.

To find the links to your 404 page, go to your Crawl Errors > URL Errors section:

Then click on the URL you want to fix:

Search your page for the link. It’s often faster to view the source code of your page and find the link in question there:

It’s painstaking work, but if you really want to stop old 404s from showing up in your dashboard, you’ll have to remove the links to that page from every page linking to it. Even other websites.
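
If the list of linking pages is long, a rough sketch like this can at least speed up the on-site part of that check by confirming which pages still contain a link to the dead URL (both URL lists are placeholders):

    import requests

    dead_url = "https://www.example.com/old-page"        # the 404 URL from the report
    linking_pages = [                                    # "Linked from" URLs listed in GSC
        "https://www.example.com/blog/some-post",
        "https://www.example.com/category/archive",
    ]

    relative = dead_url.replace("https://www.example.com", "")
    for page in linking_pages:
        html = requests.get(page, timeout=10).text
        if dead_url in html or relative in html:
            print("Still linked from:", page)
        else:
            print("No link found on:", page)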

What’s really fun (not) is if you’re getting links pointed to your URL from old sitemaps. You’ll have to let those old sitemaps 404 in order to totally remove them. Don’t redirect them to your live sitemap.

C) Access denied

Access denied means Googlebot can’t crawl the page. Unlike a 404, Googlebot is prevented from crawling the page in the first place.

What they mean

Access denied errors commonly block the Googlebot through these methods:

  • You require users to log in to see a URL on your site, therefore the Googlebot is blocked
  • Your robots.txt file blocks the Googlebot from individual URLs, whole folders, or your entire site
  • Your hosting provider is blocking the Googlebot from your site, or the server requires users to authenticate by proxy

Are they important?

Similar to soft 404s and 404 errors, if the pages being blocked are important for Google to crawl and index, you should take immediate action.

If you don’t want this page to be crawled and indexed, you can safely ignore the access denied errors.

How to fix

To fix access denied errors, you’ll need to remove the element that’s blocking the Googlebot’s access:

  • Remove the login from pages that you want Google to crawl, whether it’s an in-page or popup login prompt
  • Check your robots.txt file to ensure the pages listed on there are meant to be blocked from crawling and indexing
  • Use the robots.txt tester to see warnings on your robots.txt file and to test individual URLs against your file
  • Use a user-agent switcher plugin for your browser, or the Fetch as Google tool to see how your site appears to Googlebot
  • Scan your website with Screaming Frog, which will prompt you to log in to pages if the page requires it

While not as common as 404 errors, access denied issues can still harm your site’s ranking ability if the wrong pages are blocked. Be sure to keep an eye on these errors and rapidly fix any urgent issues.

D) Not followed

What they mean

Not to be confused with a “nofollow” link directive, a “not followed” error means that Google couldn’t follow that particular URL.

Most often these errors come about from Google running into issues with Flash, JavaScript, or redirects.

Are they important?

If you’re dealing with not followed issues on a high-priority URL, then yes, these are important.

If your issues are stemming from old URLs that are no longer active, or from parameters that aren’t indexed and just an extra feature, the priority level on these is lower — but you should still analyze them.

How to fix

Google identifies the following as features that the Googlebot and other search engines may have trouble crawling:

  • JavaScript
  • Cookies
  • Session IDs
  • Frames
  • Flash

Use either the Lynx text browser or the Fetch as Google tool, using Fetch and Render, to view the site as Google would. You can also use a Chrome add-on such as User-Agent Switcher to mimic Googlebot as you browse pages.
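
A very rough, scriptable version of the same idea: request the page with Googlebot’s user-agent string and see what’s actually present in the raw HTML, since anything that only appears after JavaScript runs won’t be in a plain HTTP response (the URL is a placeholder):

    import re
    import requests

    url = "https://www.example.com/some-page"   # placeholder
    headers = {"User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                             "+http://www.google.com/bot.html)"}

    resp = requests.get(url, headers=headers, timeout=10)
    links = re.findall(r'href="([^"]+)"', resp.text)
    print("Status:", resp.status_code)
    print("Links visible without JavaScript:", len(links))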

If, as the Googlebot, you’re not seeing the pages load or not seeing important content on the page because of some of the above technologies, then you’ve found your issue. Without visible content and links to crawl on the page, some URLs can’t be followed. Be sure to dig in further and diagnose the issue to fix.

For parameter crawling issues, be sure to review how Google is currently handling your parameters. Specify changes in the URL Parameters tool if you want Google to treat your parameters differently.

For not followed issues related to redirects, be sure to fix any of the following that apply:

  • Check for redirect chains. If there are too many “hops,” Google will stop following the redirect chain (see the sketch after this list)
  • When possible, update your site architecture to allow every page on your site to be reached from static links, rather than relying on redirects implemented in the past
  • Don’t include redirected URLs in your sitemap, include the destination URL
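
Here’s the sketch referenced in the first bullet above: follow a redirect with Python’s requests library and print every hop, which makes long chains obvious (the starting URL is a placeholder):

    import requests

    resp = requests.get("http://www.example.com/old-url", timeout=10, allow_redirects=True)

    # resp.history holds every intermediate redirect response, in order.
    for hop in resp.history + [resp]:
        print(hop.status_code, hop.url)

    if len(resp.history) > 4:
        print("Long redirect chain -- consider pointing the original URL straight at the destination")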

Google used to include more detail on the Not Followed section, but as Vanessa Fox detailed in this post, a lot of extra data may be available in the Search Console API.


E) Server errors & DNS errors

Under URL errors, Google again lists server errors and DNS errors, the same sections in the Site Errors report. Google’s direction is to handle these in the same way you would handle the site errors level of the server and DNS errors, so refer to those two sections above.

They would differ in the URL errors section if the errors were only affecting individual URLs and not the site as a whole. If you have isolated configurations for individual URLs, such as minisites or a different configuration for certain URLs on your domain, they could show up here.

Now that you’re the expert on these URL errors, I’ve created this handy URL error table that you can print out and tape to your desktop or bathroom mirror.


I get it — some of this technical SEO stuff can bore you to tears. Nobody wants to individually inspect seemingly unimportant URL errors, or conversely, have a panic attack seeing thousands of errors on your site.

With experience and repetition, however, you will gain the mental muscle memory of knowing how to react to the errors: which are important and which can be safely ignored. It’ll be second nature pretty soon.

If you haven’t already, I encourage you to read up on Google’s official documentation for Search Console and keep it handy for future questions.

We’re simply covering the Crawl Errors section of Search Console here. Search Console is a data beast on its own, so it’s worth further reading on how to make the best use of the tool in its entirety.

Google has generously given us one of the most powerful (and free!) tools for diagnosing website errors. Not only will fixing these errors help you improve your rankings in Google, it will also provide a better user experience for your visitors and help you meet your business goals faster.

Your turn: What crawl errors issues and wins have you experienced using Google Search Console?

