While building a mobile-friendly site with great content is a vital first step in optimising your business website, there are some important technical considerations you must also pay special attention to. Technical SEO remains one of the most complex aspects of SEO, largely because it is one of the few tasks that only skilled web developers can carry out.

Content creation and link building can be carried out by trained writers and marketing executives, but technical SEO requires somebody who knows their way around a range of CMSes, understands web languages, and is confident configuring web servers. Luckily, our web developer has a sound understanding of all these areas and we are able to enhance any SEO campaign with a technical SEO audit. We have already talked about making a website load faster; today we’re going to talk about crawl budget.

What Is Crawl Budget?

To index your web pages, Google first has to “crawl” them. It does this with search bots, also known as web spiders (hence the term “crawl”), which follow links around the web. Every time Googlebot, Google’s search spider software, comes to your website, it starts following the internal links on your pages to analyse and process all your content, with the aim of updating the search index.

Google performs several types of crawl, but to keep things simple we only need to discuss its deep crawl here. When Google performs a deep crawl on a website, it aims to analyse every page in search of new information. If you update your page titles or site content, add new pages, change images and photos, or remove pages, Google will detect this and update the index. With a shallow crawl, on the other hand, Google will only look at your newest pages, and perhaps your home page, to see what changes have been made.

For instance, Google may visit your blog every day, or, if you are running a very popular site, every time a new article appears in your RSS feed or sitemap. These shallow crawls are fast and take up very few resources. Deep crawls, on the other hand, are very resource hungry, and although we do not know the exact details of how Googlebot decides how many pages to crawl, we have good evidence that it sets a limited crawl budget for each deep crawl.

Running Out Of Crawl Budget

So, when Google visits a website to perform a deep crawl, it may allocate a set time, or a set number of internal links to follow, before it abandons the crawl. Why would it do this? Google can never know for sure that it has captured your entire website – in theory, you could have billions of pages, each linking to the next. If Google set itself the task of thoroughly crawling every site, it could, in theory, waste its limited resources on poorly ranked websites. So, it sets crawl budgets.

Vast sites such as Wikipedia will be allocated much larger crawl budgets than business websites and personal blogs. You may think that a small website will not exhaust its crawl budget, but unfortunately, badly developed small sites can!

How To Optimise Your Budget

Modern CMSes generate many files that Google can reach through your page source code. Although readers never see these files directly, they are used to build your site. CSS and JavaScript files, as well as individual image files and attachments, all complicate your site and eat into your crawl budget.

Updating your server configuration with robots.txt directives and page meta tags can help to block Googlebot from certain pages. However, Google does advise webmasters to be careful about what is blocked: if Google cannot properly view your CSS files, it cannot determine whether your site is mobile friendly, and this can actually worsen your rankings.
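
For illustration only, here is roughly what those two mechanisms look like; the paths and the page below are hypothetical examples, not recommendations for every site:

    # robots.txt -- keep Googlebot out of low-value, auto-generated areas
    User-agent: Googlebot
    Disallow: /search/
    Disallow: /cart/

    <!-- meta tag on an individual page you would rather keep out of the index -->
    <meta name="robots" content="noindex, follow">

As Google’s advice suggests, leave your CSS and JavaScript files crawlable so that Googlebot can still render your pages properly.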

Fix 404 Pages

If your site has many 404 pages, caused by broken links on your site, they will probably eat into your crawl budget. This is one reason why 404 errors are reported in Search Console. Fix all broken internal links.
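
Where a page has genuinely moved rather than disappeared, a permanent redirect is usually a better fix than leaving links pointing at a 404. As a minimal sketch, assuming an Apache server and a hypothetical old URL, the .htaccess rule would be:

    # .htaccess -- send visitors and Googlebot from the old address to the new one
    Redirect 301 /old-services-page/ /services/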

Control Dynamic URLs

One of the biggest problems is dynamic URLs. These are usually generated by poorly written website themes that create duplicate stylesheets, JavaScript files and product pages, each with its own dynamic URL. If your site contains hundreds of dynamically generated pages that are all identical, Google will soon use up its crawl budget, which means your most important content may never be seen.
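
The usual remedies are canonical tags and blocking the parameterised copies. A rough sketch, with hypothetical URLs and parameter names, might look like this:

    <!-- on each dynamic variant, point Google at the one preferred page -->
    <link rel="canonical" href="https://www.example.com/products/blue-widget/">

    # robots.txt -- stop the crawler wandering through endless filtered duplicates
    User-agent: *
    Disallow: /*?sort=
    Disallow: /*?sessionid=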

Keep Your Sitemap Up-To-Date

If your site content has changed significantly but you have not updated your sitemap, Google will keep returning to the pages listed there, wasting more of your budget. Ensure that your sitemap is up to date, and look into ways of generating sitemaps automatically, with your most important pages and blog posts at the top.
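
A sitemap is simply an XML file listing your URLs, and most CMSes can regenerate it automatically whenever content changes. A minimal hand-written example, with placeholder URLs and dates, looks like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2024-01-01</lastmod>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>https://www.example.com/blog/latest-post/</loc>
        <lastmod>2024-01-01</lastmod>
        <priority>0.8</priority>
      </url>
    </urlset>

Once it is in place, submit the sitemap in Search Console so that Google picks up changes promptly.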

Build More Links

One of the factors that Google uses to allocate crawl budget is external links. The more links you have pointing to your website, the more important your website looks to Google and the greater the crawl budget it will allocate. Link building is not only important for building trust and improving your keyword positions; it also encourages Google to return to your site more often and delve deeper when it does.

Make it easy for Google to find your content, and don’t let Google get lost in a virtual maze. Crawl optimisation is a bit like putting your best products at the front of your shop, ensuring there is no out-of-date produce, and making sure that each shelf in your store offers something unique. If you want to keep Googlebot happy, speak to our technical SEO consultants today.