WordPress Robots.txt File - A User Guide

A Beginner’s Guide to the WordPress robots.txt file

Like many website owners, you’ve likely spent countless hours learning the ins and outs of on-page Search Engine Optimization (SEO) and how to use it to optimize your WordPress site. While on-page SEO certainly plays a significant role in your website’s search engine rankings, it’s even more important to ensure that search engine bots have easy, unfettered access to your site.

Fortunately, there’s a quick fix. Proper optimization of your WordPress website’s robots.txt file (and the creation of one, if necessary) ensures your website is accessible to and understandable by search engine bots, putting you in the best possible position to benefit from improved search engine rankings.

In this post, we’ll start by explaining exactly what the robots.txt file is and how it contributes to your website’s search engine rankings. Then we’ll dive into the meat of the issue by showing you how to create and optimize a robots.txt file suited to your WordPress website’s unique setup. Let’s get to it!

What Is the WordPress robots.txt File?

In simplest terms, the robots.txt file is a file uploaded to your web server that advises search engine bots (such as Googlebot) which files within your website to access, which to ignore, and when. This can have a direct impact on what pages on your site appear within search engine results.
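To give you a concrete picture before we go any further, here’s what a minimal robots.txt file might look like. The folder name is made up purely for illustration:

User-agent: *
Disallow: /private-stuff/

We’ll break down what each of these lines means, and how to tailor them to your own site, in the steps below.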

While you don’t strictly need a robots.txt file, there are plenty of good reasons why you should have one. For example, here are three things that the robots.txt file can enable you to do:

  1. Work on your site without it being indexed. Sometimes, you may be required to “go live” with your website while still in the middle of development. At this time, it’s best to disallow your entire WordPress site from being indexed.
  2. Set special instructions for bots. This is a common requirement for those website owners who use paid links or advertisements throughout their site.
  3. Protect content (while making exceptions). The robots.txt file is actually quite customizable. Perhaps you have a folder that you’d like to block from bot indexing, but within that folder is a file you’d like to have included in search engine results. With the robots.txt file, you can make that happen (there’s an example just after this list).
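To make the second and third points more concrete, here’s a hedged sketch of what such rules might look like. The folder and file names are invented for illustration; the first group applies to every bot, while the second gives one named crawler (Google’s image bot) its own, stricter instructions:

User-agent: *
Disallow: /press-kit/
Allow: /press-kit/logo.png

User-agent: Googlebot-Image
Disallow: /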

Are there any downsides to having a robots.txt file? Not really, but not every site needs one. Here are a couple of situations where you can skip it:

  1. You want your whole site indexed. If the entirety of your WordPress site’s folders and files need to be indexed, then the use of a robots.txt file isn’t necessary.
  2. Your site is simple and requires no protection. For websites with few pages and no folders or files to protect, it’s alright to not use a robots.txt file.

In other words, it’s perfectly okay to go without a robots.txt file. However, if the benefits covered earlier have persuaded you to use one, it’s time to learn how to put your robots.txt file to work.

How Can I Optimize the WordPress robots.txt File for My Website’s Needs?

Even if you’re unfamiliar with coding, you can optimize your website’s robots.txt file in just a few simple steps, which we’re going to cover in detail below!

Please note that some of the steps below will require the use of a File Transfer Protocol (FTP) client. We recommend the free, easy-to-use, open-source FileZilla.

Step #1: Locate the robots.txt File

As mentioned previously, not all websites will have a robots.txt file. However, even if you’re sure your website does, it’s good to take a closer look at the contents of your site’s robots.txt file before moving on.

The easiest way to check your website’s robots.txt file is to enter yourwebsite.com/robots.txt into your browser’s address bar. If you get a 404 Not Found error, your site probably doesn’t (yet) have the file!
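If you’d rather check from a script than a browser (handy if you look after several sites), a short Python sketch along these lines does the same job. The domain is a placeholder, so swap in your own:

from urllib import request, error

url = "https://yourwebsite.com/robots.txt"  # placeholder domain; replace with your own

try:
    with request.urlopen(url) as response:
        print(f"Found robots.txt (HTTP {response.status})")
        print(response.read().decode("utf-8", errors="replace"))
except error.HTTPError as err:
    if err.code == 404:
        print("No robots.txt here yet (HTTP 404)")
    else:
        print(f"Unexpected response: HTTP {err.code}")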

Step #1.1: Create a robots.txt File

If the above step has revealed that your site doesn’t currently have a robots.txt file, creating one yourself is easy enough.

First, create a new file in your plain text editor of choice, and enter the following:

User-agent: *
Disallow:

Let’s take a quick look at the two lines we’ve entered so far:

  • User-agent is the aspect of the file that enables you to specify who the instructions are for. If you’d like to instruct all search engine bots to follow the directives in the robots.txt file (which is generally a good idea), then use an asterisk as shown above.
  • Disallow is the directive that tells bots which folders and files to avoid indexing. If you leave it blank (as above), then all files and folders on your site will be indexed. Later on, we’ll discuss how to disallow the indexing of certain files.

Save the file as robots.txt once you’re done.

Finally, you need to upload your brand-new robots.txt file to your website’s root folder. You can do this with the help of FileZilla or similar FTP clients, or by using the cPanel file manager provided by your web host. Here’s a guide for using FileZilla to get you started, if necessary.

Step #2: Determine if Any Files Are Currently Blocked

Assuming you’re working with a preexisting robots.txt file, it’s now time to take a closer look and determine whether any important pages are currently blocked and inaccessible to search engine bots. This is a vital step for anyone looking to boost their search engine rankings.

You can perform this step manually (by literally reading through the robots.txt file), or use the Google guidelines tool provided by Varvy.
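If you prefer a programmatic check, Python’s built-in urllib.robotparser module can read a live robots.txt file and tell you whether a given bot is permitted to fetch a given URL. Here’s a minimal sketch; the domain and paths are placeholders, and bear in mind that this parser applies rules in the order they appear in the file, so its treatment of Allow exceptions can differ slightly from Google’s:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://yourwebsite.com/robots.txt")  # placeholder domain
rp.read()

# Check the pages you care about most (placeholder paths)
for path in ["/", "/blog/", "/photos/prettybird.jpg"]:
    url = "https://yourwebsite.com" + path
    verdict = "allowed" if rp.can_fetch("Googlebot", url) else "BLOCKED"
    print(path, "->", verdict)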

Reading through our own robots.txt file, we can see five folders that are blocked from indexing:

Nimbus Themes Robots.txt file

Of course, these folders were purposely blocked on our end. However, you may find some key web pages are blocked, which could be why your SEO rankings are lacking.

Step #3: Optimize the robots.txt File for Your Website’s Needs

Now then – before we get our hands dirty, let’s take a moment to ensure we understand key instructions commonly seen within robots.txt files.

Full Allow

If you’d like every part of your website to be crawled by search engine bots, then “full allow” is what you’re looking for. If your site doesn’t already have a robots.txt file, you can simply leave things as they are and bots will crawl your website uninhibited. If your site does have a robots.txt file already uploaded, however, you’ll want to be sure the text within the file is set up for full allow.

You have two options when it comes to editing the WordPress robots.txt file, both of which will achieve the same end.

User-agent: *
Disallow:

This first option uses the Disallow directive without listing any instructions. This means the entire site can be accessed by Google bots, and no particular folders or files are blocked.

User-agent: *
Allow: /

The second option uses the Allow directive. The forward slash, when placed after Allow, acts as an ‘all’ instruction. This means all folders and files are accessible.

Full Disallow

Enabling “full disallow” is rarely a good option, but it can (for example) be used by those developing a live site that they’d like to keep from being indexed.

This is done as such:

User-agent: *
Disallow: /

Using Allow and Disallow to Fit Your Website’s Needs

After ensuring your robots.txt file allows access to vital webpages, consider whether your site has any folders or files that you’d like to block from public access. Say, for example, you’d rather Google not index a root directory within your site named photos.

This is an easy enough directive that can be achieved as follows:

User-agent: *
Disallow: /photos

Now let’s say you’d like to disallow the indexing of all images except for one. That’s easy enough, too.

First, we start with the directive mentioned above, and then add in a key element:

User-agent: *
Disallow: /photos
Allow: /photos/prettybird.jpg

As you can see, creative use of Allow and Disallow gives you fine control over what search engines crawl within your site.
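As one illustration of that fine control, here’s a common starting point many WordPress site owners use. It keeps bots out of the admin area while still allowing the admin-ajax.php file that some themes and plugins call from the front end, and it points crawlers to an XML sitemap. Treat it as a suggestion to adapt rather than a one-size-fits-all rule, and note that the sitemap URL is a placeholder:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourwebsite.com/sitemap.xml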

Step #4: Test Your Optimized robots.txt File

The simplest way to complete this step is with the help of the Google Search Console. You’ll need a (free) Google account to access this tool.

Google Console Robots.txt file Tester

Once the tool shows your code is functional, follow these instructions on uploading and editing your robots.txt file with the help of FileZilla.
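If you’d like a quick sanity check on a local draft before (or after) uploading it, the same urllib.robotparser module from Step #2 can parse a file on disk too. This is only a rough sketch (the file path and test URL are placeholders), and Google’s own tester remains the final word, particularly for Allow exceptions, since Python’s parser evaluates rules in the order they appear in the file:

from urllib import robotparser

rp = robotparser.RobotFileParser()
with open("robots.txt") as f:  # your local draft
    rp.parse(f.read().splitlines())

# Placeholder URL: swap in a page you expect to be blocked or allowed
print(rp.can_fetch("*", "https://yourwebsite.com/photos/holiday.jpg"))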

Conclusion

Don’t let an improperly optimized WordPress robots.txt file sabotage your SEO efforts. With the right tweaking, your robots.txt file can work for you: keeping sensitive folders out of search results, boosting your search engine rankings, and even controlling which search engine bots can access and index your site.

To ensure your optimization efforts go smoothly, let’s quickly recap the four steps outlined above:

  1. Locate the robots.txt file.
  2. Determine if any files are currently blocked.
  3. Optimize the robots.txt file for your website’s needs.
  4. Publish and test your optimized robots.txt file.

Do you have any questions about the WordPress robots.txt file and how to optimize it? Ask away in the comments section below!

