What is a robots.txt file?

In the early days of the Internet, when its capabilities were still limited, developers devised programs to crawl and index new pages on the web. They called these programs "robots" or "spiders."

Occasionally, these bots crawled websites that were not meant to be crawled and indexed, such as sites undergoing maintenance. The creator of Aliweb, one of the earliest web search engines, proposed a standard that defines the rules bots should follow.

In June 1994, a group of early web technologists agreed on a convenient solution, which was called the "Robots Exclusion Protocol".

The robots.txt file is an implementation of this protocol. The protocol defines the instructions that well-behaved bots, including Google's bots, are expected to follow. Malicious bots, such as those used by malware and spyware, operate outside these rules.

You can view any site's robots file by typing the site's domain in your browser and adding /robots.txt at the end.
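As a small illustration of this rule, the sketch below derives the robots.txt address from any page URL using Python's standard urllib.parse module. The helper name robots_url is our own, chosen only for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/some-post/?page=2"))
# prints: https://example.com/robots.txt
```

Whatever page you start from, the path and query string are dropped and only the domain is kept.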

This is a typical format for a robots.txt file:

User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml

The robots.txt file instructs robots, or search spiders, not to crawl and index certain pages within your site. Search engine robots consult the robots.txt file before they start crawling the site.
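To see how a bot might interpret the sample rules above, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name ExampleBot and the test paths are assumptions made up for illustration, and real search engines may resolve conflicting rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The sample rules shown above, one directive per line.
RULES = """\
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Sitemap: https://example.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Ask the parser whether the given agent may crawl the path."""
    return parser.can_fetch(agent, "https://example.com" + path)


for path in ("/wp-admin/", "/wp-content/plugins/x.php",
             "/wp-content/uploads/a.png", "/about/"):
    print(path, "->", "allowed" if allowed(path) else "blocked")
```

With these rules, the admin and plugin paths come back as blocked, while the uploads path and any path not mentioned in the file are allowed.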

Where is the robots.txt file on the site?

The robots.txt file is stored in the site's root folder, alongside other site files such as the sitemap. To locate it, open cPanel and look inside the public_html folder.

Robots.txt file in cPanel

Why do you need a robots.txt file on your site?

If you don't have a robots.txt file, the search engine will still crawl and index your site. However, you will not be able to tell search engines which pages or folders should not be crawled.

The file won't have much of an impact when you first build your site and you don't have a lot of content.

But as your site grows and the volume of content increases, you'll likely want to have better control over how search spiders crawl and index your site.

Here is the reason.

Search spiders crawl a certain number of pages during a crawl session. If they don't finish crawling all of your pages, they will come back and resume crawling next time.

This can slow down your site's indexing rate.

You can fix this by preventing search bots from trying to crawl unnecessary pages, such as WordPress admin pages, template files, and plugins.

By disallowing access to unnecessary pages, you focus each crawl session on the important pages of your site. This helps search engines crawl and index more of those pages as quickly as possible.

Another reason to use a robots.txt file is when you want to prevent search engines from indexing an article or page within your site.

This is not the best way to hide content from visitors, but a robots file can help keep such pages from appearing in search results.

What are the ideal robots.txt commands?

Many blogs use a very simple robots.txt file. Its content may vary depending on the needs of the specific site:

User-Agent: *
Disallow:

Sitemap: https://www.example.com/post-sitemap.xml
Sitemap: https://www.example.com/page-sitemap.xml

This robots file allows all spiders to crawl the entire site and gives them links to the XML sitemap files.
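As a rough check of that reading, the sketch below feeds the simple file to Python's standard urllib.robotparser and lists the sitemaps it finds. It assumes Python 3.8 or later, where the site_maps() method is available; the test path is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

SIMPLE_RULES = """\
User-agent: *
Disallow:
Sitemap: https://www.example.com/post-sitemap.xml
Sitemap: https://www.example.com/page-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SIMPLE_RULES.splitlines())

# An empty Disallow value blocks nothing, so any path is crawlable.
print(parser.can_fetch("*", "https://www.example.com/any/page/"))

# site_maps() returns the listed sitemap URLs, or None if there are none.
for sitemap in parser.site_maps() or []:
    print(sitemap)
```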

For WordPress sites, we recommend using the following text in your robots.txt file:

User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/

Sitemap: https://www.example.com/post-sitemap.xml
Sitemap: https://www.example.com/page-sitemap.xml

This code tells search bots to index all WordPress images and uploaded files. Search bots are not allowed to index WordPress plugin files, the wp-admin folder, the WordPress readme file, or affiliate links.
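If you manage several sites, you may prefer to generate this text rather than type it by hand. The following is only a sketch: the function build_robots_txt and its parameters are our own invention for this example, not part of WordPress or any plugin.

```python
def build_robots_txt(allow, disallow, sitemaps, user_agent="*"):
    """Assemble robots.txt text from lists of paths and sitemap URLs."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Allow: {path}" for path in allow]
    lines += [f"Disallow: {path}" for path in disallow]
    lines.append("")  # blank line before the sitemap entries
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines) + "\n"


text = build_robots_txt(
    allow=["/wp-content/uploads/"],
    disallow=["/wp-content/plugins/", "/wp-admin/", "/readme.html", "/refer/"],
    sitemaps=[
        "https://www.example.com/post-sitemap.xml",
        "https://www.example.com/page-sitemap.xml",
    ],
)
print(text)
```

The output reproduces the recommended WordPress rules, with every rule for the user agent kept together in one group.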

Adding sitemap links to your robots.txt file makes it easy for Google's bots to find all the pages on your site.

Now that you know what a robots.txt file looks like, let's take a look at how to create one in WordPress.

How to create a robots.txt file for WordPress sites?

There are two ways to create a robots file in WordPress. You can choose the method that suits you.

Method 1: Editing the robots.txt file with the Yoast SEO plugin

If you are using the Yoast SEO plugin, you can create and edit the robots.txt file from within the plugin itself.

Log in to your site, then hover your mouse pointer over SEO in the side menu of the WordPress dashboard. A drop-down menu will appear. Choose Tools, then click File Editor.

Editing a Robots.txt File with Yoast SEO

On the next page, inside the Yoast SEO plugin you will find the site's robots file.

If you do not have a robots.txt file, Yoast SEO will generate one for the site.

Create a robots.txt file using the Yoast SEO plugin

A default robots.txt file will be created. Delete the following rules from the file:

User-agent: *
Disallow: /

It is important that you delete this text because it prevents all search engines from crawling a site.
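The difference a single slash makes can be checked with Python's standard urllib.robotparser. This is only a sketch; the helper can_crawl is a name we made up for the example.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(rules: str, url: str) -> bool:
    """Parse robots.txt text and report whether any bot ("*") may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("*", url)


block_all = "User-agent: *\nDisallow: /\n"  # the default text to delete
allow_all = "User-agent: *\nDisallow:\n"    # empty value blocks nothing

print(can_crawl(block_all, "https://example.com/"))  # False
print(can_crawl(allow_all, "https://example.com/"))  # True
```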

After deleting the default text, you can go ahead and add your own robots.txt rules. We recommend using the WordPress robots.txt format that we shared earlier in this article.

Once done, don't forget to click on the “Save robots.txt” button to store your changes.

Method 2: Editing the robots.txt file manually using FTP

For this method, you will need an FTP client to edit the robots.txt file.

Connect to your site's hosting account using FTP.

Once logged in, you will be able to see the robots.txt file in your site's root folder.

Modify robots.txt file in WordPress with FTP

If you do not find this file, it does not exist on your site yet. In that case, you can simply create a new file named robots.txt.

robots.txt is a plain text file, which means you can download it to your computer and edit it using any plain text editor, such as Notepad or TextEdit. After saving the changes, you can upload it back to the site's root folder.
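If you would rather script the upload than use an FTP client, a rough sketch with Python's standard ftplib module could look like this. The host name, user name, and password are placeholders, and public_html is assumed to be the site's root folder, as on a typical cPanel host; adjust all of them to your own hosting details.

```python
from ftplib import FTP
from pathlib import Path


def upload_robots(ftp: FTP, local_file: Path, remote_dir: str = "public_html") -> None:
    """Upload local_file as robots.txt into remote_dir over an open FTP session."""
    ftp.cwd(remote_dir)
    with local_file.open("rb") as handle:
        # STOR overwrites any existing robots.txt in the current directory.
        ftp.storbinary("STOR robots.txt", handle)


def main() -> None:
    # Placeholder host and credentials; replace them with your hosting details.
    with FTP("ftp.example.com") as ftp:
        ftp.login(user="username", passwd="password")
        upload_robots(ftp, Path("robots.txt"))
```

Calling main() performs the upload. If your host supports encrypted connections, ftplib.FTP_TLS can be used in place of FTP in the same way.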


After completing the file, the last step is to submit it to Google through Webmaster Tools (now called Google Search Console) so that search spiders pick up the instructions inside the file.

See this article to learn how to add a site to Webmaster Tools.

If you have a question about the robots.txt file, leave it in the comments below the article.
