=== WordPress Robots.txt optimization (+ XML Sitemap) – Website traffic, SEO & ranking Booster ===
Contributors: the-rock, pagup, freemius
Tags: robots, crawler, search engines, seo, robots.txt
Requires at least: 4.1
Requires PHP: 5.6
Tested up to: 5.8
Stable tag: 1.4.1.1
License: GPLv2 or later
License URI: http://www.gnu.org/licenses/gpl-2.0.html

All-in-One SEO plugin for WordPress robots.txt optimization with XML sitemap detection (Yoast, Rank Math, ...), WooCommerce booster, robots.txt editor, ...

== Description ==

**Better Robots.txt creates a WordPress virtual robots.txt, helps you boost your website's SEO (indexing capacities, Google ranking, etc.) and its loading performance – compatible with Yoast SEO, Google Merchant, WooCommerce and directory-based network sites (MULTISITE).**

With Better Robots.txt, you can identify which search engines are allowed to crawl your website (or not), specify clear instructions about what they are allowed to do (or not) and define a crawl-delay (to protect your hosting server against aggressive scrapers). Better Robots.txt also gives you full control over your WordPress robots.txt content via the custom setting box.

Reduce your [site's ecological footprint and the greenhouse gas (CO2) production](https://better-robots.com/how-to-save-the-planet-with-your-website/) inherent to its existence on the Web.

**A quick overview:**

https://vimeo.com/306589027

**SUPPORTED IN 7 LANGUAGES**

The Better Robots.txt plugin is translated and available in: Chinese – 汉语/漢語, English, French – Français, Russian – Русский, Portuguese – Português, Spanish – Español, German – Deutsch.

**Did you know that...**

* The robots.txt file is a simple text file placed on your web server which tells web crawlers (like Googlebot) whether they should access a file;
* The robots.txt file controls how search engine spiders see and interact with your web pages;
* This file, and the bots that interact with it, are fundamental parts of how search engines work;
* The first thing a search engine crawler looks at when visiting a page is the robots.txt file.

**The robots.txt is a source of SEO juice just waiting to be unlocked. Try Better Robots.txt!**
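For illustration, this is the general shape of an optimized WordPress robots.txt (a sketch only: the domain, the crawl-delay value and the sitemap path are placeholders, not the plugin's exact output):

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    # Illustrative crawl-delay: honored by Bing/Yandex, ignored by Google
    Crawl-delay: 10
    Sitemap: https://www.example.com/sitemap_index.xml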
**About the Pro version (additional features):**

**1. Boost your content on search engines with your sitemap!**

Make sure your pages, articles and products, even the latest, are taken into consideration by search engines! The Better Robots.txt plugin was made to work with the Yoast SEO plugin (probably the best SEO plugin for WordPress websites). It will detect if you are currently using Yoast SEO and if the sitemap feature is activated. If it is, it will automatically add instructions into the robots.txt file asking bots/crawlers to read your sitemap and check if you have made recent changes to your website (so that search engines can crawl the new content that is available). If you want to add your own sitemap (or if you are using another SEO plugin), you just have to copy and paste your sitemap URL, and Better Robots.txt will add it to your WordPress robots.txt.

**2. Protect your data and content**

Block bad bots from scraping your website and commercializing your data. The Better Robots.txt plugin helps you block most popular bad bots from crawling and scraping your data. When it comes to things crawling your site, there are good bots and bad bots. Good bots, like Googlebot, crawl your site to index it for search engines. Others crawl your site for more nefarious reasons, such as stripping out your content (text, prices, etc.) for republishing, downloading whole archives of your site or extracting your images. Some bots have even been reported to pull down entire websites through heavy bandwidth use. The Better Robots.txt plugin protects your website against spiders/scrapers identified as bad bots by Distil Networks.

**3. Hide & protect your backlinks**

Stop competitors from identifying your profitable backlinks. Backlinks, also called “inbound links” or “incoming links”, are created when one website links to another. The link to an external website is called a backlink. Backlinks are especially valuable for SEO because they represent a “vote of confidence” from one site to another. In essence, backlinks to your website are a signal to search engines that others vouch for your content. If many sites link to the same webpage or website, search engines can infer that the content is worth linking to, and therefore also worth showing on a SERP. So, earning these backlinks generates a positive effect on a site’s ranking position or search visibility. In the SEM industry, it is very common for specialists to identify where these backlinks come from (competitors) in order to sort out the best of them and generate high-quality backlinks for their own customers. Considering that the creation of very profitable backlinks for a company takes a lot of time (time + energy + budget), allowing your competitors to identify and duplicate them so easily is a pure loss of efficiency. Better Robots.txt helps you block all SEO crawlers (Ahrefs, Majestic, Semrush) to keep your backlinks undetectable.
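As a sketch of how this kind of blocking works in a robots.txt (an illustrative excerpt, not the plugin's full bad-bot list): AhrefsBot, MJ12bot (Majestic) and SemrushBot are the user-agent tokens used by the three services named above, and a `Disallow: /` rule turns away every crawler that honors robots.txt:

    # Illustrative excerpt; the plugin ships a much longer list
    User-agent: AhrefsBot
    Disallow: /

    User-agent: MJ12bot
    Disallow: /

    User-agent: SemrushBot
    Disallow: /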
**4. Avoid spam backlinks**

Bots that populate your website’s comment forms, telling you ‘great article’, ‘love the info’, ‘hope you can elaborate more on the topic soon’ or even providing personalized comments (including the author's name), are legion. Spambots get more and more intelligent with time, and unfortunately, comment spam links can really hurt your backlink profile. Better Robots.txt helps you prevent these comments from being indexed by search engines.

**5. SEO tools**

While improving our plugin, we added shortcut links to 2 very important tools (if you are concerned about your ranking on search engines): Google Search Console & Bing Webmaster Tools. In case you are not already using them, you can now manage your website's indexing while optimizing your robots.txt! Direct access to a mass-ping tool was also added, allowing you to ping your links on more than 70 search engines. We also created 4 shortcut links to the best online SEO tools, directly available in Better Robots.txt SEO PRO. So, whenever you want, you can check out your site’s loading performance, analyze your SEO score, identify your current ranking on SERPs with keywords & traffic, and even scan your entire website for dead links (404, 503 errors, ...), directly from the plugin.

**6. Be unique**

We thought we could add a touch of originality to Better Robots.txt with a feature allowing you to “customize” your WordPress robots.txt with your own unique “signature”. Most major companies in the world have personalized their robots.txt by adding proverbs (https://www.yelp.com/robots.txt), slogans (https://www.youtube.com/robots.txt) or even drawings (https://store.nike.com/robots.txt – at the bottom). And why not you too? That’s why we have dedicated a specific area on the settings page where you can write or draw whatever you want (really) without affecting your robots.txt's efficiency.

**7. Prevent robots from crawling useless WooCommerce links**

We added a unique feature allowing you to block specific links ("add-to-cart", "orderby", "filter", cart, account, checkout, ...) from being crawled by search engines. Most of these links require a lot of CPU, memory & bandwidth (on the hosting server) because they are not cacheable and/or create "infinite" crawling loops, while being useless for SEO. If you run an online store, optimizing your WordPress robots.txt for WooCommerce provides more processing power for the pages that really matter and boosts your loading performance; the sketch after point 8 below illustrates the kind of rules involved.

**8. Avoid crawler traps**

"Crawler traps" are a structural issue within a website that causes crawlers to find a virtually infinite number of irrelevant URLs. In theory, crawlers could get stuck in one part of a website and never finish crawling these irrelevant URLs. Better Robots.txt helps you prevent crawler traps, which hurt your crawl budget and cause duplicate content.
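Here is a sketch of what such WooCommerce and anti-trap rules can look like. The patterns below are illustrative, based on WooCommerce's default endpoint names, and are not the plugin's exact rule set; note also that wildcard support varies between crawlers:

    User-agent: *
    # WooCommerce links that waste crawl budget
    Disallow: /*add-to-cart=*
    Disallow: /cart/
    Disallow: /checkout/
    Disallow: /my-account/
    Disallow: /*?orderby=
    Disallow: /*?filter
    # Internal search results: a classic crawler trap
    Disallow: /?s=
    Disallow: /search/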
**9. Growth hacking tools**

Today's fastest-growing companies like Amazon, Airbnb and Facebook have all driven breakout growth by aligning their teams around a high-velocity testing/learning process. We are talking about growth hacking. Growth hacking is a process of rapidly experimenting with and implementing marketing and promotional strategies that are solely focused on efficient and rapid business growth. Better Robots.txt provides a list of 150+ tools available online to skyrocket your growth.

**10. Robots.txt Post Meta Box for manual exclusions**

This Post Meta Box allows you to set manually whether a page should be visible (or not) on search engines, by injecting a dedicated "disallow" + "noindex" rule into your WordPress robots.txt. Why is it an asset for your ranking on search engines? Simply because some pages are not meant to be crawled/indexed. Thank-you pages, landing pages and pages containing nothing but forms are useful for visitors but not for crawlers, and you don't need them to be visible on search engines. Also, some pages containing dynamic calendars (for online booking) should NEVER be accessible to crawlers, because they tend to trap them in infinite crawling loops, which directly impacts your crawl budget (and your ranking).

**11. Ads.txt & App-ads.txt crawlability**

To ensure that ads.txt & app-ads.txt can be crawled by search engines, the Better Robots.txt plugin makes sure they are allowed by default in the robots.txt file, no matter your configuration. For your information, Authorized Digital Sellers for Web, or ads.txt, is an IAB initiative to improve transparency in programmatic advertising. You can create your own ads.txt files to identify who is authorized to sell your inventory. The files are publicly available and crawlable by exchanges, Supply-Side Platforms (SSP), and other buyers and third-party vendors. Authorized Sellers for Apps, or app-ads.txt, is an extension of the Authorized Digital Sellers standard. It expands compatibility to support ads shown in mobile apps.

More to come as always …

== Installation ==

= Installing manually =

1. Unzip all files to the `/wp-content/plugins/better-robots-txt` directory
2. Log into WordPress admin and activate the 'Better Robots.txt' plugin through the 'Plugins' menu
3. Go to "Settings > Better Robots.txt" in the left-hand menu to start working on your robots.txt file.

== Frequently Asked Questions ==

= The Better Robots.txt plugin is enabled, so why can't I see any changes in the robots.txt file? =

Better Robots.txt creates a WordPress virtual robots.txt file. Please make sure that your permalinks are enabled from Settings > Permalinks. If permalinks are working, make sure there is no physical robots.txt file on your server. Since the plugin can't write over a physical file, you must connect via FTP and rename or delete robots.txt from your domain root directory. This is usually the /public_html/ folder on cPanel hosting. If you can't find your domain root directory, please ask your hosting provider for help. If the issue persists after taking these measures, please post it in the support section or send a message to support@better-robots.com.

= Will there be any conflict with a robots.txt I'm already using? =

If you have a physical robots.txt on your web hosting server, this plugin will not work. As mentioned, it creates a WordPress virtual robots.txt file. Please follow the steps in the answer above if you want to use this plugin's robots.txt file.

= How to add a sitemap to my WordPress robots.txt? =

This feature is available in the Better Robots.txt Pro version, which automatically adds your sitemap to the robots.txt file. It detects the sitemap from the Yoast SEO plugin. In case you're using a different sitemap plugin or a manually generated sitemap, you can simply add the sitemap URL in the sitemap input field. If Yoast XML sitemaps are also enabled, you need to disable them first by going to Yoast General Settings > Features and disabling the XML sitemaps feature.
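Whichever way the sitemap URL is obtained (detected from Yoast SEO or pasted manually), the result is a single extra directive in your robots.txt; the URL below is a placeholder using Yoast's default sitemap index name as an example:

    Sitemap: https://www.example.com/sitemap_index.xml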
= Why should I optimize the robots.txt? =

Why not? Considering that the robots.txt is the very first file a crawler reads when it visits your website, why not enable crawlers to continuously index your content? Adding your sitemap to the robots.txt is simply common sense. Why? Did you list your website on Google Search Console, or did your webmaster do it? How do you tell crawlers that you have new content available for indexing on your website? If you want this content to be found on search engines (Google, Bing, ...), it has to be indexed. That's exactly what this instruction (adding the sitemap) aims at. One last point: the main reason this plugin exists is that 95% of the time (based on thousands of SEO analyses), the robots.txt is either missing, empty or misused, simply because it is either misunderstood or forgotten. Imagine now if it were activated and fully functional.

= How can this plugin boost my website ranking? =

Actually, this plugin increases your website's indexing capacity, which leads to improved ranking on Google. How? Well, the idea of creating this plugin came after performing hundreds of SEO optimizations on professional and corporate websites. As mentioned before, 95% of the analysed websites did not have what we could call an "optimized" robots.txt file and, while we were optimizing these websites, we realized that simply modifying the content of this file was actually "unlocking" them (based on daily SEMrush analyses). As we were used to working in 2 steps (periods of time), starting with this simple modification was already generating a significant impact on Google ranking, even before we started deeply modifying the content, the site structure or the meta data. The more you help search engines understand your website, the better your chances of getting good results on SERPs.

= How to test and validate your robots.txt? =

While you can view the contents of your robots.txt by navigating to the robots.txt URL, the best way to test and validate it is through the robots.txt Tester option of Google Search Console. Log in to your Google Search Console account. Click on robots.txt Tester, found under the Crawl options. Click the Test button. If everything is OK, the Test button will turn green and the label will change to ALLOWED. If there is a problem, the line that causes the disallow will be highlighted.

= What is a virtual robots.txt file? =

WordPress uses a virtual robots.txt file by default. This means that you cannot directly edit the file or find it in the root of your directory. The only way to view the contents of the file is to type https://www.yourdomain.com/robots.txt in your browser. The default values of the WordPress robots.txt are:

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php

When you enable the “Discourage search engines from indexing this site” option under the Search Engine Visibility settings, the robots.txt becomes:

    User-agent: *
    Disallow: /

which basically blocks all crawlers from accessing the website.

= Why Is Robots.txt Important? =

There are 3 main reasons you’d want to use a robots.txt file.

* Block Non-Public Pages: Sometimes you have pages on your site that you don’t want indexed. For example, you might have a staging version of a page, or a login page. These pages need to exist, but you don’t want random people landing on them. This is a case where you’d use robots.txt to block these pages from search engine crawlers and bots.
* Maximize Crawl Budget: If you’re having a tough time getting all of your pages indexed, you might have a crawl budget problem. By blocking unimportant pages with robots.txt, Googlebot can spend more of your crawl budget on the pages that actually matter.
* Prevent Indexing of Resources: Using meta directives can work just as well as robots.txt for preventing pages from getting indexed. However, meta directives don’t work well for multimedia resources, like PDFs and images. That’s where robots.txt comes into play.

You can check how many pages you have indexed in Google Search Console. If the number matches the number of pages that you want indexed, you don’t need to bother with a robots.txt file. But if that number is higher than you expected (and you notice indexed URLs that shouldn’t be indexed), then it’s time to create a robots.txt file for your website.

= Robots.txt vs. Meta Directives =

Why would you use robots.txt when you can block pages at the page level with the “noindex” meta tag? As mentioned before, the noindex tag is tricky to implement on multimedia resources, like videos and PDFs. Also, if you have thousands of pages that you want to block, it’s sometimes easier to block an entire section of the site with robots.txt than to manually add a noindex tag to every single page. There are also edge cases where you don’t want to waste any crawl budget on Google landing on pages with the noindex tag.
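To make the contrast concrete (the path is a placeholder for illustration): a single robots.txt rule keeps crawlers out of a whole section, whereas the `<meta name="robots" content="noindex">` tag must be emitted on every single page of that section:

    # One rule covers the entire section
    User-agent: *
    Disallow: /members-area/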
= Important things about robots.txt =

* Robots.txt must be in the main folder, i.e., domain.com/robots.txt.
* Each subdomain needs its own robots.txt (sub1.domain.com, sub2.domain.com, ...), while multisites require only ONE robots.txt (domain.com/multi1, domain.com/multi2, ...).
* Some crawlers can ignore robots.txt.
* URLs and the robots.txt file are case-sensitive.
* Crawl-delay is not honored by Google (as it has its own crawl budget), but you can manage crawl settings in Google Search Console.
* Validate your robots.txt file in Google Search Console and Bing Webmaster Tools.
* Don’t block crawling to avoid duplicate content, and don’t disallow pages which are redirected: crawlers won’t be able to follow the redirect.
* The max size for a robots.txt file is 500 KB.

**PS: Pagup recommends the [Site Kit by Google](https://wordpress.org/plugins/google-site-kit/) plugin for insights & SEO performance.**

== Screenshots ==

1. Better Robots.txt Settings Page
2. Better Robots.txt Settings Page
3. Better Robots.txt Settings Page
4. Better Robots.txt Settings Page
5. Robots.txt file output

== Changelog ==

= 1.0.0 =
* Initial release.

= 1.0.1 =
* fixed plugin directory url issue
* some text improvements

= 1.0.2 =
* fixed some minor issues with styling
* improved text and translation

= 1.1.0 =
* added some major improvements
* allow/off option changed to allow/disallow/off
* improved overall text and french translation

= 1.1.1 =
* fixed a bug and improved code

= 1.1.2 =
* added new feature "Spam Backlink Blocker"

= 1.1.3 =
* fixed a bug

= 1.1.4 =
* added new "personalize your robots.txt" feature to add a custom signature
* added recommended seo tools to improve search engine optimization

= 1.1.5 =
* added feature to detect physical robots.txt file and delete it if server permissions allow

= 1.1.6 =
* added russian and chinese (simplified) languages
* fixed bug causing redirection to better robots.txt settings page upon activating other plugins

= 1.1.7 =
* added new feature: Top plugins for SEO performance
* fixed plugin notices issue to dismiss for a defined period of time after being closed
* fixed stylesheet issue to get proper updated file after plugin update (cache buster)
* added spanish and portuguese languages

= 1.1.8 =
* added new feature: xml sitemap detection
* fixed translations

= 1.1.9 =
* added new feature: loading performance for woocommerce

= 1.1.9.1 =
* fixed a bug in disallow rules for woocommerce

= 1.1.9.2 =
* boost your site with alt tags

= 1.1.9.3 =
* fixed readability issues

= 1.1.9.4 =
* fixed default robots.txt file issue upon first plugin activation
* fixed php error upon saving settings and permalinks
* refactored code

= 1.1.9.5 =
* added clean-param for yandex bot
* ask backlinks feature for pro users
* avoid crawler traps feature for pro users
* improved default robots.txt rules

= 1.1.9.6 =
* added 150+ growth hacking tools
* fixed layout bug
* updated default rules

= 1.2.0 =
* Added Post Meta Box to disable individual posts, pages and products (woocommerce pro only). It will add a Disallow and Noindex rule in robots.txt for any page you choose to disallow from the post meta box options.

= 1.2.1 =
* Added multisite feature for directory based network sites (pro only). It can duplicate all default rules, yoast sitemap, woocommerce rules, bad bots, pinterest bot blocker, backlinks blocker etc. with a single click for all directory based network sites.
* Added version timestamp for wp_register_script 'assets/rt-script.js'

= 1.2.2 =
* Fixed some bugs creating errors in google search console
* Text improvements

= 1.2.3 =
* Added "Hide your robots.txt from SERPs" feature
* Text improvements

= 1.2.4 =
* Fixed a bug
* Text improvements

= 1.2.5 =
* Fixed crawl-delay issue
* Updated translations

= 1.2.5.1 =
* Fixed a minor issue

= 1.2.6 =
* Patched security issue in freemius sdk

= 1.2.6.1 =
* Fixed multisite issue for pro users

= 1.2.6.2 =
* Fixed Yoast sitemap issue for multisite users

= 1.2.6.3 =
* Fixed some text

= 1.2.7 =
* Added Baidu/Sogou/Soso/Youdao (Chinese search engines) features for pro users
* Added social media crawl feature for pro users

= 1.2.8 =
* Notifications will be disabled for 4 months. Fixed some other minor stuff

= 1.2.9.2 =
* Updated Freemius SDK to v2.3.0
* BIGTA recommendation

= 1.2.9.3 =
* Fixed undefined index error while saving MENUS for some sites
* Removed "noindex" rule for individual posts as Google will stop supporting it from Sep 01, 2019

= 1.3.0 =
* Added 5 new rules to default config. Removed 4 old default rules which were causing some issues with WPML
* Added a search rule to avoid crawling traps
* Added several new rules to Spam Backlink Blocker
* Fixed security issues

= 1.3.0.1 =
* VidSEO recommendation

= 1.3.0.2 =
* Fixed some security issues
* Added new rules to Backlink Protector (Pro only)
* Multisite notification will be disabled permanently once dismissed

= 1.3.0.3 =
* Fixed php notice (in php log) for $host_url variable

= 1.3.0.4 =
* Fixed php notice (in php log) for $active_tab variable
* Fixed some typos

= 1.3.0.5 =
* Added option to be part of our worldwide movement against CoronaVirus (Covid-19)
* Fixed several php undefined index notices (in php log) related to Step 7 and 8 options

= 1.3.0.6 =
* 👌 IMPROVE: Updated freemius to latest version 2.3.2
* 🐛 FIX: Some minor issues

= 1.3.0.7 =
* 🔥 NEW: WP Google Street View promotion
* 🐛 FIX: Some minor text issues

= 1.3.1.0 =
* 👌 IMPROVE: Admin notices are permanently dismissed per user.
* 👌 IMPROVE: Top-level menu for Better Robots.txt settings
* 🐛 FIX: Styling conflict with Norebro theme.
* 🐛 FIX: Undefined variable php errors for some options

= 1.3.2.0 =
* 🐛 FIX: XSS vulnerability.
* 🐛 FIX: Non-static method errors
* 👌 IMPROVE: Tested up to WordPress v5.5

= 1.3.2.1 =
* 🐛 FIX: Call to undefined method error.

= 1.3.2.2 =
* 👌 IMPROVE: Updated Freemius to v2.4.1

= 1.3.2.3 =
* 👌 IMPROVE: Tested up to WordPress v5.6
* 🐛 FIX: Get Pro URL

= 1.3.2.4 =
* 👌 IMPROVE: Added some more rules for WooCommerce performance
* 👌 IMPROVE: Updated Freemius to v2.4.2

= 1.3.2.5 =
* 🔥 NEW: Meta Tags for SEO promotion

= 1.4.0 =
* 👌 IMPROVE: Refactored code to MVC
* 👌 IMPROVE: New clean design
* 👌 IMPROVE: Many small improvements

= 1.4.0.1 =
* 🐛 FIX: Added trailing backslash for using trait

= 1.4.1 =
* 🔥 NEW: Search engine visibility feature (Pro version)
* 🔥 NEW: Image crawlability feature (Pro version)

= 1.4.1.1 =
* 🐛 FIX: Sitemap issue