
Shopify robots.txt Guide (How to Edit & Why It’s Useful)

Published on November 20, 2024 | Updated on November 20, 2024


The robots.txt file is one of the most important components of any website, including Shopify stores. It tells the web crawlers of search engines (Google, Bing, Yahoo, etc.) which parts of the site they should and shouldn't crawl and index. However, a Shopify robots.txt works slightly differently from the one on an ordinary website.

For Shopify store owners, the difference comes from the fact that their website is hosted on Shopify's servers. Because Shopify controls the hosting environment, handling robots.txt works differently than it does on a website you host yourself.

In this guide, we will cover everything you need to know about the Shopify robots.txt file, including how to change it and what to do if your pages get blocked by robots.txt on Shopify.

Why trust us 

  • We're the creators of Bloggle, a dynamic Shopify blog builder available on the Shopify App Store that fills the gaps in native Shopify blogging capabilities.
  • We're a global force: 2,000+ merchants across 60 countries have trusted us to amplify their voices.
  • Your peers adore us: we have a stellar 4.9/5 rating on the App Store.
  • We've already empowered 55,000+ blogs written using our versatile app.
  • Under our guidance, users have reported up to a 10x boost in Search Engine Optimization (SEO) traffic and revenue.

The Basics of robots.txt

Robots.txt is essentially a standard that websites use to communicate with search engine web crawlers and other automated robots.

It is a part of the Robots Exclusion Protocol (REP) — a group of standards that regulate how robots crawl the web, index website content, and serve that content to users.

The most common uses of robots.txt on any website are:

  • Directing Web Crawlers: The robots.txt file tells search engine crawlers which URLs they can (and cannot) request from the website.
  • Preventing Overload: The robots.txt file keeps your site from being overloaded by requests from crawlers and other search engine bots.
  • Securing Content: Although not a security measure by default, robots.txt can act like one by requesting that crawlers not index certain areas of the website that might be considered sensitive.

Common robots.txt Directives

As we previously mentioned, the most common use of robots.txt is giving directives to web crawlers. The file contains directives and commands that control crawlers' access to specific areas of your website. Here are the most commonly used directives in a robots.txt file:

User-agent

The User-agent directive specifies which web crawler the following rules apply to. If you want your rules to apply to all web crawlers, you can use an asterisk (*) as a wildcard.

Example:

User-agent: * — This applies the rules to all web crawlers.
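To make the grouping concrete, here is a minimal sketch with two groups — one set of rules for Googlebot and a fallback for every other crawler (the paths are placeholders, not recommendations). Each group starts with a User-agent line, and the rules beneath it apply until the next User-agent line:

User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow: /private/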

Disallow

The Disallow directive tells the crawler not to access certain parts of your site. If you want to block access to a specific folder or page, you should use this directive.

Example:

Disallow: /private/ — This prevents crawlers from accessing anything in the "/private/" directory.

Allow

The Allow directive is used to override a Disallow directive, indicating that a crawler can access a specific file or folder within a disallowed directory. This is useful for allowing access to certain content in a directory that is otherwise blocked.

Example:

Disallow: /private/

Allow: /private/public-file.html — This configuration blocks all content in the "/private/" directory except for "public-file.html".

Sitemap

The Sitemap directive points search engines to your XML sitemap, a file that lists all the important pages on your site. This can help crawlers discover pages they might otherwise miss.

Example:

Sitemap: http://www.example.com/sitemap.xml — This tells crawlers where to find your sitemap.

Crawl-delay

The Crawl-delay directive is used to limit how quickly a crawler can request content from your site, preventing server overload. However, not all search engines adhere to this directive.

Example:

Crawl-delay: 10 — This asks crawlers to wait 10 seconds between hits to your server.

Comments

Comments can be added to a robots.txt file using the hash symbol (#). These are for human readers and are ignored by crawlers.

Example:

# This is a comment
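Putting the directives above together, a small, self-contained robots.txt might look like this sketch (the domain and paths are placeholders):

# Rules for all crawlers
User-agent: *
Crawl-delay: 10
Disallow: /private/
Allow: /private/public-file.html

# Where to find the sitemap
Sitemap: http://www.example.com/sitemap.xml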

Implications of robots.txt Directives

There are several things you need to remember about robots.txt directives and how you use them:

  • Specificity: How Allow and Disallow rules interact depends on the crawler. Major crawlers such as Googlebot apply the most specific (longest) matching rule regardless of order, while some other crawlers process rules in order, so write specific rules carefully and unambiguously (see the example after this list).
  • Not Enforceable: Robots.txt directives are requests, not enforceable rules. Respectful web crawlers follow them, but malicious bots might ignore them.
  • No Personal or Sensitive Information: Since the robots.txt file is publicly accessible, don't use it to hide sensitive information.
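To illustrate the specificity point, consider this sketch (placeholder paths). For a crawler using longest-match resolution, such as Googlebot, a URL like /shop/products/widget remains crawlable because the Allow rule's path is longer, and therefore more specific, than the Disallow rule's:

User-agent: *
Disallow: /shop
Allow: /shop/products/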

Shopify robots.txt Default File and Settings

As we previously mentioned, the Shopify robots.txt file is different from the robots.txt files used on many other websites. Its purpose is the same — helping search engines crawl and index the pages on your website efficiently — but Shopify's default settings can be limiting for some merchants.

Shopify's robots.txt is optimized to prevent search engines from indexing pages that might be duplicate content, private, or irrelevant to search engine users, such as admin, checkout, and cart pages. Here is what the default robots.txt setup on Shopify does:

  • Disallow search engines from indexing certain areas of your site, such as the cart, orders, and admin sections. These areas are not beneficial to rank in search engine results and are generally considered private.
  • Allow search engines to index the main content areas of your site, such as product pages, categories (collections), and blog posts. This ensures that the most valuable content is visible in search results.
  • Specify the sitemap location, so search engines can easily find and crawl your site's content. Shopify automatically generates sitemaps for your store, which are referenced in the robots.txt file (see the abridged sketch after this list).
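For reference, here is an abridged sketch of what the default file typically contains. The exact contents vary by store and change over time, so check yourstore.com/robots.txt to see the real thing:

User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /account
Disallow: /collections/*sort_by*
Disallow: /search
Sitemap: https://yourstore.com/sitemap.xml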

Before June 2021, Shopify didn't allow its users to modify or edit these default settings. This inability to change the Shopify robots.txt file caused many issues.

As a result, merchants turned to third-party apps to override Shopify's defaults, which led to some unfortunate outcomes, including entire websites accidentally being blocked from search engines. Shopify received harsh criticism from some users for doing nothing to solve the issue.

The limitations of this system were clear. The most obvious was the inability to edit and further customize the file itself. The default setup also wasn't optimal for all stores, especially large ones with many products, where it could lead to over-indexing of similar pages and hurt SEO.

Without editing powers, store owners couldn't block specific content from being indexed, such as certain product pages or collections they didn't want to appear in search results.

Customizing robots.txt in Shopify

Finally, in June 2021, Shopify released an update allowing the robots.txt file to be edited and customized. As a Shopify website owner, you now have better control over your website: you can tell crawlers which pages you want indexed and which ones to hide.

This update completely changed the game. There is no longer any need to install third-party software to bypass Shopify's default robots.txt settings; you can now change your Shopify robots.txt in a straightforward way:

  • Access Your Shopify Admin: Log in to your Shopify admin dashboard.
  • Edit Code: Navigate to Online Store > Themes. Here, you'll find your current theme listed with an Actions button associated with it. Click on Actions, and then select Edit code.
  • Locate or Add robots.txt.liquid: Once in the code editor, locate or add a robots.txt.liquid file in the Templates directory. If it doesn't already exist, create it by clicking Add a new template, selecting robots.txt from the dropdown, and then clicking Create a template.
  • Customize Your robots.txt File: Now, in the robots.txt.liquid file, you can start customizing your directives. Shopify uses Liquid, a template language, which means you can also use Liquid objects and tags to dynamically generate content for your robots.txt file if needed (see the sketch after this list).
  • Save Your Changes: Once you've made your desired changes, click Save to apply them. Your robots.txt file is now customized and will guide search engine crawlers according to your specifications.
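As a sketch of what a customized robots.txt.liquid can look like, the snippet below renders Shopify's default rules through the robots.default Liquid object and appends one extra Disallow rule to the catch-all group. The /internal-search/ path is a hypothetical example, not a rule to copy as-is:

{% for group in robots.default %}
  {{- group.user_agent }}

  {%- for rule in group.rules %}
    {{ rule }}
  {%- endfor %}

  {%- comment -%} Hypothetical custom rule, appended to the "*" group only {%- endcomment -%}
  {%- if group.user_agent.value == '*' %}
    {{ 'Disallow: /internal-search/' }}
  {%- endif %}

  {%- if group.sitemap != blank %}
    {{ group.sitemap }}
  {%- endif %}
{% endfor %}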

Common mistakes to avoid when editing robots.txt

If you decide to edit the Shopify robots.txt file manually, you need to be careful. Mistakes in this file can cause SEO problems and can even get important pages blocked by robots.txt on Shopify. The most common mistakes are:

1) Blocking everything by accident. Disallow: / (the root path) tells crawlers to avoid your entire site, while an empty Disallow: blocks nothing, so double-check the path you specify (see the example after this list).

2) Using the Disallow command to protect sensitive pages. Since robots.txt is publicly readable, listing those URLs only advertises them; use other methods, such as password protection.

3) Overusing wildcards. Broad patterns can accidentally block (or allow) crawler access to pages you didn't intend.

4) Not testing changes. Always test your changes after you make them, for example with the robots.txt report in Google Search Console.

5) Forgetting the sitemap. Leaving the Sitemap directive out of your robots.txt file can hurt your SEO; a sitemap helps search engines crawl your website far more efficiently.

6) Using comments incorrectly. Comments are added with the "#" symbol. Misplaced or malformed comments can cause confusion when your directives are read, so make sure your comments are clearly separated from your directives.
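To illustrate mistake 1), compare the following lines — a single slash decides whether you block everything, nothing, or just one directory:

Disallow: /          # blocks the entire site
Disallow:            # blocks nothing
Disallow: /private/  # blocks only the /private/ directory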

How robots.txt Settings Can Impact a Shopify Store's SEO

The Shopify robots.txt file has a massive impact on SEO, since it directs search engine bots on how to crawl and index a website.

This is especially important for Shopify stores, since they are geared towards potential buyers. Bringing more potential buyers to your shop is critical, and SEO plays a key role in that.

The main benefits of optimizing robots.txt correctly are avoiding indexation of duplicate and non-public pages, making your key selling pages visible, and, to some extent, enhancing the security and speed of your website.

Can customizing my robots.txt file improve my Shopify store's SEO?

The answer is yes! Guiding search engines towards your most important content strengthens your SEO efforts and can result in more traffic and conversions for your shop.

Is it possible to block specific crawlers from accessing my Shopify store?

Yes. You can block specific crawlers by naming them in a "User-agent" directive and pairing it with a "Disallow" rule in your Shopify robots.txt file.
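For example, to block one crawler while leaving all others unaffected (ExampleBot is a placeholder; substitute the user-agent token of the bot you want to block):

User-agent: ExampleBot
Disallow: /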

Can I use robots.txt to hide my Shopify store from search engines completely?

Yes, it can be done with the "User-agent" and "Disallow" directives, as shown below. However, we don't recommend doing that, since it won't benefit your Shopify store.
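The combination that hides the whole store looks like this — use it only if you genuinely want to keep all crawlers out:

User-agent: *
Disallow: /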

Is it necessary to have a robots.txt file for my Shopify store?

Strictly speaking, a robots.txt file is not required, and Shopify generates one for your store automatically. Keeping it in good shape is highly recommended, though, because helping search engines index your website properly is highly beneficial for your organic traffic and, subsequently, the number of your customers.

Conclusion

Managing the robots.txt of your Shopify store is key to your SEO efforts. It guides search engines towards your most important content, potentially increasing the traffic and sales of your shop.

However, optimizing it incorrectly can do more harm than good. Shopify's default settings will be enough for most users, but if you need to customize the file further, don't forget to always test the changes you make.
