
New robots.txt control in MODX Cloud

Today we’re announcing a new feature in MODX Cloud that streamlines how robots.txt files are handled and adds the ability to serve a unique robots.txt file per hostname for multisite installations.

What is robots.txt?

A /robots.txt is an optional file that lets a webmaster explicitly tell well-behaved web robots, such as search index spiders, how they should crawl a website. If no robots.txt file is present, most robots will proceed with crawling and indexing a site.
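For example, a robots.txt like the following asks all robots to skip a hypothetical /private/ directory while leaving the rest of the site crawlable:

User-agent: *
Disallow: /private/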

This is useful when site owners use a dev or staging site for ongoing work and a separate production site where changes and updates are deployed. You can tell web robots to ignore the dev site while allowing indexing on the production site.

Robots.txt in MODX Cloud

Previously, MODX Cloud controlled whether a custom robots.txt file would be served based on a toggle in the Dashboard. While this was useful, flipping that toggle made it possible to accidentally allow indexing on staging/dev sites, or, just as easily, to disallow indexing on a production site.

Today, we’re removing this interface completely and relying on the presence of robots.txt files on the filesystem, with one exception: any domain that ends in modxcloud.com will be served a Disallow: / directive to all user agents, regardless of whether a robots.txt file is present.
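In practice, that means a robots.txt request for any hostname ending in modxcloud.com, for example a hypothetical mysite.modxcloud.com, always receives a response equivalent to:

User-agent: *
Disallow: /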

For production sites (ones that get real visitor traffic), you’ll need to use a custom domain if you want your site to be indexed.

Serve unique robots.txt files per hostname in MODX Cloud

Some organizations use MODX Revolution to run multiple websites from a single installation using Contexts. This might apply, for example, to a public-facing marketing site combined with landing-page microsites and possibly a non-public intranet.

Most site owners want their sites indexed. In MODX Cloud, all sites with custom hostnames fall back to serving any robots.txt file uploaded to the web root, usually with the following content:

User-agent: *
Disallow: 

However, for a hypothetical intranet using intranet.example.com as its hostname, you wouldn’t want it indexed. Traditionally, this was tricky to accomplish on multisite installs because they shared a single web root. In MODX Cloud, it’s easy: simply upload an additional file to your web root named robots-intranet.example.com.txt with the following content. It will block indexing by well-behaved robots on that hostname, while all other hostnames fall back to the standard robots.txt file unless they have hostname-specific files of their own:

User-agent: *
Disallow: /
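Putting it together, a hypothetical web root for this setup would contain both files, and MODX Cloud serves whichever matches the requested hostname:

robots.txt (served for any custom hostname without its own file)
robots-intranet.example.com.txt (served only for intranet.example.com)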

Do I need to do anything?

All new Clouds will work as described above starting now. Please refer to our note on robots.txt behavior for Clouds created prior to October 19, 2017.

Learn more

Understanding how robots.txt affects your sites in search engines is an important aspect of website management. Learn more about robots.txt at robotstxt.org, and bookmark our documentation on robots.txt handling in MODX Cloud. If you want to start using this new capability in MODX Cloud, log in to your Dashboard or create an account today.
