Resolved: Intermittent Service Interruptions for Cloud Customers

By Jay Gilmore  |  September 26, 2017  |  3 min read

In the last few weeks, more than a few (any is too many) MODX Cloud customers experienced unexplained intermittent outages and 502/504 errors on our MODX Cloud public platforms in Amsterdam and Texas. We know how upsetting this is for you and for your clients. We hate downtime as much as you do, and this frustrating problem has pained us too.

We have been working on this since the outages began. We even brought in additional expertise to add new monitoring and scrutiny to the operation and performance of the MODX Cloud platforms.

We recently discovered that the issues affected only sites running PHP 5.6, but we could not figure out exactly why 5.6 was affected and 7.1 was not. We initially chalked it up to the massive differences in PHP 7.1 and how much more efficient it is.

Eureka!

On Friday, September 22, 2017, Elizabeth, our Systems Administrator, was reviewing some data and found something that, to her, looked like it was related to PHP's OPcache code (a standard code caching mechanism in PHP) that corresponded to an event on the server. I had an idea to check Google and found a report of a recent issue with PHP's OPcache. It turns out that a change in the behavior of how OPcache works was introduced in PHP 5.6.29. This change created a condition that would cause a deadlock in all running PHP instances, across all pools and masters, thus returning errors related to PHP vanishing or timing out (and the 502s/504s you'd see on the front end).

This answered our question as to why restarting PHP for the affected Clouds would resolve the issue for customers, only for it to recur later. Under normal circumstances, PHP would time out and recover resources on its own; once deadlocked, it could not, so only a restart cleared the problem. Along the way, we found several other issues that we thought were the cause. None of them turned out to be the root cause, though each contributed in its own way.

In the interest of stabilizing the servers, we downgraded all public platforms back to PHP 5.6.26. Our monitoring shows that the PHP lockups have stopped. We will continue to monitor performance and health on our servers, and we're hopeful that this critical bug is resolved in an upcoming PHP 5.6.x release.
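If you want to check a PHP version string against the boundary described above, here is a minimal sketch in Python. It assumes only what this post states: the OPcache behavior change arrived in PHP 5.6.29. The function name and the constant are hypothetical, and there is no upper bound because no fixed 5.6.x release is named here. The version string could come from `php -v` or the `PHP_VERSION` constant.

```python
import re

# Assumption taken from this post: the OPcache behavior change
# first shipped in PHP 5.6.29. No fixed release is known here,
# so every later 5.6.x patch level is treated as affected.
FIRST_AFFECTED_PATCH = 29


def is_affected_php_version(version: str) -> bool:
    """Return True for PHP 5.6.x releases at or after 5.6.29.

    Hypothetical helper for illustration; it only compares version
    numbers and does not probe a running PHP installation.
    """
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized PHP version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor) == (5, 6) and patch >= FIRST_AFFECTED_PATCH


if __name__ == "__main__":
    for candidate in ["5.6.26", "5.6.29", "5.6.31", "7.1.9"]:
        print(candidate, is_affected_php_version(candidate))
```

A check like this only tells you whether a version falls in the range discussed in this post; it says nothing about whether OPcache is enabled or whether a given server will actually hit the deadlock.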

The End is Near for PHP 5.6

In recent years, the PHP oversight group has increased the frequency of releases and published the support schedule for releases of PHP. PHP 5.6 is the last release of PHP 5 and will no longer be supported after December 31, 2018.

Current MODX Revolution, and most modern CMS and web applications, will run better and faster on PHP 7+. You can use PHP 7 now in MODX Cloud by switching to it in the Cloud Edit Interface on the Web Server Tab. Soon we'll be switching new Clouds to be PHP 7 by default.

We'll continue to support PHP 5.6 into 2018. However, we encourage all our customers to start moving their sites to PHP 7 now.

If you're not sure whether your site or app can run on PHP 7, you can always clone your site by creating a new Cloud and restoring a backup into that new Cloud. Once you've verified the clone is working, you can enable PHP 7 on it and test it thoroughly.