Optimization with getCache and Custom Cache Partitions

The Problem: Entire Cache Cleared When Anything Changes

A-Always, B-Be, C-Caching.

A complaint I've been hearing more and more often about the MODX Revolution caching features is that any time someone edits anything, the entire cache is cleared. This results in the next visitor experiencing longer load times for the Resource. The most common reaction is to want only the specific item being edited to have its cache removed. On the surface, that does seem like a problem rather than a feature. I mean, as advanced and flexible as MODX is, only updating the parts of the cache that are affected by the change would seem a fairly simple challenge.

Though my first reaction to this is to point out that it means your Resources are simply not efficient enough when not cached, I'd like to better explain why this is a particularly stubborn hurdle to clear consistently, and to demonstrate an alternative way to address the problem without having to develop a solution from scratch.

A-B-C. A-Always, B-Be, C-Caching. Always Be Caching, Always Be Caching.

The Roadblock: Unstructured Cache Dependencies

It is precisely because of the flexibility of MODX that updating only changed parts of the cache becomes a daunting task. You can do just about anything within MODX Resources and/or Elements, and so attempting to predict which Resources and/or Elements are affected by a particular change is simply impossible in a generic solution. Though we will certainly be expanding and changing the way the cache is manipulated by many of the core processors, providing additional options for more advanced control over what is and isn't flushed from or updated in the cache, the better solution requires you, as a site owner, to develop a caching strategy tailored to your site content and publishing requirements.

So if we accept the fact that the unstructured dependencies inherent in a MODX site prevent easy calculation of when and where to refresh which parts of the cache, and that MODX already does a pretty good job of handling the cache dependencies as best it can, our next step is to find the best path around this roadblock. Certainly there are many to choose from, but for now let's explore one of the possible paths I have been experimenting with.

The Detour: getCache and Custom Cache Partitions

A Snippet I created and released a couple of years ago, early in the MODX Revolution release history, along with the ability to define and use custom cache partitions in 2.1 and later releases, provides one of the most direct paths around the problem I can think of. The getCache Extra, which I featured in this blog post, was created to cache the output from any MODX Element uniquely based on the Element, its properties, and any parameters used in the request that calls getCache. In addition, it allows you to define exactly where, how, and for how long to cache that output. By taking advantage of this Extra, you can apply a highly granular caching strategy to your MODX site by simply caching the output of resource-intensive or long-running Elements outside of the standard cache partitions.

Let's look at a simple example using getCache that can reduce load times for Resources even if your publishing activities cause the entire Resource cache to be flushed more often than you would like.

[[!getCache@myGetResourcesPropertySet?
    &element=`getResources`
    &cacheKey=`persist_forever`
    &cacheExpires=`0`
    &parents=`[[*id]]`
]]

So what exactly does this do when it gets executed? Let's break it down:

  1. getCache calculates a unique hash for the request based on the Element, its properties, and all $_REQUEST parameters. Alternatively, you can specify your own &cacheElementKey property to use instead.
  2. getCache checks for the cached data in the cache partition specified by the cacheKey property, i.e. persist_forever.
  3. getCache gets the cached output if found and not expired (cacheExpires=0 means it never expires), or processes the Element, caches the data for all subsequent matching requests, and returns it as the Element normally would.

Pretty simple, but the reduction in overhead on a site where the Resource cache is flushed more often can be incredible, since all of the output from the Elements you wrap with getCache can be cached permanently, or until you decide it needs to be flushed. This gives you the potential for incredibly precise control over just about any aspect of your site's caching.
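
To make the three steps above concrete, here is a rough sketch of the same idea written as a standalone Snippet. It is a simplification for illustration only, not the actual getCache source; the hash recipe, the hard-coded values, and the miss check are all assumptions on my part. It uses the standard cache manager calls get() and set() together with runSnippet().

```php
<?php
// Illustrative Snippet body: a simplified sketch of the idea behind getCache,
// NOT its real implementation. All hard-coded values are assumptions.
$element    = 'getResources';      // the Snippet whose output should be cached
$partition  = 'persist_forever';   // custom cache partition (the cacheKey property)
$expires    = 0;                   // lifetime in seconds; 0 means never expire
$properties = array('parents' => $modx->resource->get('id'));

// 1. Build a key from the Element, its properties, and the request parameters.
//    (Assumed recipe; getCache may compose its hash differently.)
$key = md5($element . serialize($properties) . serialize($_REQUEST));

// 2. Look for an existing entry in the custom partition.
$options = array(xPDO::OPT_CACHE_KEY => $partition);
$output  = $modx->cacheManager->get($key, $options);

// 3. On a miss, process the Element, store the output, and return it.
//    Anything that is not a string is treated as a miss here.
if (!is_string($output)) {
    $output = $modx->runSnippet($element, $properties);
    $modx->cacheManager->set($key, $output, $expires, $options);
}
return $output;
```

Because the entry lives in its own partition, it stays in place until that partition is explicitly cleared, regardless of how often the Resource cache is flushed.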

You can find additional information on getCache at the following links:

The Continuing Adventure: When and How to Refresh Custom Cache Partitions

So now that we know we can cache the output outside of the standard MODX partitions, in custom partitions that are not automatically refreshed by any action in the core, it's time to return to the original problem. When do we clear the data we have decided to store in our custom partition(s)? The easy answer is that we clear it when it needs to be cleared, and that can be handled through human intervention by attaching a plugin to the OnSiteRefresh event or creating a custom Menu item to trigger the refresh on the custom partition(s). The harder answer is automating it when specific items that should force recalculation of the content are edited in the manager, and unfortunately that will depend entirely on the content structure of your site and the publishing habits of your editors.
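
For the easy answer, a plugin along these lines should be enough. Treat it as a minimal sketch: it assumes the plugin is attached to the OnSiteRefresh system event and that your custom partition is named persist_forever, as in the getCache example earlier; adjust the partition name(s) to match your own cacheKey values.

```php
<?php
// Plugin code; attach it to the OnSiteRefresh system event in the manager.
// 'persist_forever' is the custom partition name assumed from the earlier
// getCache example. Add one array entry per custom partition you use.
if ($modx->event->name === 'OnSiteRefresh') {
    $modx->cacheManager->refresh(array(
        'persist_forever' => array(),
    ));
}
```

With this in place, clearing the site cache from the manager also empties the custom partition, so the wrapped Elements are reprocessed on their next request.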

Once again, with great freedom comes great responsibility.

You can learn more about MODX Caching and the xPDO features under the hood at:

About
Jason is one of the founding members of MODX, and now holds the title of Chief Architect. Because of his love for green chile and the mountains, he resides in Taos, New Mexico, with his wife, daughter, and son. From Taos, he enjoys hiking, camping, snowshoeing, exploring, and photographing the American West with his family. Jason is also an avid drummer and capable pool player. Author of xPDO and architect of MODX Revolution, he leads the development and innovation team for the MODX Content Management System.

http://www.jasoncoward.com/


6 Comments


    1. Gauke Pieter
      Nov 02, 2012 at 03:26 AM
      Nice one Jason. We've been using getCache for exactly that. Works like a charm!

      1. YJ Tso
        Nov 12, 2012 at 03:33 PM
        This is great, Jason!

        Question: if you've got &cacheExpires=`0` what's the best way to manually refresh the cached element? Change a property and change it back? Or can you pass it a property it doesn't accept, like &refresh=`1` and just change that every time you want to do a "hard-reset"?

        1. Jason Coward
          Nov 12, 2012 at 07:22 PM
          That's what the plugin/menu item is for; flush the custom partitions manually. You'll have to develop a strategy using this approach, or develop a custom manager page for more detailed management of said custom cache partitions, but as usual, there are a lot of options available. Generally though, attaching a plugin to the main OnSiteRefresh event lets you do it manually at the click of a button.

        2. Oliver Haase-Lobinger
          Nov 13, 2012 at 02:41 AM
          Hi there,

          Maybe every resource could have something like a "list of ingredients" where all external elements are listed with their status (unchanged, changed). If any of them was changed, then the resource needs a complete cache refresh. This list might have to be created recursively to catch all subdependencies. Easy to say but not as easy to code.

          So, maybe an addition to the resource option "clear cache on save" could be an "and rebuild" function. For better accessibility there could be a button in the header, saying "Save + Update Cache". This way a page would get saved AND the cache would be created/overwritten "instantly" (after some processing time).

          Also, a system-wide menu function "Refresh Cache for all Resources" would put more cache control into the hands of the admins. This could be done right now without a change to the current caching strategy. It might take a little more processing time, but then the admin could be sure that all pages are cached. This wouldn't be a function one should use every five minutes, but something that would help right now. For better usability this "Cache Refresh" could say how many resources would have to be recached. Once the process has started, you would see a "Not cached" marker changing to "Cached". Maybe some AJAX voodoo could help with that and prevent script timeouts.

          cheers and still: "MODX rocks!"
          Oliver

          1. Sascha Merkofer
            Nov 13, 2012 at 03:31 AM
            The thing with the «list of ingredients» is that:

            1) this would need to be an API

            2) every and all Extras would need to hook-in for this to work

            So even a little snippet, that grabs some resources, would then need to get/set the «I changed something» flag in the right places.

            I can see the «rebuilding cache» as an option for some scenarios. This falls apart on big sites though, since every save would take a huge hit on the server.

            Imagine 10 users saving a resource at the same time. That would rebuild the cache 10 times – at the same time.

            Don't want to destroy all your ideas here, but I think the «solution» is

            A) not a «rule them all» one and

            B) should involve thoughts on limiting the consumption of resources a single page-view can cause in the first place.

            So, for example: not listing 400 rendered resources in one go – but paginate and lazy-load them.

            Cheers, Sascha
