ETC2

CS290F Fall 2006 - UCSB Computer Science - Thorsten von Eicken


How to implement and expire caches on paginated pages

Adding "caches_page" or "caches_action" to your project can significantly improve your service's performance, since the app accesses the database much less frequently. In RoR, you simply declare the actions to be cached with "caches_page" or "caches_action" in the controller, as below:

  caches_action :show_cuisines
  caches_action :show_reviews
  caches_action :user_info

The declaration above causes the app to create cache files in the /tmp/cache directory, under a directory named after the host and port; if you use "caches_page", it creates "html" files in the "/public" directory under the controller name instead. For instance, the app that runs on my development machine clippers.cs.ucsb.edu will have a directory

/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer

which has three subdirectories:

show_cuisines  show_reviews  user_info

Each of these directories has "cache" files, or "html" files if you use "caches_page", that contain the html body of the pages that users have requested. So, when the next request arrives for one of those pages, the app simply returns the cached file rather than accessing the database to generate a new html page for the same request.

If those methods always return static pages, you don't need to expire those pages. However, if the pages need to change under certain circumstances, you must delete the old cached pages accordingly. For example, in my project, whenever a user writes a new review or updates an existing review of some restaurant, my app must expire all the cached pages that contain information about that particular restaurant. This can be done with "expire_action". For instance, in my "update" method, I added:

expire_action :action=>"show_reviews", :id=>@review.restaurant_id
expire_action :action=>"user_info", :id=>session[:username]

The problem I faced was that I wanted to cache pages that are paginated.

When you apply the 'paginate' method, it generates a URI like the one below:

/reviewer/show_cuisines/0?page=1&cuisine_type=bars

The trouble is that all the pages, such as "page=1", "page=2", "page=52" and so on, share the same id, in this case "id=0".

So, the requests

/reviewer/show_cuisines/0?page=1&cuisine_type=bars
/reviewer/show_cuisines/0?page=2&cuisine_type=bars
... 
/reviewer/show_cuisines/0?page=52&cuisine_type=bars

will receive the identical page, whichever was initially cached under "id=0". When the app caches pages, it ignores all the other parameters, such as "page" and "cuisine_type", and saves the cache using only the "id" value. For instance, the directory "/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer/show_cuisines" will have a single cache file "0.cache", which contains the response to the "?page=1&cuisine_type=bars" request. After this cache is saved, all the following requests that have the URI prefix "/reviewer/show_cuisines/0?" will receive this "0.cache" file, regardless of their "page" and "cuisine_type" values.

In order to get around this issue, you need to modify the "config/routes.rb" file.
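The collision can be illustrated with a few lines of plain Ruby. This is only a sketch of the behavior described above, not the actual Rails code: "cache_file_for" is a hypothetical helper, and the host, port, and cache root are simply taken from the example directory.

```ruby
require "uri"

# Hypothetical helper: mimics the behavior described above, where the
# cache file name is built from the host, the port, and the URL path only.
# The query string (page, cuisine_type) never makes it into the name.
def cache_file_for(url, cache_root = "/tmp/cache")
  uri = URI.parse(url)
  File.join(cache_root, "#{uri.host}.#{uri.port}", "#{uri.path}.cache")
end

base  = "http://clippers.cs.ucsb.edu:3000/reviewer/show_cuisines/0"
page1 = cache_file_for("#{base}?page=1&cuisine_type=bars")
page2 = cache_file_for("#{base}?page=2&cuisine_type=bars")

puts page1
puts page1 == page2   # both requests map to the same cache file
```

Since both URLs reduce to the same file name, whichever page is requested first is what every later request gets back.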

map.show_cuisines 'reviewer/show_cuisines/:id/:cuisine_type/:page', :controller => 'reviewer', :action => 'show_cuisines'
map.show_cuisines 'reviewer/show_cuisines/:id/:cuisine_type', :controller => 'reviewer', :action => 'show_cuisines'

I added the lines above to my "routes.rb" file, which converts the previous URI

/reviewer/show_cuisines/0?page=1&cuisine_type=bars

to

/reviewer/show_cuisines/0/bars/1

Now, my cache file in the directory looks like

/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer/show_cuisines/0/bars/1.cache

If the request I wanted before the "routes.rb" modification was

/reviewer/show_cuisines/0?page=5&cuisine_type=fastfood

I now need to request it using the URL form

/reviewer/show_cuisines/0/fastfood/5

and it generates a new cache file in the directory

/tmp/cache/clippers.cs.ucsb.edu.3000/reviewer/show_cuisines/0/fastfood/5.cache

So, even with the same id, "id=0", it can generate multiple caches according to the other parameters.
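To see why this works, here is a rough plain-Ruby imitation of the two routes. It is a sketch under my own assumptions, not how the Rails router is implemented: "ROUTE", "recognize", and "cache_name" are hypothetical names, and the ".cache" suffix just follows the file names shown above.

```ruby
# Hypothetical stand-in for the two routes: :id and :cuisine_type are
# required path segments, :page is an optional third segment.
ROUTE = %r{\A/reviewer/show_cuisines/(?<id>[^/]+)/(?<cuisine_type>[^/]+)(?:/(?<page>[^/]+))?\z}

# Returns the parameters carried in the path, or nil if the path does not match.
def recognize(path)
  m = ROUTE.match(path)
  return nil unless m
  { id: m[:id], cuisine_type: m[:cuisine_type], page: m[:page] }
end

# The cache file name follows the path, so every segment contributes to it.
def cache_name(path)
  "#{path}.cache"
end

first  = "/reviewer/show_cuisines/0/bars/1"
second = "/reviewer/show_cuisines/0/fastfood/5"

p recognize(first)
p recognize(second)
puts cache_name(first) == cache_name(second)   # false: separate cache files
```

Because "cuisine_type" and "page" are now part of the path, they still reach the action as parameters and they also end up in the cache file name.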

The next issue is how to expire those cached pages.

A simple 'expire_action :action => "", :id => ""' will not work; you must specify in much more detail which cache under the "action/id" directory you'd like to expire, since it could be under "/bars" or "/fastfood".

In my case, if a review is submitted or modified for the restaurant id = 1001, I need to delete all the cached pages under

/reviewer/show_reviews/1001/

This directory might contain 1.cache, 2.cache, 3.cache and so on.

To solve this issue, I found the beautiful expire function "expire_fragment", which is what "expire_action" uses internally.

It accepts several forms of input, but the feature I liked is that it lets you specify a regular expression. So, the expire calls in my "update" and "submit" methods look like:

expire_fragment( %r{.*/reviewer/show_reviews/#{@review.restaurant_id}/.*} )
expire_fragment( %r{.*/reviewer/user_info/#{session[:username]}/.*} )

The calls above delete every cache file whose path matches the pattern, thus eliminating all the cached pages generated for the paginated views.
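The effect of a regular-expression expiry can be tried out without Rails. The sketch below is my own simulation, not "expire_fragment" itself: "expire_matching" is a hypothetical helper that deletes matching files from a temporary directory, and the file names are made up to follow the layout above. It also escapes the interpolated value with Regexp.escape, which is worth doing for anything user-supplied such as a username.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for regex-based expiry: delete every ".cache" file
# under cache_root whose full path matches the pattern.
def expire_matching(cache_root, pattern)
  doomed = Dir.glob(File.join(cache_root, "**", "*.cache")).select { |f| f =~ pattern }
  FileUtils.rm_f(doomed)
  doomed
end

restaurant_id = 1001

# Build a throwaway cache tree, expire restaurant 1001, report what is left.
remaining = Dir.mktmpdir do |root|
  files = %w[
    reviewer/show_reviews/1001/1.cache
    reviewer/show_reviews/1001/2.cache
    reviewer/show_reviews/10010/1.cache
    reviewer/show_cuisines/0/bars/1.cache
  ].map { |rel| File.join(root, rel) }

  files.each do |f|
    FileUtils.mkdir_p(File.dirname(f))
    FileUtils.touch(f)
  end

  pattern = %r{/reviewer/show_reviews/#{Regexp.escape(restaurant_id.to_s)}/}
  expire_matching(root, pattern)

  files.select { |f| File.exist?(f) }.map { |f| f.sub("#{root}/", "") }
end

puts remaining
```

The trailing "/" after the id matters: without it, expiring restaurant 1001 would also match the cache of restaurant 10010.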

Problem in Distributed Application Servers

A problem arises when application servers run on multiple, distributed machines, so that cached pages are spread across machines; the RoR cache expiration mechanism described above only removes the cache on the local machine. Thus, in order to expire outdated cache files on all app machines, I created a simple bash script that deletes the targeted cache files using the ssh command:

#!/bin/bash

# be sure that every machine's name is in root's .ssh/known_hosts
# change the hostnames accordingly

#web host
webserver="domu-12-31-33-00-03-8c.usma1.compute.amazonaws.com"

#app host
hostname1="domu-12-31-33-00-03-8c.usma1.compute.amazonaws.com"
hostname2="domu-12-31-33-00-03-41.usma1.compute.amazonaws.com"
hostname3="domu-12-31-33-00-03-76.usma1.compute.amazonaws.com"

review=$1
user=$2

chmod 600 /home/user/Lib/Rails/u/apps/hotspots/current/ec2.key


key=/home/user/Lib/Rails/u/apps/hotspots/current/ec2.key
cache=/home/user/Lib/Rails/u/apps/hotspots/current/tmp/cache/$webserver/reviewer

# remove the cached show_reviews and user_info pages on every app host
# (each ssh command, including its rm arguments, must stay on one line)
for host in "$hostname1" "$hostname2" "$hostname3"; do
  ssh -i "$key" "root@$host" rm -fr "$cache/show_reviews/$review" "$cache/user_info/$user"
done

Then, in the "update" or "submit" method, you can run the script above by adding a line as below:

system "./clean_cache.sh #{@review.restaurant_id} #{session[:username]}"
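One caution about that line: the username is interpolated into a shell command and later used in an "rm -fr" path. A slightly more defensive variant, as a sketch only, might check the values first and pass them to system as separate arguments. "clean_cache_command" and "SAFE_NAME" are hypothetical names, and the allowed character set is my assumption about what a username may contain.

```ruby
# Assumed set of allowed username characters; the first character may not be
# a dot, so "." and ".." can never be passed on to rm.
SAFE_NAME = /\A[A-Za-z0-9_-][A-Za-z0-9_.-]*\z/

# Hypothetical helper: validates the inputs and returns the argument list
# for the cleanup script. Raises ArgumentError on anything suspicious.
def clean_cache_command(restaurant_id, username, script = "./clean_cache.sh")
  id = Integer(restaurant_id).to_s
  unless username.to_s =~ SAFE_NAME
    raise ArgumentError, "unsafe username: #{username.inspect}"
  end
  [script, id, username.to_s]
end

cmd = clean_cache_command(1001, "kyo")
p cmd
# In the controller this would be run as: system(*cmd)
```

Passing the arguments separately keeps the local shell from interpreting them, but the script still hands them to a remote shell through ssh, which is why the values are validated as well.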

Thanks, and let me know if you have any questions.

Kyo
